high_scalability high_scalability-2010 high_scalability-2010-842 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: You're Doing it Wrong by Poul-Henning Kamp. Don't look so guilty, he's not talking about you know what, he's talking about writing high-performance server programs: Not just wrong as in not perfect, but wrong as in wasting half, or more, of your performance. What good is an O(log2(n)) algorithm if those operations cause page faults and slow disk operations? For most relevant datasets an O(n) or even an O(n^2) algorithm, which avoids page faults, will run circles around it. A Microsoft Windows Azure primer: the basics by Peter Bright. Nice article explaining the basics of Azure and how it compares to Google and Amazon. A call to change the name from NoSQL to Postmodern Databases. Interesting idea, but the problem is the same one I have with Postmodern Art: when is it? I always feel like I'm in the post-post modern period, yet for art it's really in the early 1900s. Let's save future developers from this existential time crisis. Constructions from Dots and Lines by M
sentIndex sentText sentNum sentScore
1 Don't look so guilty, he's not talking about you know what, he's talking about writing high-performance server programs: Not just wrong as in not perfect, but wrong as in wasting half, or more, of your performance. [sent-2, score-0.456]
2 What good is an O(log2(n)) algorithm if those operations cause page faults and slow disk operations? [sent-3, score-0.139]
3 For most relevant datasets an O(n) or even an O(n^2) algorithm, which avoids page faults, will run circles around it. [sent-4, score-0.099]
4 I always feel like I'm in the post-post modern period, yet for art it's really in the early 1900s. [sent-9, score-0.354]
5 Let's save future developers from this existential time crisis. [sent-10, score-0.099]
6 Delightful yet in-depth explanation of the complex world of graph data structures. [sent-13, score-0.162]
7 To make use of the graphs beyond simply representing their explicit structure, graph traversal frameworks and algorithms have been developed in order to shape graphs by driving the evolution of the entities that they model, e.g. [sent-14, score-0.509]
8 humans and their relationships to one another and the objects of their world. Scaling the Social Graph in the Cloud using InfiniteGraph by Lead Architect Darren Wood. [sent-16, score-0.081]
9 This was the talk he gave at Gluecon and was a good intro to their product and the challenges of distributing graph data across more than one node. [sent-17, score-0.255]
10 All at prices up to 25% less than at neighborhood stores. [sent-25, score-0.093]
11 In my more luddite moments I have to hope robots can afford to buy those products too. [sent-26, score-0.289]
12 Parallelism is not new; the realization that it is essential for continued progress in high-performance computing is. [sent-30, score-0.103]
13 Parallelism is not yet a paradigm, but may become so if enough people adopt it as the standard practice and standard way of thinking about computation. [sent-31, score-0.099]
14 Often we think that everything can be well designed with the relational model, but this may not be true; just think of the effort we need to make every time we map our Java objects, even with modern ORMs. [sent-39, score-0.193]
wordName wordTfidf (topN-words)
[('postmodern', 0.253), ('robots', 0.198), ('peter', 0.191), ('art', 0.173), ('graph', 0.162), ('faults', 0.139), ('wrong', 0.134), ('june', 0.133), ('azure', 0.122), ('warehouseby', 0.115), ('devloper', 0.115), ('jobshuffington', 0.115), ('darren', 0.115), ('guilty', 0.115), ('resurgence', 0.115), ('relational', 0.111), ('dutch', 0.108), ('parallelismby', 0.108), ('cloudusing', 0.108), ('delightful', 0.108), ('luca', 0.103), ('lorenzo', 0.103), ('northscale', 0.103), ('administratorget', 0.103), ('dots', 0.103), ('infinitegraph', 0.103), ('realization', 0.103), ('yet', 0.099), ('primer', 0.099), ('circles', 0.099), ('existential', 0.099), ('gluecon', 0.099), ('algorithm', 0.096), ('talking', 0.094), ('intro', 0.093), ('neighborhood', 0.093), ('rodriguez', 0.093), ('moments', 0.091), ('marko', 0.091), ('wow', 0.089), ('aaron', 0.089), ('traversal', 0.089), ('jack', 0.089), ('robot', 0.087), ('toadvertisea', 0.087), ('usfor', 0.087), ('graphs', 0.086), ('representing', 0.086), ('modern', 0.082), ('objects', 0.081)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999988 842 high scalability-2010-06-16-Hot Scalability Links for June 16, 2010
2 0.19120276 628 high scalability-2009-06-13-Neo4j - a Graph Database that Kicks Buttox
Introduction: Update: Social networks in the database: using a graph database. A nice post on representing, traversing, and performing other common social network operations using a graph database. If you are Digg or LinkedIn you can build your own speedy graph database to represent your complex social network relationships. For those of more modest means Neo4j, a graph database, is a good alternative. A graph is a collection of nodes (things) and edges (relationships) that connect pairs of nodes. Slap properties (key-value pairs) on nodes and relationships and you have a surprisingly powerful way to represent most anything you can think of. In a graph database "relationships are first-class citizens. They connect two nodes and both nodes and relationships can hold an arbitrary amount of key-value pairs. So you can look at a graph database as a key-value store, with full support for relationships." A graph looks something like: For more lovely examples take a look at the Graph Image Gal
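The node, edge, and property model described in the introduction above can be sketched in a few lines of Python. This is only an illustrative in-memory structure, not Neo4j's API: the class name `PropertyGraph`, its methods, and the `KNOWS` relationship type are all hypothetical names chosen for the example.

```python
from collections import defaultdict


class PropertyGraph:
    """Toy in-memory property graph (hypothetical; not the Neo4j API)."""

    def __init__(self):
        # node id -> dict of key-value properties
        self.nodes = {}
        # source id -> list of (relationship type, target id, properties)
        self.edges = defaultdict(list)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = dict(props)

    def add_edge(self, source, rel_type, target, **props):
        # Relationships are first-class: they carry their own properties.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.edges[source].append((rel_type, target, dict(props)))

    def neighbors(self, node_id, rel_type=None):
        # Follow outgoing edges, optionally filtered by relationship type.
        return [
            target
            for (rel, target, _props) in self.edges.get(node_id, [])
            if rel_type is None or rel == rel_type
        ]

    def friends_of_friends(self, node_id, rel_type="KNOWS"):
        # Two-hop traversal, excluding the start node and direct friends.
        direct = set(self.neighbors(node_id, rel_type))
        found = set()
        for friend in direct:
            for candidate in self.neighbors(friend, rel_type):
                if candidate != node_id and candidate not in direct:
                    found.add(candidate)
        return found


g = PropertyGraph()
g.add_node("alice", age=31)
g.add_node("bob", age=29)
g.add_node("carol", age=40)
g.add_edge("alice", "KNOWS", "bob", since=2005)
g.add_edge("bob", "KNOWS", "alice", since=2005)
g.add_edge("bob", "KNOWS", "carol", since=2008)
```

With this sample data, `g.friends_of_friends("alice")` walks two `KNOWS` hops and yields only `carol`, which is the kind of traversal a relational schema would express as a self-join.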
3 0.16927856 626 high scalability-2009-06-10-Paper: Graph Databases and the Future of Large-Scale Knowledge Management
Introduction: Relational databases, document databases, and distributed hash tables get most of the hype these days, but there's another option: graph databases. Back to the future it seems. Here's a really interesting paper by Marko A. Rodriguez introducing the graph model and its extension to representing the world wide web of data. Modern day open source and commercial graph databases can store on the order of 1 billion relationships with some databases reaching the 10 billion mark. These developments are making the graph database practical for applications that require large-scale knowledge structures. Moreover, with the Web of Data standards set forth by the Linked Data community, it is possible to interlink graph databases across the web into a giant global knowledge structure. This talk will discuss graph databases, their underlying data model, their querying mechanisms, and the benefits of the graph data structure for modeling and analysis.
Introduction: On the surface nothing appears more different than soft data and hard raw materials like iron. Then isn’t it ironic, in the Alanis Morissette sense, that in this Age of Information, great wealth still lies hidden deep beneath piles of stuff? It's so strange how directly digging for dollars in data parallels the great wealth producing models of the Industrial Revolution. The pile of stuff is the Internet. It takes lots of prospecting to find the right stuff. Mighty web crawling machines tirelessly collect stuff, bringing it into their huge maws, then depositing load after load into rack after rack of distributed file system machines. Then armies of still other machines take this stuff and strip out the valuable raw materials, which in the Information Age, are endless bytes of raw data. Link clicks, likes, page views, content, headlines, searches, inbound links, outbound links, search clicks, hashtags, friends, purchases: anything and everything you do on the Internet is a valu
5 0.14113565 835 high scalability-2010-06-03-Hot Scalability Links for June 3, 2010
Introduction: How Big is a Yottabyte? Not so big that the NSA can't hope to store it says CrunchGear: There are a thousand gigabytes in a terabyte, a thousand terabytes in a petabyte, a thousand petabytes in an exabyte, a thousand exabytes in a zettabyte, and a thousand zettabytes in a yottabyte. In other words, a yottabyte is 1,000,000,000,000,000GB. The CMS data aggregation system. The Large Hadron Collider project is using MongoDB as a cache. Here we discuss a new data aggregation system which consumes, indexes and delivers information from different relational and non-relational data sources to answer cross data-service queries and explore meta-data associated with petabytes of experimental data. Google I/O 2010 Videos are available (many of them anyway). You might be particularly interested in Google Storage for Developers, Building high-throughput data pipelines with Google App Engine, Batch data processing with App Engine, BigQuery and Prediction APIs, and Measure
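The yottabyte figure quoted in the introduction above can be checked with a short calculation. The sketch below assumes decimal (power-of-1000) units, as the quoted text does; the names `UNITS` and `gigabytes_in` are made up for the example.

```python
# Decimal storage units, each 1000 times the previous one, starting at gigabyte.
UNITS = ["gigabyte", "terabyte", "petabyte", "exabyte", "zettabyte", "yottabyte"]


def gigabytes_in(unit):
    """Return how many gigabytes fit in one `unit` (decimal prefixes assumed)."""
    steps = UNITS.index(unit)  # number of x1000 steps above a gigabyte
    return 1000 ** steps


print(f"{gigabytes_in('yottabyte'):,} GB")  # 1,000,000,000,000,000 GB
```

Five steps of 1000 separate a gigabyte from a yottabyte, so the result is 1000^5 = 10^15 GB, matching the number in the quoted article.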
6 0.13113543 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
7 0.11975721 805 high scalability-2010-04-06-Strategy: Make it Really Fast vs Do the Work Up Front
8 0.11353302 827 high scalability-2010-05-14-Hot Scalability Links for May 14, 2010
9 0.11241044 806 high scalability-2010-04-08-Hot Scalability Links for April 8, 2010
10 0.11212064 931 high scalability-2010-10-28-Notes from A NOSQL Evening in Palo Alto
11 0.10783795 1088 high scalability-2011-07-27-Making Hadoop 1000x Faster for Graph Problems
12 0.10698718 1064 high scalability-2011-06-20-35+ Use Cases for Choosing Your Next NoSQL Database
13 0.10612437 848 high scalability-2010-06-25-Hot Scalability Links for June 25, 2010
14 0.10603467 802 high scalability-2010-04-01-Hot Scalability Links for April 1, 2010
15 0.10585365 1136 high scalability-2011-11-03-Paper: G2 : A Graph Processing System for Diagnosing Distributed Systems
16 0.10556062 621 high scalability-2009-06-06-Graph server
18 0.10394548 750 high scalability-2009-12-16-Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud
19 0.10384048 327 high scalability-2008-05-27-How I Learned to Stop Worrying and Love Using a Lot of Disk Space to Scale
20 0.10135383 1603 high scalability-2014-02-28-Stuff The Internet Says On Scalability For February 28th, 2014
topicId topicWeight
[(0, 0.183), (1, 0.072), (2, 0.042), (3, 0.07), (4, 0.052), (5, 0.039), (6, -0.119), (7, 0.027), (8, 0.028), (9, 0.059), (10, 0.031), (11, -0.039), (12, -0.049), (13, -0.02), (14, -0.021), (15, -0.051), (16, 0.015), (17, 0.105), (18, 0.037), (19, 0.065), (20, -0.065), (21, -0.047), (22, 0.031), (23, -0.037), (24, -0.018), (25, 0.029), (26, 0.003), (27, 0.038), (28, 0.028), (29, -0.007), (30, -0.023), (31, -0.007), (32, 0.05), (33, 0.016), (34, 0.013), (35, -0.0), (36, 0.012), (37, -0.015), (38, 0.033), (39, 0.047), (40, -0.059), (41, 0.033), (42, -0.035), (43, 0.001), (44, -0.016), (45, -0.038), (46, 0.028), (47, 0.037), (48, 0.026), (49, 0.005)]
simIndex simValue blogId blogTitle
same-blog 1 0.95673722 842 high scalability-2010-06-16-Hot Scalability Links for June 16, 2010
2 0.74862784 1136 high scalability-2011-11-03-Paper: G2 : A Graph Processing System for Diagnosing Distributed Systems
Introduction: One of the problems in building distributed systems is figuring out what the heck is going on. Usually endless streams of log files are consulted like ancients using entrails to divine the will of the Gods. To rise above these ancient practices we must rise to another level of abstraction and that's the approach described in a Microsoft research paper: G2: A Graph Processing System for Diagnosing Distributed Systems, which uses execution graphs that model runtime events and their correlations in distributed systems. The problem with these schemes is viewing applications, written by programmers in low level code, as execution graphs. But we're heading in this direction in any case. To program a warehouse or an internet sized computer we'll have to write at higher levels of abstraction so code can be executed transparently at runtime on these giant distributed computers. There are many advantages to this approach, fault diagnosis and performance monitoring are just two of the wins
3 0.74227297 827 high scalability-2010-05-14-Hot Scalability Links for May 14, 2010
Introduction: Lots of good ones this week... Scalability, Availability & Stability Patterns. Jonas Boner has 197 slides covering a very wide range of scalability topics. One stop scalability shopping. Horizontal Scalability via Transient, Shardable, and Share-Nothing Resources. Heroku's Adam Wiggins shares what they've learned about scaling based on their experiences building a cloud platform and the hundreds of apps running on it. He describes the next generation architecture he thinks all software should follow in the future. Scalability of the Hadoop Distributed File System. Konstantin V. Shvachko writes a great post analyzing if the limitations imposed on a distributed file system by the single-node namespace server architecture can support 100,000 clients and petabytes of files. Cassandra by Example. Eric Evans created a nice Cassandra tutorial using building a Twitter clone as an example. Many people want to see more data modeling examples. Here you are. UpSizeR: Synthet
4 0.74004853 155 high scalability-2007-11-15-Video: Dryad: A general-purpose distributed execution platform
Introduction: Dryad is Microsoft's answer to Google's map-reduce. What's the question: How do you process really large amounts of data? My initial impression of Dryad is it's like a giant Unix command line filter on steroids. There are lots of inputs, outputs, tees, queues, and merge sorts all connected together by a master exec program. What else does Dryad have to offer the scalable infrastructure wars? Dryad models programs as the execution of a directed acyclic graph. Each vertex is a program and edges are typed communication channels (files, TCP pipes, and shared memory channels within a process). Map-reduce uses a different model. It's more like a large distributed sort where the programmer defines functions for mapping, partitioning, and reducing. Each approach seems to borrow from the spirit of its creating organization. The graph approach seems a bit too complicated and map-reduce seems a bit too simple. How ironic, in the Alanis Morissette sense. Dryad is a middleware layer that ex
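The idea in the introduction above, a program expressed as a directed acyclic graph whose vertices are programs and whose edges carry data, can be illustrated with a single-process sketch. This is not Dryad's API: `run_dag`, the vertex names, and the use of plain Python values as "channels" are assumptions made for the example, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def run_dag(vertices, inputs_of):
    """Run each vertex after its upstream vertices, passing their outputs along.

    vertices:  name -> callable (stand-in for a vertex program)
    inputs_of: name -> ordered list of upstream vertex names (the edges)
    """
    results = {}
    # static_order() yields every vertex after all of its predecessors.
    for name in TopologicalSorter(inputs_of).static_order():
        upstream = [results[dep] for dep in inputs_of.get(name, ())]
        results[name] = vertices[name](*upstream)
    return results


# A tiny pipeline: one reader fans out to two vertices, which merge in a report.
vertices = {
    "read": lambda: ["b", "a", "c", "a"],
    "sort": lambda rows: sorted(rows),
    "count": lambda rows: len(rows),
    "report": lambda sorted_rows, n: f"{n} rows, first={sorted_rows[0]}",
}
inputs_of = {
    "read": [],
    "sort": ["read"],
    "count": ["read"],
    "report": ["sort", "count"],
}

results = run_dag(vertices, inputs_of)
print(results["report"])  # 4 rows, first=a
```

A real distributed engine would schedule independent vertices (here `sort` and `count`) on different machines and replace the in-memory values with files, pipes, or shared memory; the sketch only shows the dependency ordering.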
5 0.73109168 628 high scalability-2009-06-13-Neo4j - a Graph Database that Kicks Buttox
Introduction: Update: Social networks in the database: using a graph database. A nice post on representing, traversing, and performing other common social network operations using a graph database. If you are Digg or LinkedIn you can build your own speedy graph database to represent your complex social network relationships. For those of more modest means Neo4j, a graph database, is a good alternative. A graph is a collection of nodes (things) and edges (relationships) that connect pairs of nodes. Slap properties (key-value pairs) on nodes and relationships and you have a surprisingly powerful way to represent most anything you can think of. In a graph database "relationships are first-class citizens. They connect two nodes and both nodes and relationships can hold an arbitrary amount of key-value pairs. So you can look at a graph database as a key-value store, with full support for relationships." A graph looks something like: For more lovely examples take a look at the Graph Image Gal
6 0.72799206 766 high scalability-2010-01-26-Product: HyperGraphDB - A Graph Database
7 0.72264338 1406 high scalability-2013-02-14-When all the Program's a Graph - Prismatic's Plumbing Library
8 0.72260565 631 high scalability-2009-06-15-Large-scale Graph Computing at Google
9 0.72246665 860 high scalability-2010-07-17-Hot Scalability Links for July 17, 2010
10 0.71990365 626 high scalability-2009-06-10-Paper: Graph Databases and the Future of Large-Scale Knowledge Management
11 0.70725566 801 high scalability-2010-03-30-Running Large Graph Algorithms - Evaluation of Current State-of-the-Art and Lessons Learned
12 0.69489187 1603 high scalability-2014-02-28-Stuff The Internet Says On Scalability For February 28th, 2014
13 0.68413454 1285 high scalability-2012-07-18-Disks Ain't Dead Yet: GraphChi - a disk-based large-scale graph computation
14 0.68401152 973 high scalability-2011-01-14-Stuff The Internet Says On Scalability For January 14, 2011
15 0.68222564 805 high scalability-2010-04-06-Strategy: Make it Really Fast vs Do the Work Up Front
16 0.68105567 1385 high scalability-2013-01-11-Stuff The Internet Says On Scalability For January 11, 2013
17 0.6707986 183 high scalability-2007-12-12-Report from OpenSocial Meetup at Google
18 0.66883719 797 high scalability-2010-03-19-Hot Scalability Links for March 19, 2010
19 0.66624844 1163 high scalability-2011-12-23-Stuff The Internet Says On Scalability For December 23, 2011
20 0.66550493 1607 high scalability-2014-03-07-Stuff The Internet Says On Scalability For March 7th, 2014
topicId topicWeight
[(1, 0.13), (2, 0.171), (27, 0.013), (30, 0.024), (39, 0.238), (40, 0.043), (51, 0.028), (56, 0.02), (61, 0.096), (79, 0.118), (85, 0.028), (94, 0.023)]
simIndex simValue blogId blogTitle
same-blog 1 0.86359334 842 high scalability-2010-06-16-Hot Scalability Links for June 16, 2010
2 0.85712236 901 high scalability-2010-09-16-How Can the Large Hadron Collider Withstand One Petabyte of Data a Second?
Introduction: Why is there something rather than nothing? That's the kind of question the Large Hadron Collider at CERN is hopefully poised to answer. And what is the output of this beautiful 17-mile long, 6 billion dollar wabi-sabish proton smashing machine? Data. Great heaping torrents of Grand Canyon sized data. 15 million gigabytes every year. That's 1000 times the information printed in books every year. It's so much data 10,000 scientists will use a grid of 80,000+ computers, in 300 computer centers, in 50 different countries just to help make sense of it all. How will all this data be collected, transported, stored, and analyzed? It turns out, using what amounts to a sort of Internet of Particles instead of an Internet of Things. Two good articles have recently shed some electro-magnetic energy in the human visible spectrum on the IT aspects of the collider: LHC computing grid pushes petabytes of data, beats expectations by John Timmer on Ars Technica and an overview of the Br
3 0.84497863 585 high scalability-2009-04-29-How to choice and build perfect server
Introduction: There are a lot of questions about server components, and about how to choose and/or build the perfect server while taking power consumption into account. So I decided to write about this topic. Key points: What kind of components servers need. Green computing and server components. How much power a server consumes. Choosing the right components: processors, HDD, RAID, memory. Build a server, or buy?
4 0.82946908 1498 high scalability-2013-08-07-RAFT - In Search of an Understandable Consensus Algorithm
Introduction: If like many humans you've found even Paxos Made Simple a bit difficult to understand, you might enjoy RAFT as described in In Search of an Understandable Consensus Algorithm by Stanford's Diego Ongaro and John Ousterhout. The video presentation of the paper is given by John Ousterhout. Both the paper and the video are delightfully accessible. mcherm has a good summary of the paper: A consensus algorithm is: a cluster of servers should record a series of records ("log entries") in response to requests from clients of the cluster. (It may also take action based on those entries.) It does so in a way that guarantees that the responses seen by clients of the cluster will be consistent EVEN in the face of servers crashing in unpredictable ways (but not losing data that was synched to disk), and networks introducing unpredictable delays or communication blockages. Here's what Raft does. First, it elects a leader, then the leader records the master version of the log, t
5 0.81458253 653 high scalability-2009-07-08-Servers Component - How to choice and build perfect server
Introduction: There are a lot of questions about server components and about how to build the perfect server while taking power consumption into account. Today I will discuss server components and how we can choose better ones with power consumption, efficiency, performance, and price in mind. Key points: What kind of components do servers need? Green computing and server components. How much power a server consumes. Choosing the right components: processor, hard disk drive, memory, operating system. Build a server, or buy?
6 0.81146544 571 high scalability-2009-04-15-Using HTTP cache headers effectively
7 0.80374837 690 high scalability-2009-08-31-Scaling MySQL on Amazon Web Services
8 0.79990906 1145 high scalability-2011-11-18-Stuff The Internet Says On Scalability For November 18, 2011
9 0.78344595 1148 high scalability-2011-11-29-DataSift Architecture: Realtime Datamining at 120,000 Tweets Per Second
10 0.77960879 1189 high scalability-2012-02-07-Hypertable Routs HBase in Performance Test -- HBase Overwhelmed by Garbage Collection
12 0.7596879 1102 high scalability-2011-08-22-Strategy: Run a Scalable, Available, and Cheap Static Site on S3 or GitHub
15 0.75101829 835 high scalability-2010-06-03-Hot Scalability Links for June 3, 2010
16 0.74393117 679 high scalability-2009-08-11-13 Scalability Best Practices
17 0.7357651 1559 high scalability-2013-12-06-Stuff The Internet Says On Scalability For December 6th, 2013
18 0.73552573 1037 high scalability-2011-05-10-Viddler Architecture - 7 Million Embeds a Day and 1500 Req-Sec Peak
19 0.73500609 972 high scalability-2011-01-11-Google Megastore - 3 Billion Writes and 20 Billion Read Transactions Daily
20 0.73376191 1502 high scalability-2013-08-16-Stuff The Internet Says On Scalability For August 16, 2013