high_scalability high_scalability-2010 high_scalability-2010-787 knowledge-graph by maker-knowledge-mining
Introduction: Getting Real about NoSQL and the SQL-Isn't-Scalable Lie by Dennis Forbes. Buoyed by Canada's Olympic success, Dennis is going for the gold in that least real of sports, the NoSQL vs SQL pursuit. Design Patterns for Distributed Non-Relational Databases by Todd Lipcon. Great coverage of consistent hashing, consistency models, data models, storage layouts, log-structured merge trees, and gossip protocols. Brewer's CAP Conjecture is False. Jim Starkey makes the case that CAP is crap. Kaazing Pushes Web Sockets to Make Browsers Real Time. Bi-directional communication comes to the web, but shouldn't sockets be able to accept connections too? 4 Months with Cassandra, a love story. Cloudkick likes its linear scalability, massive write performance, and low operational costs. We'll likely keep moving more data into Cassandra as we need to, but for some data the ability to write arbitrary SQL queries is still very useful. CS 525: Advanced Distributed Systems. A
sentIndex sentText sentNum sentScore
1 Buoyed by Canada's Olympic success, Dennis is going for the gold in that least real of sports, the NoSQL vs SQL pursuit. [sent-2, score-0.169]
2 Great coverage of consistent hashing, consistency models, data models, storage layouts, log-structured merge trees, and gossip protocols. [sent-4, score-0.239]
3 Bi-directional communication comes to the web, but shouldn't sockets be able to accept connections too? [sent-8, score-0.165]
4 A very thorough list of papers on distributed systems. [sent-13, score-0.084]
5 You will see some posts around the web suggesting how you can use AppEngine like a Content Delivery Network – just ignore them, they are just being silly. [sent-21, score-0.196]
6 There is nothing CDN-like about AppEngine at all (at least not at the moment). [sent-22, score-0.084]
7 It’s a great service and I love it, but my understanding is that there are two datacentres serving AppEngine applications – a primary and a backup, not a globally distributed network of edge servers. [sent-23, score-0.137]
8 Royans Scalability links for Feb 28th 2010 Open source helps Facebook achieve massive app scalability by Rodney Gedda . [sent-24, score-0.314]
9 David Recordon, senior open programs manager at Facebook, talks about how the social networking giant uses open source tools to achieve its massive app scalability. [sent-25, score-0.518]
10 What I’ve learned over the years is that strong consistency, if done well, can scale to very high levels. [sent-30, score-0.103]
11 At Glue, we'll explore the new technologies that are forming around web applications in a post-cloud world. [sent-33, score-0.226]
12 At Glue, we'll aim to explore things like: APIs and Protocols, Languages/Frameworks, Formats/Standards, Open Data, Platforms/Providers, Storage, Identity. [sent-34, score-0.12]
13 After 10 years of development and 7 years of 24/7 production, the open source nosql graph database Neo4j has finally been released as 1. [sent-37, score-0.529]
wordName wordTfidf (topN-words)
[('appengine', 0.266), ('consistency', 0.214), ('cap', 0.194), ('glue', 0.177), ('sockets', 0.165), ('nosql', 0.141), ('love', 0.137), ('massive', 0.134), ('consitency', 0.13), ('liebydennis', 0.13), ('depression', 0.13), ('kaazing', 0.13), ('nav', 0.13), ('eventual', 0.128), ('starkey', 0.122), ('paulo', 0.122), ('explore', 0.12), ('olympic', 0.117), ('suggesting', 0.112), ('databasesby', 0.112), ('feb', 0.112), ('dennis', 0.109), ('congratulations', 0.109), ('gossip', 0.109), ('conjecture', 0.106), ('forming', 0.106), ('years', 0.103), ('models', 0.103), ('open', 0.102), ('cs', 0.101), ('layouts', 0.101), ('achieve', 0.1), ('todd', 0.094), ('entertaining', 0.094), ('cloudkick', 0.091), ('cassandra', 0.088), ('dynamo', 0.087), ('browsers', 0.086), ('creativity', 0.086), ('canada', 0.085), ('gold', 0.085), ('explores', 0.085), ('least', 0.084), ('thorough', 0.084), ('ignore', 0.084), ('pushes', 0.083), ('clocks', 0.082), ('vector', 0.082), ('sports', 0.082), ('source', 0.08)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000004 787 high scalability-2010-03-03-Hot Scalability Links for March 3, 2010
2 0.18130729 1153 high scalability-2011-12-08-Update on Scalable Causal Consistency For Wide-Area Storage With COPS
Introduction: Here are a few updates on the article Paper: Don’t Settle For Eventual: Scalable Causal Consistency For Wide-Area Storage With COPS from Mike Freedman and Wyatt Lloyd. Q: How could software architectures change in response to causal+ consistency? A: I don't really think they would much. Somebody would still run a two-tier architecture in their datacenter: a front tier of webservers running both (say) PHP and our client library, and a back tier of storage nodes running COPS. (I'm not sure if it was obvious given the discussion of our "thick" client -- you should think of the COPS client dropping in where a memcache client library does...albeit ours has per-session state.) Q: Why not just use vector clocks? A: The problem with vector clocks and scalability has always been that the size of vector clocks is O(N), where N is the number of nodes. So if we want to scale to a datacenter with 10K nodes, each piece of metadata must have size O(10K). And in fact, vector
3 0.17679179 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
Introduction: It's a truism that we should choose the right tool for the job. Everyone says that. And who can disagree? The problem is this is not helpful advice without being able to answer more specific questions like: What jobs are the tools good at? Will they work on jobs like mine? Is it worth the risk to try something new when all my people know something else and we have a deadline to meet? How can I make all the tools work together? In the NoSQL space this kind of real-world data is still a bit vague. When asked, vendors tend to give very general answers like NoSQL is good for BigData or key-value access. What does that mean for the developer in the trenches faced with the task of solving a specific problem when there are a dozen confusing choices and no obvious winner? Not a lot. It's often hard to take that next step and imagine how their specific problems could be solved in a way that's worth taking the trouble and risk. Let's change that. What problems are you using NoSQL to sol
4 0.16355726 1022 high scalability-2011-04-13-Paper: NoSQL Databases - NoSQL Introduction and Overview
Introduction: Christof Strauch, from Stuttgart Media University, has written an incredible 120+ page paper titled NoSQL Databases as an introduction and overview to NoSQL databases . The paper was written between 2010-06 and 2011-02, so it may be a bit out of date, but if you are looking to take in the NoSQL world in one big gulp, this is your chance. I asked Christof to give us a short taste of what he was trying to accomplish in his paper: The paper aims at giving a systematic and thorough introduction and overview of the NoSQL field by assembling information dispersed among blogs, wikis and scientific papers. It firstly discusses reasons, rationales and motives for the development and usage of nonrelational database systems. These can be summarized by the need for high scalability, the processing of large amounts of data, the ability to distribute data among many (often commodity) servers, consequently a distribution-aware design of DBMSs. The paper then introduces fundamental concepts,
5 0.15347014 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters
Introduction: Update: Streamy Explains CAP and HBase's Approach to CAP . We plan to employ inter-cluster replication, with each cluster located in a single DC. Remote replication will introduce some eventual consistency into the system, but each cluster will continue to be strongly consistent. Ryan Barrett, Google App Engine datastore lead, gave this talk Transactions Across Datacenters (and Other Weekend Projects) at the Google I/O 2009 conference. While the talk doesn't necessarily break new technical ground, Ryan does an excellent job explaining and evaluating the different options you have when architecting a system to work across multiple datacenters. This is called multihoming , operating from multiple datacenters simultaneously. As multihoming is one of the most challenging tasks in all computing, Ryan's clear and thoughtful style comfortably leads you through the various options. On the trip you learn: The different multi-homing options are: Backups, Master-Slave, Multi-M
6 0.13373797 301 high scalability-2008-04-08-Google AppEngine - A First Look
8 0.12932983 658 high scalability-2009-07-17-Against all the odds
9 0.12310211 676 high scalability-2009-08-08-Yahoo!'s PNUTS Database: Too Hot, Too Cold or Just Right?
10 0.1227544 739 high scalability-2009-11-09-10 NoSQL Systems Reviewed
11 0.12264086 931 high scalability-2010-10-28-Notes from A NOSQL Evening in Palo Alto
12 0.12222917 950 high scalability-2010-11-30-NoCAP – Part III – GigaSpaces clustering explained..
13 0.12104702 797 high scalability-2010-03-19-Hot Scalability Links for March 19, 2010
14 0.12073346 1017 high scalability-2011-04-06-Netflix: Run Consistency Checkers All the time to Fixup Transactions
15 0.11103118 879 high scalability-2010-08-12-Think of Latency as a Pseudo-permanent Network Partition
16 0.10973976 883 high scalability-2010-08-20-Hot Scalability Links For Aug 20, 2010
17 0.10947437 802 high scalability-2010-04-01-Hot Scalability Links for April 1, 2010
18 0.10842393 1450 high scalability-2013-05-01-Myth: Eric Brewer on Why Banks are BASE Not ACID - Availability Is Revenue
20 0.10682806 1142 high scalability-2011-11-14-Using Gossip Protocols for Failure Detection, Monitoring, Messaging and Other Good Things
topicId topicWeight
[(0, 0.189), (1, 0.05), (2, 0.046), (3, 0.078), (4, 0.065), (5, 0.103), (6, -0.094), (7, -0.02), (8, 0.027), (9, 0.031), (10, 0.006), (11, 0.008), (12, -0.115), (13, -0.058), (14, -0.048), (15, 0.028), (16, 0.091), (17, 0.036), (18, -0.019), (19, -0.079), (20, 0.044), (21, 0.062), (22, 0.016), (23, -0.057), (24, 0.003), (25, -0.051), (26, 0.04), (27, 0.01), (28, 0.018), (29, -0.118), (30, -0.027), (31, -0.029), (32, 0.007), (33, 0.016), (34, -0.004), (35, -0.028), (36, -0.033), (37, 0.02), (38, 0.007), (39, 0.012), (40, 0.016), (41, 0.051), (42, -0.008), (43, 0.021), (44, -0.039), (45, 0.017), (46, -0.011), (47, 0.021), (48, -0.035), (49, -0.028)]
simIndex simValue blogId blogTitle
same-blog 1 0.9602465 787 high scalability-2010-03-03-Hot Scalability Links for March 3, 2010
2 0.8005619 1022 high scalability-2011-04-13-Paper: NoSQL Databases - NoSQL Introduction and Overview
Introduction: Christof Strauch, from Stuttgart Media University, has written an incredible 120+ page paper titled NoSQL Databases as an introduction and overview to NoSQL databases . The paper was written between 2010-06 and 2011-02, so it may be a bit out of date, but if you are looking to take in the NoSQL world in one big gulp, this is your chance. I asked Christof to give us a short taste of what he was trying to accomplish in his paper: The paper aims at giving a systematic and thorough introduction and overview of the NoSQL field by assembling information dispersed among blogs, wikis and scientific papers. It firstly discusses reasons, rationales and motives for the development and usage of nonrelational database systems. These can be summarized by the need for high scalability, the processing of large amounts of data, the ability to distribute data among many (often commodity) servers, consequently a distribution-aware design of DBMSs. The paper then introduces fundamental concepts,
3 0.72044665 930 high scalability-2010-10-28-NoSQL Took Away the Relational Model and Gave Nothing Back
Introduction: Update: Benjamin Black said he was the source of the quote and also said I was wrong about what he meant. His real point: The meaning of the statement was that NoSQL systems (really the various map-reduce systems) are lacking a standard model for describing and querying and that developing one should be a high priority task for them. At the A NoSQL Evening in Palo Alto, an audience member, sorry, I couldn't tell who, said something I found really interesting: NoSQL took away the relational model and gave nothing back. The idea being that NoSQL has focussed on ease of use, scalability, performance, etc, but it has lost the idea of how data relates to other data. True to its name, the relational model is very good at capturing and managing relationships. With NoSQL all relationships have been pushed back onto the poor programmer to implement in code rather than the database managing it. We've sacrificed usability. NoSQL is about concurrency, latency, and scalability, but it
4 0.71875417 1459 high scalability-2013-05-16-Paper: Warp: Multi-Key Transactions for Key-Value Stores
Introduction: Looks like an interesting take on "a completely asynchronous, low-latency transaction management protocol, in line with the fully distributed NoSQL architecture." Warp: Multi-Key Transactions for Key-Value Stores  overview: Implementing ACID transactions has been a longstanding challenge for NoSQL systems. Because these systems are based on a sharded architecture, transactions necessarily require coordination across multiple servers. Past work in this space has relied either on heavyweight protocols such as Paxos or clock synchronization for this coordination. This paper presents a novel protocol for coordinating distributed transactions with ACID semantics on top of a sharded data store. Called linear transactions, this protocol achieves scalability by distributing the coordination task to only those servers that hold relevant data for each transaction. It achieves high performance by serializing only those transactions whose concurrent execution could potentially yield a vio
Introduction: Abstract: When designing distributed web services, there are three properties that are commonly desired: consistency, availability, and partition tolerance. It is impossible to achieve all three. In this note, we prove this conjecture in the asynchronous network model, and then discuss solutions to this dilemma in the partially synchronous model.
6 0.7118476 1153 high scalability-2011-12-08-Update on Scalable Causal Consistency For Wide-Area Storage With COPS
7 0.70532274 1450 high scalability-2013-05-01-Myth: Eric Brewer on Why Banks are BASE Not ACID - Availability Is Revenue
9 0.69246185 676 high scalability-2009-08-08-Yahoo!'s PNUTS Database: Too Hot, Too Cold or Just Right?
10 0.68248564 649 high scalability-2009-07-02-Product: Facebook's Cassandra - A Massive Distributed Store
11 0.68172473 1374 high scalability-2012-12-18-Georeplication: When Bad Things Happen to Good Systems
12 0.67595625 890 high scalability-2010-09-01-Paper: The Case for Determinism in Database Systems
13 0.66387314 883 high scalability-2010-08-20-Hot Scalability Links For Aug 20, 2010
14 0.6626547 1327 high scalability-2012-09-21-Stuff The Internet Says On Scalability For September 21, 2012
15 0.65318871 860 high scalability-2010-07-17-Hot Scalability Links for July 17, 2010
16 0.64635384 1097 high scalability-2011-08-12-Stuff The Internet Says On Scalability For August 12, 2011
17 0.64505643 797 high scalability-2010-03-19-Hot Scalability Links for March 19, 2010
18 0.64373785 931 high scalability-2010-10-28-Notes from A NOSQL Evening in Palo Alto
19 0.6416446 1017 high scalability-2011-04-06-Netflix: Run Consistency Checkers All the time to Fixup Transactions
20 0.63961828 874 high scalability-2010-08-07-ArchCamp: Scalable Databases (NoSQL)
topicId topicWeight
[(1, 0.174), (2, 0.183), (10, 0.026), (19, 0.015), (27, 0.018), (30, 0.023), (47, 0.029), (61, 0.156), (79, 0.089), (85, 0.028), (91, 0.018), (94, 0.045), (97, 0.107)]
simIndex simValue blogId blogTitle
same-blog 1 0.95634079 787 high scalability-2010-03-03-Hot Scalability Links for March 3, 2010
2 0.93721902 856 high scalability-2010-07-12-Creating Scalable Digital Libraries
Introduction: Like many other media content providers, libraries and museums are increasingly moving their content onto the Web. While the move itself is no easy process (with digitization, web development, and training costs), being able to successfully deliver content to a wide audience is an ongoing concern, particularly for large libraries. Much of the concern is financial, as most libraries do not have the internal budget or outside investors that for-profit businesses enjoy. Even large university libraries will face serious budget constraints that even other university departments, such as science and technology would not face. Creating a scalable infrastructure and also distributing a large digital collection that can handle multiple requests, requires planning that many librarians have not even imagined. They must stop thinking in terms of "one-item-per-customer" and start thinking in terms of numerous users accessing the same information simultaneously. Content Delivery Network
3 0.93505424 1347 high scalability-2012-10-25-Not All Regions are Created Equal - South America Es Bueno
Introduction: Rodrigo Campos shared some interesting benchmark results for AWS instances in the South America Region (São Paulo) and US East Region (North Virginia). He summarized the results in this thread on the Guerrilla Capacity Planning list: The same instance type shows a different behavior depending on the region in which it is running; this is particularly critical if you depend on multiple regions for disaster recovery or geographical load balancing. Generally speaking, performance is better and more consistent in the South America region when compared to Virginia; this is probably due to the fact that the SA region was the latest to be deployed. Maybe the SA datacenter uses new server models or it is just underutilized, but this is a wild guess. Write performance for the medium instance type in Virginia abruptly decays, dropping from almost 400 calls/s to something around 300 calls/s; this is not very clear in the scatter plot but if you draw a time-based chart you can clearly see this pa
4 0.93384677 931 high scalability-2010-10-28-Notes from A NOSQL Evening in Palo Alto
Introduction: I, along with 180 other people and a veritable who's who of NoSQL vendors, attended the A NoSQL Evening in Palo Alto NoSQL Meetup on Tuesday. The format was a panel of 10 vendors--10gen, Basho, CouchOne, Cloudant, Cloudera, GoGrid, InfiniteGraph, Membase, Riptano, Scality--sitting in two rows of chairs in front of what seemed like a pretty diverse audience. Tim Anglade (founder, A NOSQL Summer) moderated. Tim kept things moving by asking a few leading questions and the panel chimed in with answers. Quite a few questions came from the audience, which was refreshing. Overall a genial evening with some good discussion. I was pleased that the panel members didn't just automatically slip into marketing speak. Most of the discussions were on point rather than just another excuse to hit the talking points. There were some complaints about the talk not being technical enough, but I don't think that was really the purpose of this kind of talk. The panel format is excellent at giving a wide ra
5 0.93078732 1289 high scalability-2012-07-23-State of the CDN: More Traffic, Stable Prices, More Products, Profits - Not So Much
Introduction: CDNs (content delivery networks) are the secret shadow superpowers behind the web, and Dan Rayburn at streamingmedia.com is the go-to investigative reporter for quality information on CDNs. Every year Dan has a Content Delivery Summit on all things CDN, and those videos are now available. Dan also gives a kind of state of the industry talk where he does something wonderful: he gives real numbers and prices. Dan really knows his stuff and is an excellent speaker, so watch the video, but here’s my gloss on the state of the CDN so far this year: Massive growth. Large customers are expecting 126% growth in video traffic over last year; medium-sized customers are seeing 48% traffic growth; small-sized customers are seeing 73.3% traffic growth. More traffic != More profit. Traffic growth doesn’t lead to more profit because the traffic growth is concentrated in larger customers that can make the best deals. Video takes up the largest amount of traffic on a
6 0.92775738 1089 high scalability-2011-07-29-Stuff The Internet Says On Scalability For July 29, 2011
7 0.92703152 383 high scalability-2008-09-10-Shard servers -- go big or small?
8 0.92677814 775 high scalability-2010-02-10-ElasticSearch - Open Source, Distributed, RESTful Search Engine
9 0.92379761 950 high scalability-2010-11-30-NoCAP – Part III – GigaSpaces clustering explained..
10 0.92345607 944 high scalability-2010-11-17-Some Services are More Equal than Others
11 0.92339808 1180 high scalability-2012-01-24-The State of NoSQL in 2012
12 0.92329168 152 high scalability-2007-11-13-Flickr Architecture
13 0.92296869 1031 high scalability-2011-04-28-PaaS on OpenStack - Run Applications on Any Cloud, Any Time Using Any Thing
14 0.9228971 1399 high scalability-2013-02-05-Ask HighScalability: Memcached and Relations
15 0.92271888 810 high scalability-2010-04-14-Parallel Information Retrieval and Other Search Engine Goodness
16 0.92044169 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
17 0.91942257 1040 high scalability-2011-05-13-Stuff The Internet Says On Scalability For May 13, 2011
18 0.91931248 1002 high scalability-2011-03-09-Productivity vs. Control tradeoffs in PaaS
19 0.91865516 1440 high scalability-2013-04-15-Scaling Pinterest - From 0 to 10s of Billions of Page Views a Month in Two Years
20 0.91733474 1461 high scalability-2013-05-20-The Tumblr Architecture Yahoo Bought for a Cool Billion Dollars