high_scalability high_scalability-2007 high_scalability-2007-139 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Update 2: Read/WriteWeb has a good article talking about the scalability issues of relational databases and how Dynamo solves them: Amazon Dynamo: The Next Generation Of Virtual Distributed Storage. But since Dynamo is just another frustrating walled garden protected by barbed wire and guard dogs, its relevance is somewhat overstated. Update: Greg Linden has a take on the paper where he questions some of Amazon's design choices: emphasizing write availability over fast reads, a lack of indexing support, use of random distribution for load balancing, and punting on some scalability issues. Werner Vogels, Amazon's avuncular CTO, just announced a new paper on the internal database technology Amazon uses to handle tens of millions of customers. I'll dive into more details later, but I thought you'd want to read it hot off the blog. The bad news is it won't be a service. They are keeping this tech not so secret, but very safe. Happily, it's another real-life example to learn from.
sentIndex sentText sentNum sentScore
1 But since Dynamo is just another frustrating walled garden protected by barbed wire and guard dogs, its relevance is somewhat overstated. [sent-2, score-0.623]
2 Update: Greg Linden has a take on the paper where he questions some of Amazon's design choices: emphasizing write availability over fast reads, a lack of indexing support, use of random distribution for load balancing, and punting on some scalability issues. [sent-3, score-0.468]
3 Werner Vogels, Amazon's avuncular CTO, just announced a new paper on the internal database technology Amazon uses to handle tens of millions of customers. [sent-4, score-0.343]
4 From the abstract you can get a feel for what the paper is about: Reliability at massive scale is one of the biggest challenges we face at Amazon.com, [sent-10, score-0.148]
5 one of the largest e-commerce operations in the world; even the slightest outage has significant financial consequences and impacts customer trust. [sent-11, score-0.188]
6 The Amazon.com platform, which provides services for many web sites worldwide, is implemented on top of an infrastructure of tens of thousands of servers and network components located in many datacenters around the world. [sent-13, score-0.198]
7 At this scale, small and large components fail continuously and the way persistent state is managed in the face of these failures drives the reliability and scalability of the software systems. [sent-14, score-0.162]
8 This paper presents the design and implementation of Dynamo, a highly available key-value storage system that some of Amazon’s core services use to provide an “always-on” experience. [sent-15, score-0.22]
9 To achieve this level of availability, Dynamo sacrifices consistency under certain failure scenarios. [sent-16, score-0.181]
10 It makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use. [sent-17, score-0.15]
11 Take that, Google :-) Their purposeful embracing of probability and managed centers of uncertainty must be dizzying for those from an RDBMS background. [sent-21, score-0.578]
12 You write something and it's assumed consistent, correct, and durable. [sent-23, score-0.084]
13 Now, how do you do this at scale across multiple data centers under failure conditions? [sent-24, score-0.171]
14 So Amazon says writes must go through and we will deal with the complexities that model generates. [sent-26, score-0.165]
15 I love it, because when you delve into these problems you realize you need this type of functionality, but it's too complex, so you back away and continue trying to force a square peg into a round hole. [sent-29, score-0.411]
16 To have no fear of going where your requirements lead you is real engineering. [sent-30, score-0.074]
17 I'd love to be a fly on the wall in those debugging sessions. [sent-32, score-0.126]
18 But infrastructure takes on a self-consciousness of its own when dealing with complex problems, so you just have to deal with knowing you don't know anymore. [sent-33, score-0.08]
19 When you get over your initial "that can't be true" reaction and embrace it, you get something like Dynamo. [sent-35, score-0.156]
20 I'd really love to hear what you guys think about Dynamo. [sent-36, score-0.126]
wordName wordTfidf (topN-words)
[('dynamo', 0.483), ('rdbms', 0.179), ('paper', 0.148), ('amazon', 0.129), ('love', 0.126), ('tens', 0.124), ('delve', 0.12), ('dizzying', 0.12), ('punting', 0.12), ('thatgoogle', 0.12), ('thesoftware', 0.12), ('dogs', 0.112), ('emphasizing', 0.112), ('sacrifices', 0.107), ('slightest', 0.107), ('purposeful', 0.103), ('guard', 0.1), ('walled', 0.1), ('centers', 0.097), ('happily', 0.095), ('peg', 0.095), ('garden', 0.093), ('uncertainty', 0.091), ('embracing', 0.089), ('relevance', 0.089), ('availability', 0.088), ('reliability', 0.088), ('complexities', 0.085), ('frustrating', 0.085), ('assumed', 0.084), ('impacts', 0.081), ('protected', 0.081), ('reaction', 0.081), ('deal', 0.08), ('inthe', 0.08), ('probability', 0.078), ('embrace', 0.075), ('vogels', 0.075), ('resolution', 0.075), ('wire', 0.075), ('crap', 0.075), ('conflict', 0.075), ('failure', 0.074), ('fear', 0.074), ('components', 0.074), ('solves', 0.073), ('presents', 0.072), ('impressions', 0.072), ('announced', 0.071), ('square', 0.07)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999988 139 high scalability-2007-10-30-Paper: Dynamo: Amazon’s Highly Available Key-value Store
2 0.1653706 96 high scalability-2007-09-18-Amazon Architecture
Introduction: This is a wonderfully informative Amazon update based on Joachim Rohde's discovery of an interview with Amazon's CTO. You'll learn about how Amazon organizes their teams around services, the CAP theorem of building scalable systems, how they deploy software, and a lot more. Many new additions from the ACM Queue article have also been included. Amazon grew from a tiny online bookstore to one of the largest stores on earth. They did it while pioneering new and interesting ways to rate, review, and recommend products. Greg Linden shared his version of Amazon's birth pangs in a series of blog articles. Site: http://amazon.com Information Sources Early Amazon by Greg Linden How Linux saved Amazon millions Interview Werner Vogels - Amazon's CTO Asynchronous Architectures - a nice summary of Werner Vogels' talk by Chris Loosley Learning from the Amazon technology platform - A Conversation with Werner Vogels Werner Vogels' Weblog - building scalable and robus
3 0.15744984 1022 high scalability-2011-04-13-Paper: NoSQL Databases - NoSQL Introduction and Overview
Introduction: Christof Strauch, from Stuttgart Media University, has written an incredible 120+ page paper titled NoSQL Databases as an introduction and overview to NoSQL databases . The paper was written between 2010-06 and 2011-02, so it may be a bit out of date, but if you are looking to take in the NoSQL world in one big gulp, this is your chance. I asked Christof to give us a short taste of what he was trying to accomplish in his paper: The paper aims at giving a systematic and thorough introduction and overview of the NoSQL field by assembling information dispersed among blogs, wikis and scientific papers. It firstly discusses reasons, rationales and motives for the development and usage of nonrelational database systems. These can be summarized by the need for high scalability, the processing of large amounts of data, the ability to distribute data among many (often commodity) servers, consequently a distribution-aware design of DBMSs. The paper then introduces fundamental concepts,
4 0.1271293 38 high scalability-2007-07-30-Build an Infinitely Scalable Infrastructure for $100 Using Amazon Services
Introduction: Can you really create an infinitely scalable infrastructure for less than $100 using Amazon's storage, grid, and queuing services platform? It appears so, at least for the right application. Amazon beams a spotlight on the future battle of the roll-your-own versus the connect-the-dots approach to building next generation websites using core external services. Their argument is strong. Using Amazon's platform you can quickly build an infrastructure that would otherwise take an eternity to make, a pile of money to create, and an unbounded mass of people to implement and maintain. Yet Amazon doesn't provide SLAs, so can you really trust them with your crown jewels? Facebook recently leapfrogged Amazon's vision with an even more comprehensive set of services. The battle for the future is on. Site: http://aws.amazon.com/ Information Sources Slides: Building Highly Scalable Web Applications Podcast: Technometria: Amazon Web Services Amazon Services Home . Platform
5 0.12663791 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters
Introduction: Update: Streamy Explains CAP and HBase's Approach to CAP . We plan to employ inter-cluster replication, with each cluster located in a single DC. Remote replication will introduce some eventual consistency into the system, but each cluster will continue to be strongly consistent. Ryan Barrett, Google App Engine datastore lead, gave this talk Transactions Across Datacenters (and Other Weekend Projects) at the Google I/O 2009 conference. While the talk doesn't necessarily break new technical ground, Ryan does an excellent job explaining and evaluating the different options you have when architecting a system to work across multiple datacenters. This is called multihoming , operating from multiple datacenters simultaneously. As multihoming is one of the most challenging tasks in all computing, Ryan's clear and thoughtful style comfortably leads you through the various options. On the trip you learn: The different multi-homing options are: Backups, Master-Slave, Multi-M
6 0.11496449 1180 high scalability-2012-01-24-The State of NoSQL in 2012
7 0.11489492 480 high scalability-2008-12-30-Scalability Perspectives #5: Werner Vogels – The Amazon Technology Platform
8 0.10928847 1508 high scalability-2013-08-28-Sean Hull's 20 Biggest Bottlenecks that Reduce and Slow Down Scalability
10 0.10791803 507 high scalability-2009-02-03-Paper: Optimistic Replication
11 0.10771761 744 high scalability-2009-11-24-Hot Scalability Links for Nov 24 2009
12 0.10477489 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
13 0.10308234 1098 high scalability-2011-08-15-Should any cloud be considered one availability zone? The Amazon experience says yes.
14 0.10113873 787 high scalability-2010-03-03-Hot Scalability Links for March 3, 2010
15 0.10041384 750 high scalability-2009-12-16-Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud
17 0.098544598 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT
18 0.097463556 1033 high scalability-2011-05-02-The Updated Big List of Articles on the Amazon Outage
19 0.096170768 693 high scalability-2009-09-03-Storage Systems for High Scalable Systems presentation
20 0.095595077 1348 high scalability-2012-10-26-Stuff The Internet Says On Scalability For October 26, 2012
topicId topicWeight
[(0, 0.199), (1, 0.084), (2, 0.003), (3, 0.071), (4, -0.006), (5, 0.001), (6, -0.033), (7, -0.074), (8, 0.011), (9, -0.029), (10, -0.006), (11, 0.011), (12, -0.061), (13, -0.025), (14, 0.029), (15, -0.003), (16, 0.073), (17, -0.002), (18, 0.01), (19, -0.014), (20, 0.076), (21, 0.018), (22, 0.007), (23, -0.01), (24, -0.044), (25, -0.041), (26, 0.036), (27, 0.043), (28, 0.021), (29, 0.005), (30, -0.015), (31, 0.009), (32, 0.029), (33, -0.042), (34, 0.018), (35, -0.023), (36, 0.01), (37, 0.09), (38, 0.031), (39, 0.024), (40, -0.026), (41, -0.024), (42, -0.003), (43, -0.034), (44, 0.01), (45, 0.015), (46, -0.006), (47, -0.039), (48, 0.009), (49, -0.042)]
simIndex simValue blogId blogTitle
same-blog 1 0.9765932 139 high scalability-2007-10-30-Paper: Dynamo: Amazon’s Highly Available Key-value Store
2 0.81673855 96 high scalability-2007-09-18-Amazon Architecture
Introduction: This is a wonderfully informative Amazon update based on Joachim Rohde's discovery of an interview with Amazon's CTO. You'll learn about how Amazon organizes their teams around services, the CAP theorem of building scalable systems, how they deploy software, and a lot more. Many new additions from the ACM Queue article have also been included. Amazon grew from a tiny online bookstore to one of the largest stores on earth. They did it while pioneering new and interesting ways to rate, review, and recommend products. Greg Linden shared his version of Amazon's birth pangs in a series of blog articles. Site: http://amazon.com Information Sources Early Amazon by Greg Linden How Linux saved Amazon millions Interview Werner Vogels - Amazon's CTO Asynchronous Architectures - a nice summary of Werner Vogels' talk by Chris Loosley Learning from the Amazon technology platform - A Conversation with Werner Vogels Werner Vogels' Weblog - building scalable and robus
3 0.7910288 1431 high scalability-2013-03-29-Stuff The Internet Says On Scalability For March 29, 2013
Introduction: Hey, it's HighScalability time: ( Ukrainian daredevil scaling buildings) 44.6 billion - Tumblr posts; 300 Gb/s - DDoS DNS amplification attacks; 100 million - Eventbrite tickets processed. Quotable Quotes: @tveskov : Alan Kay: “The past 30 years have been completely mundane. It’s all been scaling (of old technology) and Angry Birds” @phrawzty : OH "Complexity is accelerating. We must augment our ability to manage it." #monitorama @stallent : new wave of apps that are bringing the server doing actual legit valuable work back into vogue know scaling is more than hard @calvdee : Fending off a 300Gb/s DDoS attacks would constitute a feat of #highscalability @solarce : "A spider farted in Finland and screwed up my IOPS!" -- @lusis #monitorama @cra : Listening to Shawn Pearce talk about scaling #git at Google with JGit... Android AOSP repos: 19.4GB, 2.5Mreq/day, 5.0TB/day #e
4 0.78787863 925 high scalability-2010-10-22-Paper: Netflix’s Transition to High-Availability Storage Systems
Introduction: In an audacious move for such an established property, Netflix is moving their website out of the comfort of their own datacenter and into the wilds of the Amazon cloud. This paper by Netflix's Siddharth “Sid” Anand, Netflix’s Transition to High-Availability Storage Systems , gives a detailed look at this transition and does a deep dive on SimpleDB best practices, focussing especially on techniques useful to those who are making the move from a RDBMS. Sid is going to give a talk at QCon based on this paper and he would appreciate your feedback. So if you have any comments or thoughts please comment here or email Sid at r39132@hotmail.com or Twitter at @r39132 Here's the introduction from the paper: Circa late 2008, Netflix had a single data center. This single data center raised a few concerns. As a single-point-of-failure (a.k.a. SPOF), it represented a liability – data center outages meant interruptions to service and negative customer impact. Additionally, with growth in both
Introduction: Many enterprises' high-availability architecture is based on the assumption that you can prevent failure from happening by putting all your critical data in a centralized database, back it up with expensive storage, and replicate it somehow between the sites. As I argued in one of my previous posts ( Why Existing Databases (RAC) are So Breakable! ) many of those assumptions are broken at their core, as storage is doomed to failure just like any other device, expensive hardware doesn’t make things any better and database replication is often not enough. One of the main lessons that we can take from the likes of Amazon and Google is that the right way to ensure continuous high availability is by designing our system to cope with failure. We need to assume that what we tend to think of as unthinkable will probably happen, as that’s the nature of failure. So rather than trying to prevent failures, we need to build a system that will tolerate them. As we can learn from a recent outage
7 0.7597506 288 high scalability-2008-03-25-Paper: On Designing and Deploying Internet-Scale Services
8 0.75415277 1028 high scalability-2011-04-22-Stuff The Internet Says On Scalability For April 22, 2011
9 0.74373335 1368 high scalability-2012-12-07-Stuff The Internet Says On Scalability For December 7, 2012
10 0.73645788 1348 high scalability-2012-10-26-Stuff The Internet Says On Scalability For October 26, 2012
11 0.73525292 757 high scalability-2010-01-04-11 Strategies to Rock Your Startup’s Scalability in 2010
12 0.73252928 1206 high scalability-2012-03-09-Stuff The Internet Says On Scalability For March 9, 2012
13 0.73123831 1420 high scalability-2013-03-08-Stuff The Internet Says On Scalability For March 8, 2013
14 0.72875512 950 high scalability-2010-11-30-NoCAP – Part III – GigaSpaces clustering explained..
15 0.72758871 1062 high scalability-2011-06-15-101 Questions to Ask When Considering a NoSQL Database
16 0.7242654 1487 high scalability-2013-07-05-Stuff The Internet Says On Scalability For July 5, 2013
17 0.72285658 1219 high scalability-2012-03-30-Stuff The Internet Says On Scalability For March 30, 2012
18 0.7210812 1513 high scalability-2013-09-06-Stuff The Internet Says On Scalability For September 6, 2013
19 0.71742284 1642 high scalability-2014-05-02-Stuff The Internet Says On Scalability For May 2nd, 2014
20 0.71560228 1527 high scalability-2013-10-04-Stuff The Internet Says On Scalability For October 4th, 2013
topicId topicWeight
[(1, 0.131), (2, 0.203), (5, 0.015), (10, 0.059), (18, 0.161), (30, 0.038), (40, 0.022), (61, 0.13), (79, 0.114), (85, 0.042), (94, 0.011)]
simIndex simValue blogId blogTitle
1 0.9390626 1140 high scalability-2011-11-10-Kill the Telcos Save the Internet - The Unsocial Network
Introduction: Someone is killing the Internet. Since you probably use the Internet every day you might find this surprising. It almost sounds silly, and the reason is technical, but our crack team of networking experts has examined the patient and made the diagnosis. What did they find? Diagnostic team: the Packet Pushers gang (Greg Ferro, Jan Zorz, Ivan Pepelnjak) in the podcast How We Are Killing the Internet. Diagnosis: invasive tunnelation. (tubes anyone?) Prognosis: even Dr. House might not be able to help. Cure: go back to what the Internet was; kill the tunnels; route IPv4 and IPv6; have public addresses on everything; disrupt the telcos. This is a classic story in a strange setting--the network--but the themes are universal: centralization vs. decentralization (that's where the telcos obviously come in), good vs. evil, order vs. disorder, tyranny vs. freedom, change vs. stasis, simplicity vs. complexity. And it's all being carried out on a battlefield few get
2 0.93317819 475 high scalability-2008-12-22-SLAs in the SaaS space
Introduction: This may be a bit higher level than the general discussion here, but I think this is an important issue in how it relates to reliability and uptime. What kind of SLAs should we be expecting from SaaS services and platforms (e.g. AWS, Google App Engine, Google Premium Apps, salesforce.com, etc.)? Up to today, most SaaS services either have no SLAs or offer very weak penalties. What will it take to get these services up to the point where they can offer the SLAs that users (and more importantly, businesses) require? I presume most of the members here want to see more movement into the cloud and to SaaS services, and I'm thinking that until we see more substantial SLA guarantees, most businesses will continue to shy away as long as they can. Would love to hear what others think. Or am I totally off base?
same-blog 3 0.92946702 139 high scalability-2007-10-30-Paper: Dynamo: Amazon’s Highly Available Key-value Store
4 0.92373997 1344 high scalability-2012-10-19-Stuff The Internet Says On Scalability For October 19, 2012
Introduction: It's HighScalability Time: @davilagrau : Youtube, GitHub,..., Are cloud services facing a entropic limit to scalability? Async all the way down? The Tyranny of the Clock : The cost of logic and memory dominated Turing's thinking, but today, communication rather than logic should dominate our thinking. Clock-free design uses less than half, about 40%, as much energy per addition as its clocked counterpart. We can regain the efficiency of local decision making by revolting against the pervasive beat of an external clock. Why Google Compute Engine for OpenStack . Smart move. Having OpenStack work inside a super charged cloud, in private clouds, and as a bridge between the two ought to be quite attractive to developers looking for some sort of ally for independence. All it will take are a few victories to cement new alliances. 3 Lessons That Startups Can Learn From Facebook’s Failed Credits Experiment . I thought this was a great idea too. So what happened? FACEBOOK DID NOT
5 0.90154153 379 high scalability-2008-09-04-Database question for upcoming project
Introduction: We will be developing an RIA that will have a lot of database access. Think something like QuickBooks but with about 50 transactions entered per hour per user. Users will be in the system for 7 to 9 hours a day and there will be around 20,000 users, all logged in at the same time. Reporting will be done just like a QuickBooks style app plus a lot of extra things you don't do in QuickBooks. Our operations team is familiar with W2003 Server and MS SQL Server so they are recommending we stick with that. I originally requested Linux and PostgreSQL. How far can a single database server get me? If we have a 4 processor, 8 core, 128 GB server, how far am I going to get before I need to shard or do something else? I know there are a lot of factors involved but in general for this size of a site, what should the strategy be? I've read almost all articles on this website but most of the applications are not RIA type of apps with this type of usage or they are architectures for
7 0.8759613 1654 high scalability-2014-06-05-Cloud Architecture Revolution
8 0.86796623 1153 high scalability-2011-12-08-Update on Scalable Causal Consistency For Wide-Area Storage With COPS
9 0.8671096 187 high scalability-2007-12-14-The Current Pros and Cons List for SimpleDB
10 0.86630428 1147 high scalability-2011-11-25-Stuff The Internet Says On Scalability For November 25, 2011
11 0.86559355 857 high scalability-2010-07-13-DbShards Part Deux - The Internals
12 0.8655172 1600 high scalability-2014-02-21-Stuff The Internet Says On Scalability For February 21st, 2014
13 0.86517406 998 high scalability-2011-03-03-Stack Overflow Architecture Update - Now at 95 Million Page Views a Month
14 0.86513782 961 high scalability-2010-12-21-SQL + NoSQL = Yes !
15 0.86465567 1302 high scalability-2012-08-10-Stuff The Internet Says On Scalability For August 10, 2012
16 0.86461669 1020 high scalability-2011-04-12-Caching and Processing 2TB Mozilla Crash Reports in memory with Hazelcast
17 0.86412871 1559 high scalability-2013-12-06-Stuff The Internet Says On Scalability For December 6th, 2013
18 0.86367726 931 high scalability-2010-10-28-Notes from A NOSQL Evening in Palo Alto
19 0.86259031 856 high scalability-2010-07-12-Creating Scalable Digital Libraries
20 0.86258441 1109 high scalability-2011-09-02-Stuff The Internet Says On Scalability For September 2, 2011