high_scalability high_scalability-2009 high_scalability-2009-734 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Life beyond Distributed Transactions: an Apostate’s Opinion by Pat Helland. In particular, we focus on the implications that fall out of assuming we cannot have large-scale distributed transactions. Tragedy of the Commons, and Cold Starts - Cold application starts on Google App Engine kill your application's responsiveness. Intel’s 1M IOPS desktop SSD setup by Kevin Burton. What do you get when you take 7 Intel SSDs and throw them in a desktop? 1M IOPS. Videos from NoSQL Berlin sessions. Nicely done talks on CAP, MongoDB, Redis, 4th generation object databases, CouchDB, and Riak. Designs, Lessons and Advice from Building Large Distributed Systems by Jeff Dean of Google describing how they do their thing. Here are some glosses on the talk by Greg Linden and James Hamilton. You really can't do better than Greg and James. Advice from Google on Large Distributed Systems by Greg Linden. A nice summary of Jeff Dean's talk. A standard Google server
sentIndex sentText sentNum sentScore
1 Life beyond Distributed Transactions: an Apostate’s Opinion by Pat Helland. [sent-1, score-0.072]
2 In particular, we focus on the implications that fall out of assuming we cannot have large-scale distributed transactions. [sent-2, score-0.189]
3 Tragedy of the Commons, and Cold Starts - Cold application starts on Google App Engine kill your application's responsiveness. [sent-3, score-0.091]
4 Intel’s 1M IOPS desktop SSD setup by Kevin Burton. [sent-4, score-0.186]
5 What do you get when you take 7 Intel SSDs and throw them in a desktop? [sent-5, score-0.09]
6 Nicely done talks on CAP, MongoDB, Redis, 4th generation object databases, CouchDB, and Riak. [sent-7, score-0.2]
7 Designs, Lessons and Advice from Building Large Distributed Systems by Jeff Dean of Google describing how they do their thing. [sent-8, score-0.101]
8 A standard Google server appears to have about 16G RAM and 2T of disk; Things will crash. [sent-13, score-0.098]
9 When designing for scale, you should design for expected load, ensure it still works at x10, but don't worry about scaling to x100. [sent-15, score-0.223]
10 A data center wide storage hierarchy; Failure Inevitable; Excellent set of distributed systems rules of thumb; Typical first year for a new cluster; GFS Usage at Google; Working on next generation Big Table system called Spanner. [sent-17, score-0.447]
wordName wordTfidf (topN-words)
[('dean', 0.336), ('greg', 0.263), ('advice', 0.246), ('cold', 0.194), ('desktop', 0.186), ('google', 0.186), ('jeff', 0.178), ('apostate', 0.176), ('james', 0.172), ('distributed', 0.161), ('commons', 0.152), ('pat', 0.143), ('systemsby', 0.14), ('berlin', 0.136), ('thumb', 0.134), ('generation', 0.132), ('lessons', 0.13), ('gfs', 0.123), ('inevitable', 0.123), ('starts', 0.118), ('opinion', 0.117), ('spanner', 0.117), ('hierarchy', 0.107), ('linden', 0.106), ('assuming', 0.103), ('kevin', 0.103), ('describing', 0.101), ('appears', 0.098), ('fall', 0.096), ('iops', 0.094), ('implications', 0.093), ('nicely', 0.092), ('kill', 0.091), ('ssds', 0.091), ('throw', 0.09), ('cap', 0.087), ('intel', 0.087), ('couchdb', 0.087), ('large', 0.081), ('rules', 0.079), ('expected', 0.077), ('worry', 0.076), ('systems', 0.075), ('ssd', 0.074), ('beyond', 0.072), ('summary', 0.07), ('design', 0.07), ('building', 0.069), ('redis', 0.069), ('talks', 0.068)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 734 high scalability-2009-10-30-Hot Scalabilty Links for October 30 2009
2 0.19491301 409 high scalability-2008-10-13-Challenges from large scale computing at Google
Introduction: From Greg Linden on a talk Google Fellow Jeff Dean gave last week at University of Washington Computer Science titled "Research Challenges Inspired by Large-Scale Computing at Google" : Coming away from the talk, the biggest points for me were the considerable interest in reducing costs (especially reducing power costs), the suggestion that the Google cluster may eventually contain 10M machines at 1k locations, and the call to action for researchers on distributed systems and databases to think orders of magnitude bigger than they often are, not about running on hundreds of machines in one location, but hundreds of thousands of machines across many locations.
3 0.17865458 1328 high scalability-2012-09-24-Google Spanner's Most Surprising Revelation: NoSQL is Out and NewSQL is In
Introduction: Google recently released a paper on Spanner , their planet enveloping tool for organizing the world’s monetizable information. Reading the Spanner paper I felt it had that chiseled in stone feel that all of Google’s best papers have. An instant classic. Jeff Dean foreshadowed Spanner’s humungousness as early as 2009 . Now Spanner seems fully online, just waiting to handle “millions of machines across hundreds of datacenters and trillions of database rows.” Wow. The Wise have yet to weigh in on Spanner en masse. I look forward to more insightful commentary. There’s a lot to make sense of. What struck me most in the paper was a deeply buried section essentially describing Google’s motivation for shifting away from NoSQL and to NewSQL . The money quote: We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions. This rea
4 0.17235148 309 high scalability-2008-04-23-Behind The Scenes of Google Scalability
Introduction: The recent Data-Intensive Computing Symposium brought together experts in system design, programming, parallel algorithms, data management, scientific applications, and information-based applications to better understand existing capabilities in the development and application of large-scale computing systems, and to explore future opportunities. Google Fellow Jeff Dean had a very interesting presentation on Handling Large Datasets at Google: Current Systems and Future Directions. He discussed: • Hardware infrastructure • Distributed systems infrastructure: –Scheduling system –GFS –BigTable –MapReduce • Challenges and Future Directions –Infrastructure that spans all datacenters –More automation It is really like a "How does Google work" presentation in ~60 slides? Check out the slides and the video !
5 0.13113619 288 high scalability-2008-03-25-Paper: On Designing and Deploying Internet-Scale Services
Introduction: Greg Linden links to a heavily lesson ladened LISA 2007 paper titled On Designing and Deploying Internet-Scale Services by James Hamilton of the Windows Live Services Platform group. I know people crave nitty-gritty details, but this isn't a how to configure a web server article. It hitches you to a rocket and zooms you up to 50,000 feet so you can take a look at best web operations practices from a broad, yet practical perspective. The author and his team of contributors obviously have a lot of in the trenches experience. Many non-obvious topics are covered. And there's a lot to learn from. The paper has too many details to cover here, but the big sections are: Recommendations Automatic Management and Provisioning Dependency Management Release Cycle and Testing Operations and Capacity Planning Graceful Degradation and Admission Control Customer Self-Provisioning and Self-Help Customer and Press Communication Plan In the recommendations we see some of our o
6 0.12622534 1291 high scalability-2012-07-25-Vertical Scaling Ascendant - How are SSDs Changing Architectures?
7 0.12553932 448 high scalability-2008-11-22-Google Architecture
8 0.12352481 1520 high scalability-2013-09-20-Stuff The Internet Says On Scalability For September 20, 2013
9 0.11505577 1535 high scalability-2013-10-21-Google's Sanjay Ghemawat on What Made Google Google and Great Big Data Career Advice
11 0.10765267 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
12 0.10741648 257 high scalability-2008-02-22-Kevin's Great Adventures in SSDland
13 0.10658537 517 high scalability-2009-02-21-Google AppEngine - A Second Look
14 0.10504031 946 high scalability-2010-11-22-Strategy: Google Sends Canary Requests into the Data Mine
15 0.10440327 1479 high scalability-2013-06-21-Stuff The Internet Says On Scalability For June 21, 2013
16 0.10289138 978 high scalability-2011-01-26-Google Pro Tip: Use Back-of-the-envelope-calculations to Choose the Best Design
17 0.10240961 650 high scalability-2009-07-02-Product: Hbase
18 0.096811973 1089 high scalability-2011-07-29-Stuff The Internet Says On Scalability For July 29, 2011
19 0.094410792 666 high scalability-2009-07-30-Learn How to Think at Scale
20 0.092922345 1055 high scalability-2011-06-08-Stuff to Watch from Google IO 2011
topicId topicWeight
[(0, 0.154), (1, 0.075), (2, 0.004), (3, 0.101), (4, 0.019), (5, 0.072), (6, -0.021), (7, 0.03), (8, 0.079), (9, 0.03), (10, -0.014), (11, -0.096), (12, -0.035), (13, 0.012), (14, -0.008), (15, -0.004), (16, -0.028), (17, -0.048), (18, 0.04), (19, -0.012), (20, 0.101), (21, 0.026), (22, -0.048), (23, -0.117), (24, -0.043), (25, 0.004), (26, 0.024), (27, 0.066), (28, -0.117), (29, 0.023), (30, 0.026), (31, -0.062), (32, 0.056), (33, -0.006), (34, -0.013), (35, 0.003), (36, -0.006), (37, -0.035), (38, 0.028), (39, 0.047), (40, 0.045), (41, -0.045), (42, 0.007), (43, 0.029), (44, -0.008), (45, -0.072), (46, -0.024), (47, -0.028), (48, -0.021), (49, -0.01)]
simIndex simValue blogId blogTitle
same-blog 1 0.97555333 734 high scalability-2009-10-30-Hot Scalabilty Links for October 30 2009
2 0.77847666 1328 high scalability-2012-09-24-Google Spanner's Most Surprising Revelation: NoSQL is Out and NewSQL is In
Introduction: Google recently released a paper on Spanner , their planet enveloping tool for organizing the world’s monetizable information. Reading the Spanner paper I felt it had that chiseled in stone feel that all of Google’s best papers have. An instant classic. Jeff Dean foreshadowed Spanner’s humungousness as early as 2009 . Now Spanner seems fully online, just waiting to handle “millions of machines across hundreds of datacenters and trillions of database rows.” Wow. The Wise have yet to weigh in on Spanner en masse. I look forward to more insightful commentary. There’s a lot to make sense of. What struck me most in the paper was a deeply buried section essentially describing Google’s motivation for shifting away from NoSQL and to NewSQL . The money quote: We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions. This rea
3 0.77495682 409 high scalability-2008-10-13-Challenges from large scale computing at Google
Introduction: From Greg Linden on a talk Google Fellow Jeff Dean gave last week at University of Washington Computer Science titled "Research Challenges Inspired by Large-Scale Computing at Google" : Coming away from the talk, the biggest points for me were the considerable interest in reducing costs (especially reducing power costs), the suggestion that the Google cluster may eventually contain 10M machines at 1k locations, and the call to action for researchers on distributed systems and databases to think orders of magnitude bigger than they often are, not about running on hundreds of machines in one location, but hundreds of thousands of machines across many locations.
Introduction: In a People of ACM interview with Sanjay Ghemawat , a Google Fellow in the Systems Infrastructure Group (MapReduce, BigTable, Spanner, GFS, etc), talks about a few interesting aspects of Google's culture. What Made Google Google Progress is a modern idea. The conviction that future can be changed for the better through individual advancement and action has over hundreds of years driven an exponential growth in the technome. What drives progress? Challenges. Individuals finding and defeating a challenge. There's usually something someone wants to do so badly that they put in the effort, the thought, and the money into solving all the problems. The results are often something new and amazing. And so it was for Google: The main motivation behind the development of much of Google's infrastructure was the challenge of keeping up with ever-growing data sets. For example, at the same time Google's web search was gaining usage very quickly, we were also scaling up the size of ou
5 0.71064085 1117 high scalability-2011-09-16-Stuff The Internet Says On Scalability For September 16, 2011
Introduction: Between love and madness lies HighScalability : Google now 10x better : MapReduce sorts 1 petabyte of data using 8000 computers in 33 minutes; 1 Billion on Social Networks ; Tumblr at 10 Billion Posts ; Twitter at 100 Million Users ; Testing at Google Scale : 1800 builds, 120 million test suites, 60 million tests run daily. From the Dash Memo on Google's Plan: Go is a very promising systems-programming language in the vein of C++. We fully hope and expect that Go becomes the standard back-end language at Google over the next few years. On GAE Go can load from a cold start in 100ms and the typical instance size is 4MB. Is it any wonder Go is a go? Should we expect to see Java and Python deprecated because Go is so much cheaper to run at scale? Potent Quotables: @caciufo : 30x more scalability w/ many-core. So perf doesn't have to level out or vex programmers. #IDF2011 @joerglew : Evaluating divide&conquer vs. master-slave architecture for wor
6 0.70848012 1107 high scalability-2011-08-29-The Three Ages of Google - Batch, Warehouse, Instant
8 0.69582671 309 high scalability-2008-04-23-Behind The Scenes of Google Scalability
9 0.69540817 75 high scalability-2007-08-28-Google Utilities : An online google guide,tools and Utilities.
10 0.68981206 871 high scalability-2010-08-04-Dremel: Interactive Analysis of Web-Scale Datasets - Data as a Programming Paradigm
11 0.68407446 211 high scalability-2008-01-13-Google Reveals New MapReduce Stats
12 0.67499405 640 high scalability-2009-06-28-Google Voice Architecture
13 0.66779387 483 high scalability-2009-01-04-Paper: MapReduce: Simplified Data Processing on Large Clusters
14 0.66174269 1143 high scalability-2011-11-16-Google+ Infrastructure Update - the JavaScript Story
15 0.65741384 223 high scalability-2008-01-25-Google: Introduction to Distributed System Design
16 0.65278327 362 high scalability-2008-08-11-Distributed Computing & Google Infrastructure
17 0.64832163 1010 high scalability-2011-03-24-Strategy: Disk Backup for Speed, Tape Backup to Save Your Bacon, Just Ask Google
18 0.64108366 1181 high scalability-2012-01-25-Google Goes MoreSQL with Tenzing - SQL Over MapReduce
19 0.63632935 949 high scalability-2010-11-29-Stuff the Internet Says on Scalability For November 29th, 2010
20 0.62809348 1479 high scalability-2013-06-21-Stuff The Internet Says On Scalability For June 21, 2013
topicId topicWeight
[(1, 0.095), (2, 0.231), (10, 0.081), (30, 0.039), (47, 0.02), (61, 0.095), (77, 0.044), (79, 0.141), (83, 0.14), (85, 0.017)]
simIndex simValue blogId blogTitle
same-blog 1 0.93447906 734 high scalability-2009-10-30-Hot Scalabilty Links for October 30 2009
2 0.91069466 963 high scalability-2010-12-23-Paper: CRDTs: Consistency without concurrency control
Introduction: For a great Christmas read forget The Night Before Christmas , a heart warming poem written by Clement Moore for his children, that created the modern idea of Santa Claus we all know and anticipate each Christmas eve. Instead, curl up with some potent eggnog , nog being any drink made with rum, and read CRDTs: Consistency without concurrency control by Mihai Letia, Nuno Preguiça, and Marc Shapiro, which talks about CRDTs (Commutative Replicated Data Type), a data type whose operations commute when they are concurrent . From the introduction, which also serves as a nice concise overview of distributed consistency issues: Shared read-only data is easy to scale by using well-understood replication techniques. However, sharing mutable data at a large scale is a difficult problem, because of the CAP impossibility result [5]. Two approaches dominate in practice. One ensures scalability by giving up consistency guarantees, for instance using the Last-Writer-Wins (LWW) approach [
3 0.91015303 236 high scalability-2008-02-03-Ideas on how to scale a shared inventory database???
Introduction: We have a database today that holds all of our shared inventory. How do we scale out? We run into concurrency issues today, as multiple users may want to access the same inventory, etc. I'm sure it's a common problem. So how do folks implement this while also getting faster responses on available inventory and ensuring no downtime? Thanks
4 0.90239489 1203 high scalability-2012-03-02-Stuff The Internet Says On Scalability For March 2, 2012
Introduction: Please don't squeeze the HighScalability: Quotable quotes: @karmafile : "Scalability" is a much more evil word than we make it out to be @ostaquet : More hardware won't solve #SQL resp. time issues; proper indexing does. @datachick : All computing technology is the rearrangement of data. Data is the center of the universe @jamesurquhart : "Complexity is a characteristic of the system, not of the parts in it." Data is the star of the cat walk, looking fierce in Ilya Katsov's impeccably constructed post on NoSQL Data Modeling Techniques : In this article I provide a short comparison of NoSQL system families from the data modeling point of view and digest several common modeling techniques. Peter Burns talks computer nanosecond time scales as a human might experience them. Your memory == computer registers , L1 cache == papers kept close by, L2 cache == books, RAM == the library down the street, and going to disk is a 3 year odyssey for data. F
Introduction: Performance guru Steve Souders gave his keynote presentation, Cache is King! ( slides ), at the HTML5DevCon, besides being an extremely clear explanation of how caching works on the Internet and how to optimize your use of HTTP to get the best performance, Steve ran experiments that found some surprising results on what gave the best web site performance improvements. In his base line test, page loads took 7.65 seconds (median of three runs). What change--Fast Network, No Javascript, or Primed Cache--would make the biggest performance improvement? It was Primed Cache. Fast Network - Using a fast FIOS network the load time was 4.13 seconds. Steve was surprised how big a difference this made, given how much work must happen in the browser. No JavaScript - 4.74 seconds after disabling JavaScript. Both reduces transfers and skips parsing by the browser. Steve thought the effect would have been larger. Primed Cache - 3.46 seconds using a warm cache, less than half than the
6 0.89915556 1291 high scalability-2012-07-25-Vertical Scaling Ascendant - How are SSDs Changing Architectures?
7 0.89752638 1026 high scalability-2011-04-18-6 Ways Not to Scale that Will Make You Hip, Popular and Loved By VCs
8 0.8971954 1159 high scalability-2011-12-19-How Twitter Stores 250 Million Tweets a Day Using MySQL
9 0.89137477 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters
10 0.89107358 780 high scalability-2010-02-19-Twitter’s Plan to Analyze 100 Billion Tweets
11 0.88970393 1207 high scalability-2012-03-12-Google: Taming the Long Latency Tail - When More Machines Equals Worse Results
12 0.88897514 1112 high scalability-2011-09-07-What Google App Engine Price Changes Say About the Future of Web Architecture
13 0.88823229 1186 high scalability-2012-02-02-The Data-Scope Project - 6PB storage, 500GBytes-sec sequential IO, 20M IOPS, 130TFlops
14 0.88769776 1491 high scalability-2013-07-15-Ask HS: What's Wrong with Twitter, Why Isn't One Machine Enough?
15 0.88701528 709 high scalability-2009-09-19-Space Based Programming in .NET
16 0.88565814 619 high scalability-2009-06-05-HotPads Shows the True Cost of Hosting on Amazon
17 0.8854143 1148 high scalability-2011-11-29-DataSift Architecture: Realtime Datamining at 120,000 Tweets Per Second
18 0.88537931 672 high scalability-2009-08-06-An Unorthodox Approach to Database Design : The Coming of the Shard
19 0.88514179 1460 high scalability-2013-05-17-Stuff The Internet Says On Scalability For May 17, 2013
20 0.88458854 1395 high scalability-2013-01-28-DuckDuckGo Architecture - 1 Million Deep Searches a Day and Growing