high_scalability high_scalability-2008 high_scalability-2008-387 knowledge-graph by maker-knowledge-mining
Introduction: How do we scale datacenters? Should we build a few mammoth million-machine datacenters or many smaller micro datacenters? Intuitively we usually go with a bigger-is-better, economies-of-scale argument, but it may not be so. What works for Walmart may not work for White Box World. Mega datacenters may actually exhibit diseconomies of scale. It may be better to run applications over many distributed micro datacenters instead of one large one. This paper by Ken Church, Albert Greenberg, and James Hamilton, all from Microsoft, takes a look at the different issues and concludes: "Putting it all together, the micro model offers a design point with attractive performance, reliability, scale and cost. Given how much the industry is currently investing in the mega model, the industry would do well to consider the micro alternative." Related Articles: Embarrassingly Distributed Cloud Services by James Hamilton; Diseconomies of Scale by James Hamilton. Architecture
sentIndex sentText sentNum sentScore
1 Should we build a few mammoth million machine datacenters or many smaller micro datacenters? [sent-2, score-0.638]
2 Intuitively we usually go with a bigger is better economies of scale type argument, but it may not be so. [sent-3, score-0.355]
3 What works for Walmart may not work for White Box World. [sent-4, score-0.105]
4 It may be better to run applications over many distributed micro datacenters instead of one large one. [sent-6, score-0.743]
5 This paper by Ken Church, Albert Greenberg, and James Hamilton, all from Microsoft, takes a look at the different issues and concludes: Putting it all together, the micro model offers a design point with attractive performance, reliability, scale and cost. [sent-7, score-0.649]
6 Given how much the industry is currently investing in the mega model, the industry would do well to consider the micro alternative. [sent-8, score-0.912]
7 Enterprise Data Center Design and Methodology is a practical guide to designing a data center from inception through construction. [sent-12, score-0.488]
8 The fundamental design principles take a simple, flexible, and modular approach based on accurate, real-world requirements and capacities. [sent-13, score-0.666]
9 This approach contradicts the conventional (but totally inadequate) method of using square footage to determine basic capacities like power and cooling requirements. [sent-14, score-1.117]
wordName wordTfidf (topN-words)
[('datacenters', 0.338), ('james', 0.311), ('micro', 0.3), ('mega', 0.253), ('modular', 0.193), ('servicesby', 0.159), ('albert', 0.159), ('contradicts', 0.159), ('footage', 0.159), ('center', 0.146), ('inadequate', 0.143), ('inception', 0.137), ('church', 0.137), ('capacities', 0.137), ('design', 0.127), ('ken', 0.126), ('investing', 0.121), ('scaleby', 0.121), ('walmart', 0.121), ('economies', 0.119), ('concludes', 0.119), ('industry', 0.119), ('methodology', 0.115), ('exhibit', 0.115), ('cooling', 0.115), ('requirements', 0.114), ('rob', 0.108), ('may', 0.105), ('conventional', 0.095), ('square', 0.093), ('accurate', 0.091), ('argument', 0.088), ('hamilton', 0.087), ('attractive', 0.084), ('white', 0.084), ('fundamental', 0.082), ('totally', 0.077), ('guide', 0.076), ('method', 0.076), ('principles', 0.075), ('approach', 0.075), ('break', 0.072), ('model', 0.071), ('determine', 0.069), ('practical', 0.068), ('scale', 0.067), ('putting', 0.066), ('bigger', 0.064), ('basic', 0.062), ('designing', 0.061)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 387 high scalability-2008-09-22-Paper: On Delivering Embarrassingly Distributed Cloud Services
2 0.1595272 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters
Introduction: Update: Streamy Explains CAP and HBase's Approach to CAP. We plan to employ inter-cluster replication, with each cluster located in a single DC. Remote replication will introduce some eventual consistency into the system, but each cluster will continue to be strongly consistent. Ryan Barrett, Google App Engine datastore lead, gave this talk Transactions Across Datacenters (and Other Weekend Projects) at the Google I/O 2009 conference. While the talk doesn't necessarily break new technical ground, Ryan does an excellent job explaining and evaluating the different options you have when architecting a system to work across multiple datacenters. This is called multihoming, operating from multiple datacenters simultaneously. As multihoming is one of the most challenging tasks in all computing, Ryan's clear and thoughtful style comfortably leads you through the various options. On the trip you learn: The different multi-homing options are: Backups, Master-Slave, Multi-M
Introduction: All in all this is still my favorite post and I still think it's an accurate vision of a future. Not everyone agrees, but I guess we'll see... "But it is not complicated. [There's] just a lot of it." --Richard Feynman on how the immense variety of the world arises from simple rules. Contents: Have We Reached the End of Scaling?; Applications Become Black Boxes Using Markets to Scale and Control Costs; Let's Welcome our Neo-Feudal Overlords; The Economic Argument for the Ambient Cloud; What Will Kill the Cloud?; The Amazing Collective Compute Power of the Ambient Cloud; Using the Ambient Cloud as an Application Runtime; Applications as Virtual States; Conclusion. We have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications. Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but c
4 0.097366169 750 high scalability-2009-12-16-Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud
Introduction: "But it is not complicated. [There's] just a lot of it." --Richard Feynman on how the immense variety of the world arises from simple rules. Contents: Have We Reached the End of Scaling?; Applications Become Black Boxes Using Markets to Scale and Control Costs; Let's Welcome our Neo-Feudal Overlords; The Economic Argument for the Ambient Cloud; What Will Kill the Cloud?; The Amazing Collective Compute Power of the Ambient Cloud; Using the Ambient Cloud as an Application Runtime; Applications as Virtual States; Conclusion. We have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications. Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but consider we have no planet wide applications yet. None. Tomorrow the numbers foreshadow a new Cambrian explosion of connectivity that will look as
5 0.091798961 734 high scalability-2009-10-30-Hot Scalabilty Links for October 30 2009
Introduction: Life beyond Distributed Transactions: an Apostate’s Opinion by Pat Helland. In particular, we focus on the implications that fall out of assuming we cannot have large-scale distributed transactions. Tragedy of the Commons, and Cold Starts - Cold application starts on Google App Engine kill your application's responsiveness. Intel’s 1M IOPS desktop SSD setup by Kevin Burton. What do you get when you take 7 Intel SSDs and throw them in a desktop? 1M IOPS. Videos from NoSQL Berlin sessions. Nicely done talks on CAP, MongoDB, Redis, 4th generation object databases, CouchDB, and Riak. Designs, Lessons and Advice from Building Large Distributed Systems by Jeff Dean of Google describing how they do their thing. Here are some glosses on the talk by Greg Linden and James Hamilton. You really can't do better than Greg and James. Advice from Google on Large Distributed Systems by Greg Linden. A nice summary of Jeff Dean's talk. A standard Google server
6 0.091263562 288 high scalability-2008-03-25-Paper: On Designing and Deploying Internet-Scale Services
7 0.089763373 1493 high scalability-2013-07-17-Steve Ballmer Says Microsoft has Over 1 Million Servers - What Does that Really Mean?
9 0.08594846 806 high scalability-2010-04-08-Hot Scalability Links for April 8, 2010
10 0.081642762 827 high scalability-2010-05-14-Hot Scalability Links for May 14, 2010
11 0.080728278 649 high scalability-2009-07-02-Product: Facebook's Cassandra - A Massive Distributed Store
12 0.080208831 284 high scalability-2008-03-19-RAD Lab is Creating a Datacenter Operating System
13 0.077092804 1419 high scalability-2013-03-07-It's a VM Wasteland - A Near Optimal Packing of VMs to Machines Reduces TCO by 22%
14 0.07603275 1392 high scalability-2013-01-23-Building Redundant Datacenter Networks is Not For Sissies - Use an Outside WAN Backbone
15 0.075672559 736 high scalability-2009-11-04-Damn, Which Database do I Use Now?
16 0.075328745 761 high scalability-2010-01-17-Applications Become Black Boxes Using Markets to Scale and Control Costs
17 0.074381292 595 high scalability-2009-05-08-Publish-subscribe model does not scale?
18 0.074183419 913 high scalability-2010-10-01-Hot Scalability Links For Oct 1, 2010
19 0.074142948 1287 high scalability-2012-07-20-Stuff The Internet Says On Scalability For July 20, 2012
20 0.073287927 1527 high scalability-2013-10-04-Stuff The Internet Says On Scalability For October 4th, 2013
topicId topicWeight
[(0, 0.118), (1, 0.041), (2, 0.021), (3, 0.075), (4, -0.034), (5, 0.011), (6, -0.022), (7, -0.013), (8, -0.021), (9, 0.015), (10, -0.005), (11, 0.01), (12, -0.042), (13, 0.02), (14, 0.023), (15, 0.028), (16, 0.011), (17, 0.008), (18, -0.018), (19, 0.024), (20, 0.024), (21, 0.056), (22, -0.057), (23, -0.043), (24, -0.004), (25, 0.019), (26, 0.003), (27, 0.014), (28, -0.016), (29, -0.025), (30, 0.001), (31, 0.029), (32, -0.011), (33, 0.008), (34, -0.002), (35, 0.026), (36, 0.002), (37, 0.0), (38, 0.003), (39, 0.028), (40, 0.007), (41, 0.001), (42, -0.018), (43, 0.012), (44, 0.029), (45, -0.053), (46, 0.003), (47, -0.011), (48, 0.023), (49, -0.029)]
simIndex simValue blogId blogTitle
same-blog 1 0.93799341 387 high scalability-2008-09-22-Paper: On Delivering Embarrassingly Distributed Cloud Services
2 0.77916348 284 high scalability-2008-03-19-RAD Lab is Creating a Datacenter Operating System
Introduction: The RAD Lab (Reliable Adaptive Distributed Systems Laboratory) wants to leapfrog the Big Switch and create The Next Big Switch, skipping the cloud/utility evolutionary stage altogether. This hyper-evolutionary niche buster develops technology so advanced the cloud disperses and you can go back to building your own personal datacenters again. Where Google took years to create their datacenters, using a prefab Datacenter Operating System you might create your own in a long holiday weekend. Not St. Patrick's of course. Their vision: Enable one person to invent and run the next revolutionary IT service, operationally expressing a new business idea as a multi-million-user service over the course of a long weekend. By doing so we hope to enable an Internet "Fortune 1 million". How? By wizardry in the form of a “datacenter operating system” created from a pinch of "statistical machine learning (SML)" and a tincture of "recent insights from networking and distributed systems." Bu
Introduction: Google has released an epic second edition of their groundbreaking The Datacenter as a Computer book. It's called an introduction, but at 156 pages I would love to see what the Advanced version would look like! John Fries in a G+ comment has what I think is a perfect summary of the ultimate sense of the book: "It's funny, when I was at Google I was initially quite intimidated by interacting with an enormous datacenter, and then I started imagining the entire datacenter was shrunk down into a small box sitting on my desk, and realized it was just another machine and the physical size didn't matter anymore." It's such a far-ranging book that it's impossible to characterize simply. It covers an amazing diversity of topics, from an introduction to warehouse-scale computing; workloads and software infrastructure; hardware; datacenter architecture; energy and power efficiency; cost structures; how to deal with failures and repairs; and it closes with a discussion of key challenge
4 0.69625854 1219 high scalability-2012-03-30-Stuff The Internet Says On Scalability For March 30, 2012
Introduction: Choosy Mothers Choose HighScalability: Quotable quotes: @itarradellas: "Revolutions in science have often been preceded by revolutions in measurement" @jasongorman: Use dependency injection, not Spring. Use event-driven, asynchronous I/O, not Node.js. Use MVC, not ASP.NET MVC etc etc @bernardgolden: #netflix uses most aggressive #aws reservation system. Gets pricing down to ~33% of "list" pricing. @ikarzali: Hey, for all facebook's talk at scalability conferences, I have to say Timeline is super slow(!) Howz that memcache workin out for you now? Yahoo!: Amazon's Game-Changing Cloud Was Built By Some Guys In South Africa. Foursquare: 1.5 billion check-ins from 15 million people at 30 million different places. How OMGPOP scaled to 36 million users in three weeks. Draw Something has been downloaded 35+ million times; 1 billion pictures created at 3,000 pictures per second; Couchbase is used as the database; SoftLayer is thei
5 0.69276178 1327 high scalability-2012-09-21-Stuff The Internet Says On Scalability For September 21, 2012
Introduction: It's HighScalability Time: @5h15h: Walmart took 40 years to get their data warehouse at 400 terabytes. Facebook probably generates that every 4 days. Should your database failover automatically or wait for the guiding hands of a helpful human? Jeremy Zawodny in Handling Database Failover at Craigslist says Craigslist and Yahoo! handle failovers manually. Knowing when a failure has happened is so error prone it's better to put a human breaker in the loop. Others think this could be an SLA buster as write requests can't be processed while the decision is being made. Main issue is knowing anything is true in a distributed system is hard. Review of a paper about scalable things, MPI, and granularity. If you like to read informed critiques that begin with phrases like "this is simply not true" or "utter garbage" then you might find this post by Sébastien Boisvert to be entertaining. The Big Switch: How We Rebuilt Wanelo from Scratch and Lived to Tell About It. Complete
6 0.68030673 1182 high scalability-2012-01-27-Stuff The Internet Says On Scalability For January 27, 2012
7 0.67583066 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters
8 0.67442483 879 high scalability-2010-08-12-Think of Latency as a Pseudo-permanent Network Partition
9 0.66839653 806 high scalability-2010-04-08-Hot Scalability Links for April 8, 2010
10 0.66637737 1137 high scalability-2011-11-04-Stuff The Internet Says On Scalability For November 4, 2011
11 0.66120446 1530 high scalability-2013-10-11-Stuff The Internet Says On Scalability For October 11th, 2013
12 0.65649569 839 high scalability-2010-06-09-Paper: Propagation Networks: A Flexible and Expressive Substrate for Computation
13 0.65526074 913 high scalability-2010-10-01-Hot Scalability Links For Oct 1, 2010
14 0.65369636 1270 high scalability-2012-06-22-Stuff The Internet Says On Scalability For June 22, 2012
15 0.65285861 1344 high scalability-2012-10-19-Stuff The Internet Says On Scalability For October 19, 2012
16 0.65151095 1177 high scalability-2012-01-19-Is it time to get rid of the Linux OS model in the cloud?
17 0.65133893 1637 high scalability-2014-04-25-Stuff The Internet Says On Scalability For April 25th, 2014
18 0.64964086 1338 high scalability-2012-10-11-RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store
20 0.64856774 288 high scalability-2008-03-25-Paper: On Designing and Deploying Internet-Scale Services
topicId topicWeight
[(1, 0.17), (2, 0.198), (30, 0.039), (34, 0.191), (56, 0.021), (61, 0.026), (79, 0.119), (85, 0.074), (94, 0.059)]
simIndex simValue blogId blogTitle
1 0.94125485 777 high scalability-2010-02-15-Scaling Ambition at StackOverflow
Introduction: Joel Spolsky and Jeff Atwood are raising VC money for StackOverflow. This is interesting for three reasons: 1) Joel has always seemed like a keep-it-small-and-grow-organically type of guy, so this is a big step in a different direction. 2) It means they think there's a very big market in the Q&A space and they mean to capture as much of the market as possible. 3) Most importantly for this blog, Joel gives some good advice on when to stay fresh and local and when it's time to jump for the brass ring, scale up your ambition, and go for VC money. Please see Joel's blog post for the details, but here's when to go VC: There’s a land grab going on. There is a provable concept that’s repeatable. The business itself could benefit from the publicity. The investor will add substantial value to the business. The business can potentially have a big exit or become a large, publicly traded company. The founders are not in it for their own personal aggrandizement. Joel t
same-blog 2 0.9084869 387 high scalability-2008-09-22-Paper: On Delivering Embarrassingly Distributed Cloud Services
3 0.89657116 1594 high scalability-2014-02-12-Paper: Network Stack Specialization for Performance
Introduction: In the scalability is specialization department here is an interesting paper presented at HotNets '13 on high performance networking: Network Stack Specialization for Performance. The idea is that generalizing a service so it fits in the kernel comes at a high performance cost. So move TCP into user space. The result is a web server with ~3.5x the throughput of Nginx "while experiencing low CPU utilization, linear scaling on multicore systems, and saturating current NIC hardware." Here's a good description of the paper published on Layer 9: Traditionally, servers and OSes have been built to be general purpose. However now we have a high degree of specialization. In fact, in a big web service, you might have thousands of machines dedicated to one function. Therefore, there's scope for specialization. This paper looks at a specific opportunity in that space. Network stacks today are good for high throughput with large transfers, but not small files (which are common in web browsi
4 0.89493734 830 high scalability-2010-05-25-Strategy: Rule of 3 Admins to Save Your Sanity
Introduction: The idea came up in this Hacker News thread, commenting on a 37signals interview, that having three system administrators is the minimum optimal number of admins. Everyone wants to lower their costs by having each admin administer a lot of machines. The problem is when you have fewer than three admins you can never get a break from the constant corrosive pressure of always being on call. When every moment of your life you are dreading the next emergency, it eats at you. Having three admins solves that problem. With three admins you can: Go on a real vacation. The two remaining admins can switch off being on call. Not be on call all the time. A larger shop will naturally have more admins so it's not as big an issue, but at smaller shops trying to minimize head count, carrying three admins (or people in those roles) might be something to consider.
5 0.89027661 1092 high scalability-2011-08-04-Jim Starkey is Creating a Brave New World by Rethinking Databases for the Cloud
Introduction: Jim Starkey, founder of NuoDB, in this thread on the Cloud Computing group, delivers a masterful post on why he thinks the relational model is the best overall compromise amongst the different options, why NewSQL can free itself from the limitations of legacy SQL architectures, and how this creates a brave new lock free world... I'll [Jim Starkey] go into more detail later in the post for those who care, but the executive summary goes like this: Network latency is relatively high and human attention span is relatively low. So human facing computer systems have to perform their work in a small number of trips between the client and the database server. But the human condition leads inexorably to data complexity. There are really only two strategies to manage this problem. One is to use coarse granularity storage, glombing together related data into a single blob and letting intelligence on the client make sense of it. The other is storing fine granularity data on the s
6 0.87131912 114 high scalability-2007-10-07-Product: Wackamole
7 0.83759028 859 high scalability-2010-07-14-DynaTrace's Top 10 Performance Problems taken from Zappos, Monster, Thomson and Co
8 0.83747154 1557 high scalability-2013-12-02-Evolution of Bazaarvoice’s Architecture to 500M Unique Users Per Month
9 0.83499068 1378 high scalability-2012-12-28-Stuff The Internet Says On Scalability For December 28, 2012
10 0.83094591 304 high scalability-2008-04-19-How to build a real-time analytics system?
11 0.8184554 1053 high scalability-2011-06-06-Apple iCloud: Syncing and Distributed Storage Over Streaming and Centralized Storage
12 0.81778526 1102 high scalability-2011-08-22-Strategy: Run a Scalable, Available, and Cheap Static Site on S3 or GitHub
13 0.81707072 1015 high scalability-2011-04-01-Stuff The Internet Says On Scalability For April 1, 2011
14 0.8164894 865 high scalability-2010-07-27-A Metric A$$-Ton of Joe Stump: The Cloud is Cheaper than Bare Metal
15 0.81544614 1275 high scalability-2012-07-02-C is for Compute - Google Compute Engine (GCE)
16 0.81501162 1476 high scalability-2013-06-14-Stuff The Internet Says On Scalability For June 14, 2013
17 0.81454879 558 high scalability-2009-04-06-How do you monitor the performance of your cluster?
18 0.81454474 327 high scalability-2008-05-27-How I Learned to Stop Worrying and Love Using a Lot of Disk Space to Scale
19 0.81408125 325 high scalability-2008-05-25-How do you explain cloud computing to your grandma?
20 0.8140375 1447 high scalability-2013-04-26-Stuff The Internet Says On Scalability For April 26, 2013