high_scalability high_scalability-2012 high_scalability-2012-1310 knowledge-graph by maker-knowledge-mining

1310 high scalability-2012-08-23-Economies of Scale in the Datacenter: Gmail is 100x Cheaper to Run than Your Own Server


meta infos for this blog

Source: html

Introduction: Urs Hoelzle, infrastructure guru and SVP at Google, made a really interesting statement about the economics of scale in the datacenter: "We’ve shown that when you run a large application in the datacenter, like Gmail, you can, compared to a small organization running their own email server, you can save nearly a factor of 100 in terms of compute and energy, when you run it at scale." My first thought was shock at the magnitude of the difference. 100x is a chasm crosser. Then I thought about Gmail: it's horizontally scalable using technologies that are following Moore's Law (storage and compute), latency requirements are lax, a commodity network is sufficient, and it can be highly automated so management costs scale slower than users. After that it's a simple matter of software :-) Oh, and developing a market where it's "cheaper to run a large thing than a small thing."


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 My first thought was shock at the magnitude of the difference. [sent-2, score-0.52]

2 Then I thought about Gmail: it's horizontally scalable using technologies that are following Moore's Law (storage and compute), latency requirements are lax, a commodity network is sufficient, and it can be highly automated so management costs scale slower than users. [sent-4, score-1.25]

3 After that it's a simple matter of software :-) Oh, and developing a market where it's "cheaper to run a large thing than a small thing." [sent-5, score-0.769]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('gmail', 0.332), ('svp', 0.271), ('areally', 0.271), ('shock', 0.226), ('urs', 0.22), ('chasm', 0.215), ('datacenter', 0.205), ('guru', 0.182), ('compute', 0.174), ('thought', 0.169), ('moore', 0.16), ('economics', 0.157), ('sufficient', 0.149), ('run', 0.139), ('shown', 0.138), ('horizontally', 0.135), ('oh', 0.135), ('factor', 0.128), ('law', 0.127), ('energy', 0.126), ('magnitude', 0.125), ('slower', 0.122), ('organization', 0.121), ('small', 0.116), ('commodity', 0.114), ('cheaper', 0.114), ('developing', 0.105), ('automated', 0.105), ('compared', 0.103), ('terms', 0.103), ('nearly', 0.102), ('email', 0.101), ('market', 0.101), ('requirements', 0.097), ('matter', 0.095), ('following', 0.094), ('save', 0.086), ('large', 0.084), ('technologies', 0.082), ('scale', 0.076), ('thing', 0.075), ('costs', 0.068), ('highly', 0.065), ('made', 0.064), ('management', 0.062), ('latency', 0.061), ('google', 0.057), ('infrastructure', 0.056), ('simple', 0.054), ('interesting', 0.053)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999982 1310 high scalability-2012-08-23-Economies of Scale in the Datacenter: Gmail is 100x Cheaper to Run than Your Own Server

2 0.12053291 1647 high scalability-2014-05-14-Google Says Cloud Prices Will Follow Moore’s Law: Are We All Renters Now?

Introduction: After Google cut prices on their Google Cloud Platform, Amazon quickly followed with their own price cuts. Even more interesting is what the future holds for pricing. The near future looks great. After that? We'll see. Adrian Cockcroft highlights that Google thinks prices should follow Moore’s law, which means we should expect prices to halve every 18-24 months. That's good news. Greater cost certainty means you can make much more aggressive build-out plans. With the savings you can hire more people, handle more customers, and add those media-rich features you thought you couldn't afford. Design is directly related to costs. Without Google competing with Amazon there's little doubt the price reduction curve would be much less favorable. As a late cloud entrant Google is now in a customer acquisition phase, so they are willing to pay for customers, which means lower prices are an acceptable cost of doing business. Profit and high margins are not the objective. Getting marke

3 0.11968176 1010 high scalability-2011-03-24-Strategy: Disk Backup for Speed, Tape Backup to Save Your Bacon, Just Ask Google

Introduction: In Stack Overflow Architecture Update - Now At 95 Million Page Views A Month, a commenter expressed surprise about Stack Overflow's backup strategy: Backup is to disk for fast retrieval and to tape for historical archiving. The comment was: Really? People still do this? I know some organizations invested a tremendous amount in automated, robotic tape backup, but seriously, a site founded in 2008 is backing up to tape? The Case of the Missing Gmail Accounts I admit that I was surprised at this strategy too. In this age of copying data to disk three times for safety, I also wondered if tape backups were still necessary. Then, like in a movie, an event happened that made sense of everything: Google suffered the quintessential #firstworldproblem, Gmail accounts went missing! Cue emphatic music. And what's more, they were taking a long time to come back. There was a palpable fear in the land that email accounts might never be restored. Think about that. They might ne

4 0.11435767 1316 high scalability-2012-09-04-Changing Architectures: New Datacenter Networks Will Set Your Code and Data Free

Introduction: One consequence of IT standardization and commodification has been Google’s datacenter is the computer view of the world. In that view all compute resources (memory, CPU, storage) are fungible. They are interchangeable and location independent, individual computers lose identity and become just a part of a service. Thwarting that nirvana has been the abysmal performance of commodity datacenter networks which have caused the preference of architectures that favor the collocation of state and behaviour on the same box. MapReduce famously ships code over to storage nodes for just this reason. Change the network and you change the fundamental assumption driving collocation based software architectures. You are then free to store data anywhere and move compute anywhere you wish. The datacenter becomes the computer. On the host side with an x8 slot running at PCI-Express 3.0 speeds able to push 8GB/sec (that’s bytes) of bandwidth in both directions, we have

5 0.1133379 750 high scalability-2009-12-16-Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud

Introduction: "But it is not complicated. [There's] just a lot of it." --Richard Feynman on how the immense variety of the world arises from simple rules. Contents: Have We Reached the End of Scaling? Applications Become Black Boxes Using Markets to Scale and Control Costs. Let's Welcome our Neo-Feudal Overlords. The Economic Argument for the Ambient Cloud. What Will Kill the Cloud? The Amazing Collective Compute Power of the Ambient Cloud. Using the Ambient Cloud as an Application Runtime. Applications as Virtual States. Conclusion. We have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications. Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but consider we have no planet wide applications yet. None. Tomorrow the numbers foreshadow a new Cambrian explosion of connectivity that will look as

6 0.11318462 1355 high scalability-2012-11-05-Gone Fishin': Building Super Scalable Systems: Blade Runner Meets Autonomic Computing In The Ambient Cloud

7 0.10920486 340 high scalability-2008-06-06-Economies of Non-Scale

8 0.098280393 1589 high scalability-2014-02-03-How Google Backs Up the Internet Along With Exabytes of Other Data

9 0.094393022 1107 high scalability-2011-08-29-The Three Ages of Google - Batch, Warehouse, Instant

10 0.093608491 1207 high scalability-2012-03-12-Google: Taming the Long Latency Tail - When More Machines Equals Worse Results

11 0.092723668 706 high scalability-2009-09-16-The VeriScale Architecture - Elasticity and efficiency for private clouds

12 0.092124388 1116 high scalability-2011-09-15-Paper: It's Time for Low Latency - Inventing the 1 Microsecond Datacenter

13 0.091847226 328 high scalability-2008-05-27-Scalable virus scanning for web-applications

14 0.089791916 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT

15 0.085790068 1269 high scalability-2012-06-20-iDoneThis - Scaling an Email-based App from Scratch

16 0.082834408 1392 high scalability-2013-01-23-Building Redundant Datacenter Networks is Not For Sissies - Use an Outside WAN Backbone

17 0.081830978 1505 high scalability-2013-08-22-The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second edition

18 0.080987178 761 high scalability-2010-01-17-Applications Become Black Boxes Using Markets to Scale and Control Costs

19 0.080365926 169 high scalability-2007-12-01-many website, one setup, many databases

20 0.079353154 195 high scalability-2007-12-28-Amazon's EC2: Pay as You Grow Could Cut Your Costs in Half
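
How the tfidf weights and simValue scores above were produced is not stated on this page. A minimal sketch, assuming plain tf-idf term weighting (with the same smoothed idf that scikit-learn uses by default) and cosine similarity between L2-normalized vectors, would look roughly like the following. The function names and the three toy documents are hypothetical and exist only for illustration; they are not the pipeline that generated this page.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on anything that is not a letter or digit."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    """Return one L2-normalized {word: weight} dict per document.

    Assumption: weight = term count * smoothed idf, where
    idf = ln((1 + N) / (1 + df)) + 1. Other variants exist.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vec = {
            word: count * (math.log((1 + n_docs) / (1 + df[word])) + 1.0)
            for word, count in tf.items()
        }
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({word: v / norm for word, v in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two L2-normalized sparse vectors (a dot product)."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())


if __name__ == "__main__":
    # Toy stand-ins for blog introductions (hypothetical text).
    docs = [
        "gmail datacenter scale compute energy",
        "google datacenter scale cloud prices",
        "tape backup disk restore",
    ]
    vecs = tfidf_vectors(docs)
    for i, vec in enumerate(vecs):
        print(i, round(cosine(vecs[0], vec), 4))
```

Under these assumptions a document compared with itself scores approximately 1, which would be consistent with the same-blog row sitting just below 1.0 in the list above, and documents that share no vocabulary score 0.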


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.116), (1, 0.049), (2, 0.028), (3, 0.067), (4, -0.043), (5, -0.052), (6, 0.009), (7, 0.005), (8, -0.028), (9, 0.02), (10, -0.031), (11, -0.051), (12, 0.019), (13, 0.048), (14, 0.052), (15, 0.008), (16, -0.038), (17, 0.013), (18, 0.017), (19, 0.025), (20, 0.012), (21, 0.049), (22, -0.006), (23, -0.06), (24, 0.006), (25, 0.024), (26, 0.011), (27, -0.03), (28, -0.029), (29, -0.036), (30, -0.027), (31, -0.014), (32, -0.012), (33, -0.023), (34, -0.01), (35, -0.004), (36, -0.003), (37, 0.015), (38, 0.004), (39, 0.026), (40, 0.028), (41, 0.025), (42, 0.012), (43, -0.004), (44, -0.042), (45, 0.012), (46, 0.019), (47, -0.073), (48, 0.006), (49, -0.038)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96665108 1310 high scalability-2012-08-23-Economies of Scale in the Datacenter: Gmail is 100x Cheaper to Run than Your Own Server

2 0.72153074 1505 high scalability-2013-08-22-The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second edition

Introduction: Google has released an epic second edition of their ground-breaking The Datacenter as a Computer book. It's called an introduction, but at 156 pages I would love to see what the Advanced version would look like! John Fries in a G+ comment has what I think is a perfect summary of the ultimate sense of the book: "It's funny, when I was at Google I was initially quite intimidated by interacting with an enormous datacenter, and then I started imagining the entire datacenter was shrunk down into a small box sitting on my desk, and realized it was just another machine and the physical size didn't matter anymore." It's such a far-ranging book that it's impossible to characterize simply. It covers an amazing diversity of topics, from an introduction to warehouse-scale computing; workloads and software infrastructure; hardware; datacenter architecture; energy and power efficiency; cost structures; how to deal with failures and repairs; and it closes with a discussion of key challenge

3 0.7209065 1647 high scalability-2014-05-14-Google Says Cloud Prices Will Follow Moore’s Law: Are We All Renters Now?

Introduction: After Google cut prices on their Google Cloud Platform, Amazon quickly followed with their own price cuts. Even more interesting is what the future holds for pricing. The near future looks great. After that? We'll see. Adrian Cockcroft highlights that Google thinks prices should follow Moore’s law, which means we should expect prices to halve every 18-24 months. That's good news. Greater cost certainty means you can make much more aggressive build-out plans. With the savings you can hire more people, handle more customers, and add those media-rich features you thought you couldn't afford. Design is directly related to costs. Without Google competing with Amazon there's little doubt the price reduction curve would be much less favorable. As a late cloud entrant Google is now in a customer acquisition phase, so they are willing to pay for customers, which means lower prices are an acceptable cost of doing business. Profit and high margins are not the objective. Getting marke

4 0.71783608 284 high scalability-2008-03-19-RAD Lab is Creating a Datacenter Operating System

Introduction: The RAD Lab (Reliable Adaptive Distributed Systems Laboratory) wants to leapfrog the Big Switch and create The Next Big Switch, skipping the cloud/utility evolutionary stage altogether. This hyper-evolutionary niche buster develops technology so advanced the cloud disperses and you can go back to building your own personal datacenters again. Where Google took years to create their datacenters, using a prefab Datacenter Operating System you might create your own in a long holiday weekend. Not St. Patrick's of course. Their vision: Enable one person to invent and run the next revolutionary IT service, operationally expressing a new business idea as a multi-million-user service over the course of a long weekend. By doing so we hope to enable an Internet "Fortune 1 million". How? By wizardry in the form of a “datacenter operating system” created from a pinch of "statistical machine learning (SML)" and a tincture of "recent insights from networking and distributed systems." Bu

5 0.70327383 765 high scalability-2010-01-25-Let's Welcome our Neo-Feudal Overlords

Introduction: This is an excerpt from my article Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud. There's a pattern, already begun, that has been accelerated by the need for applications to scale and by increasing complexity, the end result of which will be that applications give up their independence and enter a kind of feudal relationship with their platform provider. To understand how this process works, like a glacier slowly and inevitably carving out a deep river valley, here's the type of question I get quite a lot: I've learned PHP and MySQL and I've built a web app that I HOPE will receive traffic comparable to eBay's with a similar database structure. I see all these different opinions and different techniques and languages being recommended and it's so confusing. All I want is perhaps one book or one website that focuses on PHP and MySQL and building a large database web app like eBay's. Does something like this exist? I'm always at a loss fo

6 0.69615293 1107 high scalability-2011-08-29-The Three Ages of Google - Batch, Warehouse, Instant

7 0.68152601 761 high scalability-2010-01-17-Applications Become Black Boxes Using Markets to Scale and Control Costs

8 0.67608678 1116 high scalability-2011-09-15-Paper: It's Time for Low Latency - Inventing the 1 Microsecond Datacenter

9 0.6739831 750 high scalability-2009-12-16-Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud

10 0.67383051 1355 high scalability-2012-11-05-Gone Fishin': Building Super Scalable Systems: Blade Runner Meets Autonomic Computing In The Ambient Cloud

11 0.67307836 1275 high scalability-2012-07-02-C is for Compute - Google Compute Engine (GCE)

12 0.655177 1627 high scalability-2014-04-07-Google Finds: Centralized Control, Distributed Data Architectures Work Better than Fully Decentralized Architectures

13 0.64835578 1548 high scalability-2013-11-13-Google: Multiplex Multiple Works Loads on Computers to Increase Machine Utilization and Save Money

14 0.64774066 1599 high scalability-2014-02-19-Planetary-Scale Computing Architectures for Electronic Trading and How Algorithms Shape Our World

15 0.64133823 1540 high scalability-2013-10-30-Strategy: Use Your Quantum Computer Lab to Tell Intentional Blinks from Involuntary Blinks

16 0.63163471 1316 high scalability-2012-09-04-Changing Architectures: New Datacenter Networks Will Set Your Code and Data Free

17 0.63151401 1075 high scalability-2011-07-07-Myth: Google Uses Server Farms So You Should Too - Resurrection of the Big-Ass Machines

18 0.63058609 1584 high scalability-2014-01-22-How would you build the next Internet? Loons, Drones, Copters, Satellites, or Something Else?

19 0.62952077 768 high scalability-2010-02-01-What Will Kill the Cloud?

20 0.62514305 1612 high scalability-2014-03-14-Stuff The Internet Says On Scalability For March 14th, 2014


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.154), (2, 0.173), (10, 0.017), (36, 0.271), (61, 0.066), (79, 0.123), (85, 0.024), (94, 0.053)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.88665259 4 high scalability-2007-07-10-Webcast: Advanced Database High Availability and Scalability Solutions

Introduction: If MySQL, PostgreSQL or EnterpriseDB High-Availability and Scalability issues are on your plate, you'll find this webcast very informative. Highly recommended! Webcast starts on Thursday, July 12, 2007 at 10:00AM PDT (1:00PM EDT, 18:00GMT). Duration: 50 minutes, plus Q&A. Advanced Database High-Availability and Scalability Solutions. Program Agenda: Disk Based Replication • Overview, major features • Benefits, use cases • Limitations and challenges Master/Slave Asynchronous Replication • Overview, major features • Benefits, use cases • Limitations and challenges Synchronous Multi-Master Cluster: Continuent uni/cluster • Cluster overview, major features • Cluster benefits, use cases • Limitations and challenges Product Positioning: HA Continuum • Comparisons • Key differentiators • How to pick the right solution Continuent Professional Services • HA Quick Assessment Service • HA JumpStart Implementation Services Q&A

2 0.87160808 608 high scalability-2009-05-27-The Future of the Parallelism and its Challenges

Introduction: The Future of the Parallelism and its Challenges Research and education in Parallel computing technologies is more important than ever. Here I present a perspective on the past contributions, current status, and future direction of the parallelism technologies. While machine power will grow impressively, increased parallelism, rather than clock rate, will be the driving force in computing in the foreseeable future. This ongoing shift toward parallel architectural paradigms is one of the greatest challenges for the microprocessor and software industries. In 2005, Justin Rattner, chief technology officer of Intel Corporation, said ‘We are at the cusp of a transition to multicore, multithreaded architectures, and we still have not demonstrated the ease of programming the move will require…’ Key points: A Little history Parallelism Challenges Under the hood, Parallelism Challenges Synchronization problems CAS problems The future of the parallelism

same-blog 3 0.86933005 1310 high scalability-2012-08-23-Economies of Scale in the Datacenter: Gmail is 100x Cheaper to Run than Your Own Server

4 0.84936923 1254 high scalability-2012-05-30-Strategy: Get Servers for Free and Make Users Happy by Turning on Compression

Introduction: Edward Capriolo has a really interesting article on his dramatic performance-expanding experience of turning on compression for Cassandra. The idea: Enabling compression shrunk 71GB of data down to 31GB, which caused more data to fit in RAM, which reduced disk IO to nearly nothing. Compression means more data can be stored, which is like buying more machines without having to spend more money. Compression means serving more data out of RAM, which means clients are happier because of the performance improvements. The cost is higher CPU usage to perform the compress/decompress. But disk IO is orders of magnitude slower than decompression and most servers have CPU to burn. Edward's article is well written, has the specifics on how to turn on compression for Cassandra, pretty graphs, and lots more details.

5 0.84854645 1540 high scalability-2013-10-30-Strategy: Use Your Quantum Computer Lab to Tell Intentional Blinks from Involuntary Blinks

Introduction: Oh, you don't have a Quantum Computer Lab staffed with researchers? Well, Google does. Here they are on G+. To learn what they are up to, the Verge has A first look inside Google's futuristic quantum lab. The lab is a partnership between NASA, Google, and a 512-qubit D-Wave Two quantum computer. One result from the lab is: The first practical application has been on Google Glass, as engineers put the quantum chips to work on Glass's blink detector, helping it to better distinguish between intentional winks and involuntary blinks. For engineering reasons, the quantum processor can never be installed in Glass, but together with Google's conventional server centers, it can point the way to a better blink-detecting algorithm. That would allow the Glass processor to detect blinks with better accuracy and using significantly less power. If successful, it could be an important breakthrough for wink-triggered apps, which have struggled with the task so far. Google thinks quantum

6 0.82525325 984 high scalability-2011-02-04-Stuff The Internet Says On Scalability For February 4, 2011

7 0.8206228 451 high scalability-2008-11-30-Creating a high-performing online database

8 0.81633413 644 high scalability-2009-06-29-eHarmony.com describes how they use Amazon EC2 and MapReduce

9 0.77036357 264 high scalability-2008-03-03-Read This Site and Ace Your Next Interview!

10 0.748963 1216 high scalability-2012-03-27-Big Data In the Cloud Using Cloudify

11 0.71029627 1275 high scalability-2012-07-02-C is for Compute - Google Compute Engine (GCE)

12 0.70926446 1102 high scalability-2011-08-22-Strategy: Run a Scalable, Available, and Cheap Static Site on S3 or GitHub

13 0.70914173 576 high scalability-2009-04-21-What CDN would you recommend?

14 0.70893639 837 high scalability-2010-06-07-Six Ways Twitter May Reach its Big Hairy Audacious Goal of One Billion Users

15 0.70773685 229 high scalability-2008-01-29-Building scalable storage into application - Instead of MogileFS OpenAFS etc.

16 0.7077145 327 high scalability-2008-05-27-How I Learned to Stop Worrying and Love Using a Lot of Disk Space to Scale

17 0.70759439 1037 high scalability-2011-05-10-Viddler Architecture - 7 Million Embeds a Day and 1500 Req-Sec Peak

18 0.70751035 517 high scalability-2009-02-21-Google AppEngine - A Second Look

19 0.70750374 129 high scalability-2007-10-23-Hire Facebook, Ning, and Salesforce to Scale for You

20 0.70736414 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT