high_scalability high_scalability-2008 high_scalability-2008-409 knowledge-graph by maker-knowledge-mining

409 high scalability-2008-10-13-Challenges from large scale computing at Google


meta infos for this blog

Source: html

Introduction: From Greg Linden on a talk Google Fellow Jeff Dean gave last week at University of Washington Computer Science titled "Research Challenges Inspired by Large-Scale Computing at Google": Coming away from the talk, the biggest points for me were the considerable interest in reducing costs (especially reducing power costs), the suggestion that the Google cluster may eventually contain 10M machines at 1k locations, and the call to action for researchers on distributed systems and databases to think orders of magnitude bigger than they often are, not about running on hundreds of machines in one location, but hundreds of thousands of machines across many locations.


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('locations', 0.281), ('reducing', 0.243), ('machines', 0.225), ('washington', 0.222), ('titled', 0.222), ('fellow', 0.209), ('suggestion', 0.202), ('considerable', 0.186), ('hundreds', 0.182), ('dean', 0.178), ('google', 0.178), ('researchers', 0.17), ('linden', 0.169), ('inspired', 0.163), ('university', 0.16), ('talk', 0.146), ('contain', 0.143), ('jeff', 0.142), ('costs', 0.141), ('greg', 0.14), ('orders', 0.134), ('action', 0.129), ('magnitude', 0.129), ('science', 0.122), ('gave', 0.121), ('week', 0.121), ('location', 0.12), ('interest', 0.12), ('coming', 0.114), ('bigger', 0.112), ('research', 0.109), ('biggest', 0.108), ('eventually', 0.107), ('points', 0.095), ('challenges', 0.095), ('call', 0.09), ('last', 0.089), ('computer', 0.088), ('especially', 0.087), ('thousands', 0.087), ('away', 0.079), ('often', 0.069), ('cluster', 0.068), ('computing', 0.066), ('databases', 0.066), ('power', 0.062), ('running', 0.053), ('across', 0.05), ('think', 0.049), ('may', 0.046)]
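
The page does not say how these weights or the similarity values below were produced, so the following is only a minimal sketch of one common way to do it, assuming whitespace tokenization, a plain tf * log(N / df) weighting, and cosine similarity between the resulting vectors. The function names and the three toy documents are hypothetical and are not taken from the original pipeline.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {word: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            word: (count / len(tokens)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {word: weight} vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy corpus (hypothetical): two Google-related posts and one unrelated post.
docs = [
    "google cluster machines locations",
    "google mapreduce machines stats",
    "drupal scaling open source",
]
vecs = tfidf_vectors(docs)
```

Under these assumptions, a post compared with itself scores close to 1.0, and a post that shares no words with it scores 0.0. A real pipeline would likely add stop-word removal, normalization, and smoothing, which may be why the values on this page differ from what this sketch would give.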

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 409 high scalability-2008-10-13-Challenges from large scale computing at Google

2 0.19491301 734 high scalability-2009-10-30-Hot Scalabilty Links for October 30 2009

Introduction: Life beyond Distributed Transactions: an Apostate’s Opinion by Pat Helland. In particular, we focus on the implications that fall out of assuming we cannot have large-scale distributed transactions. Tragedy of the Commons, and Cold Starts - Cold application starts on Google App Engine kill your application's responsiveness. Intel’s 1M IOPS desktop SSD setup by Kevin Burton. What do you get when you take 7 Intel SSDs and throw them in a desktop? 1M IOPS. Videos from NoSQL Berlin sessions. Nicely done talks on CAP, MongoDB, Redis, 4th generation object databases, CouchDB, and Riak. Designs, Lessons and Advice from Building Large Distributed Systems by Jeff Dean of Google describing how they do their thing. Here are some glosses on the talk by Greg Linden and James Hamilton. You really can't do better than Greg and James. Advice from Google on Large Distributed Systems by Greg Linden. A nice summary of Jeff Dean's talk. A standard Google server

3 0.15133707 309 high scalability-2008-04-23-Behind The Scenes of Google Scalability

Introduction: The recent Data-Intensive Computing Symposium brought together experts in system design, programming, parallel algorithms, data management, scientific applications, and information-based applications to better understand existing capabilities in the development and application of large-scale computing systems, and to explore future opportunities. Google Fellow Jeff Dean had a very interesting presentation on Handling Large Datasets at Google: Current Systems and Future Directions. He discussed:
• Hardware infrastructure
• Distributed systems infrastructure:
  – Scheduling system
  – GFS
  – BigTable
  – MapReduce
• Challenges and Future Directions
  – Infrastructure that spans all datacenters
  – More automation
It is really like a "How does Google work" presentation in ~60 slides. Check out the slides and the video!

4 0.13798487 534 high scalability-2009-03-12-Google TechTalk: Amdahl's Law in the Multicore Era

Introduction: Over the last several decades computer architects have been phenomenally successful turning the transistor bounty provided by Moore's Law into chips with ever increasing single-threaded performance. During many of these successful years, however, many researchers paid scant attention to multiprocessor work. Now as vendors turn to multicore chips, researchers are reacting with more papers on multi-threaded systems. While this is good, we are concerned that further work on single-thread performance will be squashed. To help understand future high-level trade-offs, we develop a corollary to Amdahl's Law for multicore chips [Hill & Marty, IEEE Computer 2008]. It models fixed chip resources for alternative designs that use symmetric cores, asymmetric cores, or dynamic techniques that allow cores to work together on sequential execution. Our results encourage multicore designers to view performance of the entire chip rather than focus on core efficiencies. Moreover, we observe that obtai

5 0.13212302 946 high scalability-2010-11-22-Strategy: Google Sends Canary Requests into the Data Mine

Introduction: Google runs queries against thousands of in-memory index nodes in parallel and then merges the results. One of the interesting problems with this approach, explains Google's Jeff Dean in this lecture at Stanford, is the Query of Death. A query can cause a program to fail because of bugs or various other issues. This means that a single query can take down an entire cluster of machines, which is not good for availability and response times, as it takes quite a while for thousands of machines to recover. Thus the Query of Death. New queries are always coming into the system and when you are always rolling out new software, it's impossible to completely get rid of the problem. Two solutions: Test against logs. Google replays a month's worth of logs to see if any of those queries kill anything. That helps, but Queries of Death may still happen. Send a canary request. A request is sent to one machine. If the request succeeds then it will probably succeed on all machines, s

6 0.12108809 863 high scalability-2010-07-22-How can we spark the movement of research out of the Ivory Tower and into production?

7 0.11686232 544 high scalability-2009-03-18-QCon London 2009: Upgrading Twitter without service disruptions

8 0.11017148 1107 high scalability-2011-08-29-The Three Ages of Google - Batch, Warehouse, Instant

9 0.10331453 1075 high scalability-2011-07-07-Myth: Google Uses Server Farms So You Should Too - Resurrection of the Big-Ass Machines

10 0.096251704 537 high scalability-2009-03-12-QCon London 2009: Database projects to watch closely

11 0.095528826 1464 high scalability-2013-05-24-Stuff The Internet Says On Scalability For May 24, 2013

12 0.094111294 160 high scalability-2007-11-19-Tailrank Architecture - Learn How to Track Memes Across the Entire Blogosphere

13 0.084400736 1387 high scalability-2013-01-15-More Numbers Every Awesome Programmer Must Know

14 0.083908901 211 high scalability-2008-01-13-Google Reveals New MapReduce Stats

15 0.08386299 942 high scalability-2010-11-15-Strategy: Biggest Performance Impact is to Reduce the Number of HTTP Requests

16 0.08291328 1535 high scalability-2013-10-21-Google's Sanjay Ghemawat on What Made Google Google and Great Big Data Career Advice

17 0.082408555 1266 high scalability-2012-06-18-Google on Latency Tolerant Systems: Making a Predictable Whole Out of Unpredictable Parts

18 0.08135619 448 high scalability-2008-11-22-Google Architecture

19 0.079373591 1039 high scalability-2011-05-12-Paper: Mind the Gap: Reconnecting Architecture and OS Research

20 0.077994213 146 high scalability-2007-11-08-scaling drupal - an open-source infrastructure for high-traffic drupal sites
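
The canary-request strategy described in entry 5 above (send the query to one machine first, then to the rest only if it survives) can be sketched as follows. This is a minimal illustration under stated assumptions, not Google's implementation: `fan_out_with_canary` and the `send(node, query)` callable are hypothetical names, and a crashing query is modeled simply as an exception.

```python
def fan_out_with_canary(query, nodes, send):
    """Send `query` to one canary node first; fan out only if it survives.

    `send(node, query)` is a hypothetical callable that returns a result,
    or raises an exception when the query crashes the node.
    Returns the list of per-node results, or None if the canary failed.
    """
    if not nodes:
        return []
    canary, rest = nodes[0], nodes[1:]
    try:
        first = send(canary, query)
    except Exception:
        # The canary failed: reject the query instead of risking every node.
        return None
    # The canary succeeded, so the query is probably safe for the other nodes.
    return [first] + [send(node, query) for node in rest]
```

The sketch contacts the remaining nodes sequentially to stay short. The design point is only the ordering: a bad query costs one machine rather than the whole cluster, at the price of one extra round trip per query.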


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.104), (1, 0.062), (2, 0.033), (3, 0.081), (4, -0.041), (5, 0.002), (6, -0.01), (7, 0.047), (8, 0.012), (9, 0.058), (10, -0.013), (11, -0.07), (12, 0.017), (13, -0.001), (14, 0.032), (15, 0.006), (16, -0.051), (17, -0.042), (18, 0.048), (19, -0.001), (20, 0.067), (21, 0.038), (22, -0.021), (23, -0.093), (24, -0.019), (25, 0.037), (26, -0.032), (27, 0.033), (28, -0.084), (29, 0.046), (30, 0.062), (31, -0.015), (32, 0.002), (33, 0.05), (34, -0.013), (35, 0.005), (36, 0.072), (37, -0.022), (38, 0.029), (39, 0.066), (40, 0.057), (41, -0.033), (42, -0.006), (43, -0.066), (44, -0.066), (45, -0.064), (46, 0.01), (47, -0.048), (48, -0.006), (49, 0.037)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98823828 409 high scalability-2008-10-13-Challenges from large scale computing at Google

2 0.80126131 1107 high scalability-2011-08-29-The Three Ages of Google - Batch, Warehouse, Instant

Introduction: The world has changed. And some things that should not have been forgotten, were lost. I found these words from the Lord of the Rings echoing in my head as I listened to a fascinating presentation by Luiz André Barroso, Distinguished Engineer at Google, concerning Google's legendary past, golden present, and apocryphal future. His talk, Warehouse-Scale Computing: Entering the Teenage Decade, was given at the Federated Computing Research Conference. Luiz clearly knows his stuff and was early at Google, so he has a deep and penetrating perspective on the technology. There's much to learn from, think about, and build. Lord of the Rings applies at two levels. At the change level, Middle Earth went through three ages. While listening to Luiz talk, it seems so has Google: Batch (indexes calculated every month), Warehouse (the datacenter is the computer), and Instant (make it all real-time). At the "what was forgot" level, in the Instant Age section of the talk, a common theme was

3 0.77124691 1535 high scalability-2013-10-21-Google's Sanjay Ghemawat on What Made Google Google and Great Big Data Career Advice

Introduction: In a People of ACM interview, Sanjay Ghemawat, a Google Fellow in the Systems Infrastructure Group (MapReduce, BigTable, Spanner, GFS, etc.), talks about a few interesting aspects of Google's culture. What Made Google Google: Progress is a modern idea. The conviction that the future can be changed for the better through individual advancement and action has over hundreds of years driven an exponential growth in the technome. What drives progress? Challenges. Individuals finding and defeating a challenge. There's usually something someone wants to do so badly that they put the effort, the thought, and the money into solving all the problems. The results are often something new and amazing. And so it was for Google: The main motivation behind the development of much of Google's infrastructure was the challenge of keeping up with ever-growing data sets. For example, at the same time Google's web search was gaining usage very quickly, we were also scaling up the size of ou

4 0.76391864 211 high scalability-2008-01-13-Google Reveals New MapReduce Stats

Introduction: The Google Operating System blog has an interesting post on Google's scale based on an updated version of Google's paper about MapReduce. The input data for some of the MapReduce jobs run in September 2007 was 403,152 TB (terabytes), the average number of machines allocated for a MapReduce job was 394, and the average completion time was six and a half minutes. The paper mentions that Google's indexing system processes more than 20 TB of raw data. Niall Kennedy calculates that the average MapReduce job runs across a $1 million hardware infrastructure, assuming that Google still uses the same cluster configurations from 2004: two 2 GHz Intel Xeon processors with Hyper-Threading enabled, 4 GB of memory, two 160 GB IDE hard drives and a gigabit Ethernet link. Greg Linden notices that Google's infrastructure is an important competitive advantage. "Anyone at Google can process terabytes of data. And they can get their results back in about 10 minutes, so they ca

5 0.75449091 640 high scalability-2009-06-28-Google Voice Architecture

Introduction: Hi High Scalability community! Do you have any information on the architecture behind Google Voice, the new service by Google that offers one Google Number for all your calls and SMS? It is based on GrandCentral, which was acquired by Google 2 years ago. Thanks!

6 0.74906552 309 high scalability-2008-04-23-Behind The Scenes of Google Scalability

7 0.74226642 734 high scalability-2009-10-30-Hot Scalabilty Links for October 30 2009

8 0.73696518 483 high scalability-2009-01-04-Paper: MapReduce: Simplified Data Processing on Large Clusters

9 0.72526097 75 high scalability-2007-08-28-Google Utilities : An online google guide,tools and Utilities.

10 0.7075547 1328 high scalability-2012-09-24-Google Spanner's Most Surprising Revelation: NoSQL is Out and NewSQL is In

11 0.69349873 1540 high scalability-2013-10-30-Strategy: Use Your Quantum Computer Lab to Tell Intentional Blinks from Involuntary Blinks

12 0.69171035 1078 high scalability-2011-07-12-Google+ is Built Using Tools You Can Use Too: Closure, Java Servlets, JavaScript, BigTable, Colossus, Quick Turnaround

13 0.69142652 871 high scalability-2010-08-04-Dremel: Interactive Analysis of Web-Scale Datasets - Data as a Programming Paradigm

14 0.68310285 394 high scalability-2008-09-25-HighScalability.com Rated 16th Best Blog for Development Managers

15 0.6615358 362 high scalability-2008-08-11-Distributed Computing & Google Infrastructure

16 0.65789872 1505 high scalability-2013-08-22-The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second edition

17 0.65755349 618 high scalability-2009-06-05-Google Wave Architecture

18 0.65451509 946 high scalability-2010-11-22-Strategy: Google Sends Canary Requests into the Data Mine

19 0.64359963 665 high scalability-2009-07-29-Strategy: Let Google and Yahoo Host Your Ajax Library - For Free

20 0.63984126 1143 high scalability-2011-11-16-Google+ Infrastructure Update - the JavaScript Story


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.085), (2, 0.211), (5, 0.05), (9, 0.062), (27, 0.046), (47, 0.028), (61, 0.072), (77, 0.042), (79, 0.121), (83, 0.025), (94, 0.143)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94679344 409 high scalability-2008-10-13-Challenges from large scale computing at Google

2 0.93439209 1223 high scalability-2012-04-06-Stuff The Internet Says On Scalability For April 6, 2012

Introduction: It's HighScalability Time: Exascale Supercomputer: how IBM plans to understand data from a universe of light; 905 Billion Objects and 650,000 Requests/Second: S3; 64-cores: PostgreSQL shows linear read scalability; Quotable quotes: pkaler: Programming is hard. Scaling is harder. @crucially: As far as I can tell, openstack is what happens when ops people write code. @DEVOPS_BORAT: Goal of sysadmin is replace itself with small shell script. Goal of devops is replace itself with small REST API. @fowlduck: ec2, where dynamic scalability means them running out of instances :( hcarvalhoalves: You know what is amazing? Is that as soon you hit bigger or more general problems, you always face the compromise of "trading X resource for accuracy". Which leads me to believe that software, so far, has only been deterministic by pure accident. Kyle Lemmons: Clearly Go is a superior weapon if the goal is to shoot everyone in the foot at the same ti

3 0.93423647 863 high scalability-2010-07-22-How can we spark the movement of research out of the Ivory Tower and into production?

Introduction: Over the years I've read a lot of research papers looking for better ways of doing things. Sometimes I find ideas I can use, but more often than not I come up empty. The problem is there are very few good papers. And by good I mean: can a reasonably intelligent person read a paper and turn it into something useful?  Now, clearly I'm not an academic and clearly I'm no genius, I'm just an everyday programmer searching for leverage, and as a common specimen of the species I've often thought how much better our industry would be if we could simply move research from academia into production with some sort of self-conscious professionalism. Currently the process is horribly hit or miss. And this problem extends equally to companies with research divisions that often do very little to help front-line developers succeed.  How many ideas break out of academia into industry in computer science? We have many brilliant examples: encryption, microprocessors, compression, transactions, distribu

4 0.92768091 1174 high scalability-2012-01-13-Stuff The Internet Says On Scalability For January 13, 2012

Introduction: With a name like HighScalability... it has to be good: Facebook: 1 Billion Users?; Internet Archive: 500,000 users/day, 6PB of data, 150 billion pages, 1000 queries a second; 6,180: The number of patents granted to IBM in 2011; 676: The number of patents granted to Apple in 2011; Live TV is Dead; Kickstarter: 10,000 successfully funded projects; $82bn: Apple's cash hoard; 100 Billion Planets: Our home sweet galaxy; Creative: 100-core system-on-a-chip; 15 million: Lines of code in the Linux kernel; According to Twitter: Justin Bieber > President Obama. Quotable quotes: @florind: I just realized that the Santa story is a classical scalability myth. @juokaz: doesn't always use dating sites, but when he does, he finds out about them on High Scalability http://bit.ly/xYfBmq. True story @niclashulting: The Yahoo! homepage is updated 45,000 times every five minutes." A content strategy is vital. Google’s Data Center Engineer Sh

5 0.92633146 266 high scalability-2008-03-04-Manage Downtime Risk by Connecting Multiple Data Centers into a Secure Virtual LAN

Introduction: Update: VcubeV - an OpenVPN-based solution designed to build and operate a multisourced infrastructure. True high availability requires a presence in multiple data centers. The recent downtime of even a high quality operation like Amazon makes this need all the more clear. Typically only the big boys can afford the complexity of operating in two or more data centers. Cloud computing along with utility billing starts to change that equation, leveling the playing field. Even smaller outfits will be in a position to manage risk by spreading machines amongst EC2, 3tera, Slicehost, Mosso and other providers. The question then becomes: given we aren't Angels, how do we walk amongst the clouds? One fascinating answer is exquisitely explained by Dmitriy Samovskiy in his Linux Journal article titled Building a Multisourced Infrastructure Using OpenVPN. Dmitriy's idea is to create a secure UDP tunnel between different data centers over public internet links so your applicatio

6 0.92559528 1025 high scalability-2011-04-16-The NewSQL Market Breakdown

7 0.92504722 1402 high scalability-2013-02-07-Ask HighScalability: Web asset server concept - 3rd party software available?

8 0.92299104 1516 high scalability-2013-09-13-Stuff The Internet Says On Scalability For September 13, 2013

9 0.91728175 1023 high scalability-2011-04-14-Strategy: Cache Application Start State to Reduce Spin-up Times

10 0.91656506 976 high scalability-2011-01-20-75% Chance of Scale - Leveraging the New Scaleogenic Environment for Growth

11 0.91479051 1084 high scalability-2011-07-22-Stuff The Internet Says On Scalability For July 22, 2011

12 0.91158748 1222 high scalability-2012-04-05-Big Data Counting: How to count a billion distinct objects using only 1.5KB of Memory

13 0.91036361 1462 high scalability-2013-05-22-Strategy: Stop Using Linked-Lists

14 0.90158498 970 high scalability-2011-01-06-BankSimple Mini-Architecture - Using a Next Generation Toolchain

15 0.90047961 119 high scalability-2007-10-10-WAN Accelerate Your Way to Lightening Fast Transfers Between Data Centers

16 0.89789075 1561 high scalability-2013-12-09-Site Moves from PHP to Facebook's HipHop, Now Pages Load in .6 Seconds Instead of Five

17 0.8942911 1247 high scalability-2012-05-18-Stuff The Internet Says On Scalability For May 18, 2012

18 0.893296 517 high scalability-2009-02-21-Google AppEngine - A Second Look

19 0.8915835 645 high scalability-2009-06-30-Hot New Trend: Linking Clouds Through Cheap IP VPNs Instead of Private Lines

20 0.89095867 307 high scalability-2008-04-21-Using Google AppEngine for a Little Micro-Scalability