high_scalability high_scalability-2013 high_scalability-2013-1558 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Jeremy Edberg gave a talk on Scaling Reddit from 1 Million to 1 Billion–Pitfalls and Lessons, and one of the issues they hit was that they did not account for increased latency after moving to EC2. In the datacenter they had submillisecond access between machines, so it was possible to make 1,000 calls to memcache for one page load. Not so on EC2. Memcache access times increased 10x, to a millisecond, which made their old approach unusable. The fix was to batch calls to memcache so that a large number of gets go out in one request. Dave Pacheco had an interesting question about batching requests and its impact on latency: I was confused about the memcached problem after moving to the cloud. I understand why network latency may have gone from submillisecond to milliseconds, but how could you improve latency by batching requests? Shouldn't that improve efficiency, not latency, at the possible expense of latency (since some requests will wait on the client as they get batched)?
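The batching fix can be sketched as follows. This is a minimal, hypothetical example: the chunking helper and key names are illustrative, and the commented-out `get_many` call mirrors the multi-get API offered by clients such as pymemcache (it is left out here since it needs a live memcached server):

```python
def chunked(keys, batch_size):
    """Split a flat key list into batches, one batch per multi-get request."""
    return [keys[i:i + batch_size] for i in range(0, len(keys), batch_size)]

# With a real client the fetch loop would be roughly:
#   results = {}
#   for batch in chunked(keys, 100):
#       results.update(client.get_many(batch))  # one round trip per batch

keys = [f"key:{i}" for i in range(1000)]  # 1,000 lookups for one page load
batches = chunked(keys, 100)
print(len(batches))  # 10 round trips instead of 1,000
```

The design choice is simply trading many small requests for a few larger ones, so the per-request network round trip is paid 10 times instead of 1,000 times.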
sentIndex sentText sentNum sentScore
1 Jeremy Edberg gave a talk on Scaling Reddit from 1 Million to 1 Billion–Pitfalls and Lessons and one of the issues they had was that they: Did not account for increased latency after moving to EC2. [sent-1, score-0.772]
2 In the datacenter they had submillisecond access between machines so it was possible to make 1,000 calls to memcache for one page load. [sent-2, score-0.948]
3 Memcache access times increased 10x to a millisecond which made their old approach unusable. [sent-4, score-0.375]
4 Fix was to batch calls to memcache so a large number of gets are in one request. [sent-5, score-0.622]
5 Dave Pacheco had an interesting question about batching requests and its impact on latency: I was confused about the memcached problem after moving to the cloud. [sent-6, score-0.782]
6 I understand why network latency may have gone from submillisecond to milliseconds, but how could you improve latency by batching requests? [sent-7, score-1.522]
7 Shouldn't that improve efficiency, not latency, at the possible expense of latency (since some requests will wait on the client as they get batched)? [sent-8, score-0.819]
8 Jeremy cleared it up by saying: The latency didn't get better; what happened is that instead of having to make a lot of calls to memcache it was just one (well, just a few), so while that one call took longer, the total time was much less. [sent-9, score-1.323]
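Jeremy's point, that a few slower round trips beat a thousand fast ones, can be checked with back-of-the-envelope arithmetic. The numbers below are illustrative assumptions, not figures from the talk:

```python
import math

RTT_MS = 1.0          # assumed round-trip time per memcache request on EC2
KEYS_PER_PAGE = 1000  # assumed cache lookups needed to render one page

def sequential_total(keys, rtt_ms):
    # Every get pays a full round trip, one after another.
    return keys * rtt_ms

def batched_total(keys, rtt_ms, batch_size, per_key_ms=0.01):
    # A multi-get packs batch_size keys into one round trip; each extra
    # key adds only a small serialization/transfer cost, not a full RTT.
    batches = math.ceil(keys / batch_size)
    return batches * rtt_ms + keys * per_key_ms

print(sequential_total(KEYS_PER_PAGE, RTT_MS))    # 1000.0 ms: a full second per page
print(batched_total(KEYS_PER_PAGE, RTT_MS, 100))  # 20.0 ms: 10 RTTs plus transfer
```

Each individual batched request does take longer than a single get would, but the total time collapses because the dominant cost, the round trip, is amortized across many keys.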
wordName wordTfidf (topN-words)
[('batching', 0.353), ('latency', 0.321), ('submillisecond', 0.281), ('memcache', 0.23), ('jeremy', 0.217), ('calls', 0.201), ('questionabout', 0.177), ('pacheco', 0.177), ('memache', 0.177), ('cleared', 0.166), ('rosenthal', 0.159), ('graphic', 0.144), ('edberg', 0.144), ('aninteresting', 0.141), ('batched', 0.141), ('requests', 0.135), ('increased', 0.131), ('pitfalls', 0.13), ('dave', 0.128), ('total', 0.126), ('confused', 0.126), ('millisecond', 0.115), ('moving', 0.108), ('improve', 0.107), ('expense', 0.104), ('decrease', 0.101), ('milliseconds', 0.099), ('reddit', 0.093), ('happened', 0.088), ('showing', 0.087), ('possible', 0.083), ('efficiency', 0.083), ('gone', 0.082), ('gave', 0.077), ('access', 0.076), ('saying', 0.075), ('account', 0.072), ('batch', 0.071), ('wait', 0.069), ('datacenter', 0.067), ('lessons', 0.065), ('took', 0.065), ('one', 0.063), ('fact', 0.062), ('impact', 0.06), ('fix', 0.06), ('longer', 0.058), ('understand', 0.057), ('gets', 0.057), ('old', 0.053)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 1558 high scalability-2013-12-04-How Can Batching Requests Actually Reduce Latency?
2 0.23715171 661 high scalability-2009-07-25-Latency is Everywhere and it Costs You Sales - How to Crush it
Introduction: Update 8 : The Cost of Latency by James Hamilton. James summarizing some latency info from Steve Souder , Greg Linden , and Marissa Mayer . Speed [is] an undervalued and under-discussed asset on the web. Update 7: How do you know when you need more memcache servers? . Dathan Pattishall talks about using memcache not to scale, but to reduce latency and reduce I/O spikes, and how to use stats to know when more servers are needed. Update 6: Stock Traders Find Speed Pays, in Milliseconds . Goldman Sachs is making record profits off a 500 millisecond trading advantage. Yes, latency matters. As an interesting aside, Libet found 500 msecs is about the time it takes the brain to weave together an experience of consciousness from all our sensor inputs. Update 5: Shopzilla's Site Redo - You Get What You Measure . At the Velocity conference Phil Dixon, from Shopzilla, presented data showing a 5 second speed up resulted in a 25% increase in page views, a 10% increas
Introduction: Jeremy Edberg , the first paid employee at reddit, teaches us a lot about how to create a successful social site in a really good talk he gave at the RAMP conference. Watch it here at Scaling Reddit from 1 Million to 1 Billion–Pitfalls and Lessons . Jeremy uses a virtue and sin approach. Examples of the mistakes made in scaling reddit are shared and it turns out they did a lot of good stuff too. Somewhat of a shocker is that Jeremy is now a Reliability Architect at Netflix, so we get a little Netflix perspective thrown in for free. Some of the lessons that stood out most for me: Think of SSDs as cheap RAM, not expensive disk . When reddit moved from spinning disks to SSDs for the database the number of servers was reduced from 12 to 1 with a ton of headroom. SSDs are 4x more expensive but you get 16x the performance. Worth the cost. Give users a little bit of power, see what they do with it, and turn the good stuff into features . One of the biggest revelations
4 0.16863942 1207 high scalability-2012-03-12-Google: Taming the Long Latency Tail - When More Machines Equals Worse Results
Introduction: Likewise the current belief that, in the case of artificial machines the very large and the very small are equally feasible and lasting is a manifest error. Thus, for example, a small obelisk or column or other solid figure can certainly be laid down or set up without danger of breaking, while the large ones will go to pieces under the slightest provocation, and that purely on account of their own weight. -- Galileo Galileo observed how things broke if they were naively scaled up. Interestingly, Google noticed a similar pattern when building larger software systems using the same techniques used to build smaller systems. Luiz André Barroso , Distinguished Engineer at Google, talks about this fundamental property of scaling systems in his fascinating talk, Warehouse-Scale Computing: Entering the Teenage Decade . Google found the larger the scale the greater the impact of latency variability. When a request is implemented by work done in parallel, as is common with today's service
Introduction: In Taming The Long Latency Tail we covered Luiz Barroso ’s exploration of the long tail latency (some operations are really slow) problems generated by large fanout architectures (a request is composed of potentially thousands of other requests). You may have noticed there weren’t a lot of solutions. That’s where a talk I attended, Achieving Rapid Response Times in Large Online Services ( slide deck ), by Jeff Dean , also of Google, comes in: In this talk, I’ll describe a collection of techniques and practices lowering response times in large distributed systems whose components run on shared clusters of machines, where pieces of these systems are subject to interference by other tasks, and where unpredictable latency hiccups are the norm, not the exception. The goal is to use software techniques to reduce variability given the increasing variability in underlying hardware, the need to handle dynamic workloads on a shared infrastructure, and the need to use lar
6 0.12955335 1126 high scalability-2011-09-27-Use Instance Caches to Save Money: Latency == $$$
7 0.12752306 1116 high scalability-2011-09-15-Paper: It's Time for Low Latency - Inventing the 1 Microsecond Datacenter
9 0.11628993 1135 high scalability-2011-10-31-15 Ways to Make Your Application Feel More Responsive under Google App Engine
10 0.10559731 834 high scalability-2010-06-01-Web Speed Can Push You Off of Google Search Rankings! What Can You Do?
11 0.10475584 1387 high scalability-2013-01-15-More Numbers Every Awesome Programmer Must Know
12 0.10381208 927 high scalability-2010-10-26-Marrying memcached and NoSQL
13 0.094996162 1591 high scalability-2014-02-05-Little’s Law, Scalability and Fault Tolerance: The OS is your bottleneck. What you can do?
14 0.092279077 828 high scalability-2010-05-17-7 Lessons Learned While Building Reddit to 270 Million Page Views a Month
15 0.091556162 728 high scalability-2009-10-26-Facebook's Memcached Multiget Hole: More machines != More Capacity
16 0.090670824 960 high scalability-2010-12-20-Netflix: Use Less Chatty Protocols in the Cloud - Plus 26 Fixes
17 0.08970657 1415 high scalability-2013-03-04-7 Life Saving Scalability Defenses Against Load Monster Attacks
18 0.086578645 294 high scalability-2008-04-01-How to update video views count effectively?
19 0.083811767 1331 high scalability-2012-10-02-An Epic TripAdvisor Update: Why Not Run on the Cloud? The Grand Experiment.
20 0.082040444 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters
topicId topicWeight
[(0, 0.107), (1, 0.089), (2, -0.027), (3, -0.022), (4, -0.022), (5, -0.016), (6, 0.023), (7, 0.09), (8, -0.056), (9, -0.051), (10, 0.006), (11, -0.016), (12, -0.002), (13, 0.026), (14, -0.031), (15, 0.013), (16, -0.012), (17, -0.004), (18, 0.016), (19, -0.038), (20, 0.04), (21, 0.026), (22, 0.055), (23, -0.023), (24, -0.002), (25, 0.041), (26, -0.024), (27, 0.014), (28, -0.024), (29, -0.062), (30, 0.06), (31, -0.046), (32, -0.002), (33, 0.001), (34, 0.06), (35, 0.076), (36, 0.031), (37, -0.02), (38, -0.047), (39, -0.046), (40, 0.034), (41, 0.052), (42, -0.046), (43, -0.039), (44, 0.006), (45, -0.089), (46, 0.035), (47, -0.028), (48, -0.001), (49, -0.0)]
simIndex simValue blogId blogTitle
same-blog 1 0.98168099 1558 high scalability-2013-12-04-How Can Batching Requests Actually Reduce Latency?
2 0.79720539 1207 high scalability-2012-03-12-Google: Taming the Long Latency Tail - When More Machines Equals Worse Results
4 0.74868709 661 high scalability-2009-07-25-Latency is Everywhere and it Costs You Sales - How to Crush it
5 0.70428109 1116 high scalability-2011-09-15-Paper: It's Time for Low Latency - Inventing the 1 Microsecond Datacenter
Introduction: In It's Time for Low Latency Stephen Rumble et al. explore the idea that it's time to rearchitect our stack to live in the modern era of low-latency datacenter instead of high-latency WANs. The implications for program architectures will be revolutionary . Luiz André Barroso , Distinguished Engineer at Google, sees ultra low latency as a way to make computer resources, to be as much as possible, fungible, that is they are interchangeable and location independent, effectively turning a datacenter into single computer. Abstract from the paper: The operating systems community has ignored network latency for too long. In the past, speed-of-light delays in wide area networks and unoptimized network hardware have made sub-100µs round-trip times impossible. However, in the next few years datacenters will be deployed with low-latency Ethernet. Without the burden of propagation delays in the datacenter campus and network delays in the Ethernet devices, it will be up to us to finish
6 0.67962152 942 high scalability-2010-11-15-Strategy: Biggest Performance Impact is to Reduce the Number of HTTP Requests
7 0.66444355 1387 high scalability-2013-01-15-More Numbers Every Awesome Programmer Must Know
8 0.64811933 981 high scalability-2011-02-01-Google Strategy: Tree Distribution of Requests and Responses
9 0.62838089 960 high scalability-2010-12-20-Netflix: Use Less Chatty Protocols in the Cloud - Plus 26 Fixes
10 0.59813613 712 high scalability-2009-10-01-Moving Beyond End-to-End Path Information to Optimize CDN Performance
11 0.59728575 728 high scalability-2009-10-26-Facebook's Memcached Multiget Hole: More machines != More Capacity
12 0.58181113 1051 high scalability-2011-06-01-Why is your network so slow? Your switch should tell you.
13 0.57653779 946 high scalability-2010-11-22-Strategy: Google Sends Canary Requests into the Data Mine
14 0.56865126 645 high scalability-2009-06-30-Hot New Trend: Linking Clouds Through Cheap IP VPNs Instead of Private Lines
15 0.56646973 941 high scalability-2010-11-15-How Google's Instant Previews Reduces HTTP Requests
16 0.55610973 1135 high scalability-2011-10-31-15 Ways to Make Your Application Feel More Responsive under Google App Engine
17 0.55516249 1267 high scalability-2012-06-18-The Clever Ways Chrome Hides Latency by Anticipating Your Every Need
18 0.5512659 1404 high scalability-2013-02-11-At Scale Even Little Wins Pay Off Big - Google and Facebook Examples
19 0.54899371 1001 high scalability-2011-03-09-Google and Netflix Strategy: Use Partial Responses to Reduce Request Sizes
20 0.54896843 205 high scalability-2008-01-10-Letting Clients Know What's Changed: Push Me or Pull Me?
topicId topicWeight
[(2, 0.298), (16, 0.313), (40, 0.025), (61, 0.059), (77, 0.024), (79, 0.157)]
simIndex simValue blogId blogTitle
1 0.94416314 110 high scalability-2007-10-03-Why most large-scale Web sites are not written in Java
Introduction: There is a lot of information in the blogosphere describing the architecture of many popular sites, such as Google, Amazon, eBay, LinkedIn, TypePad, WikiPedia and others. I've summarized this issue in a blog post here. I would really appreciate your opinion on this matter.
same-blog 2 0.88684076 1558 high scalability-2013-12-04-How Can Batching Requests Actually Reduce Latency?
Introduction: With a new Planet of the Apes coming out, this may be a touchy subject with our new overlords, but Netflix is going to a whole lot more trouble injecting monkeys to test and iteratively harden their systems. We learned previously how Netflix used Chaos Monkey , a tool to test failover handling by continuously failing EC2 nodes. That was just a start. More monkeys have been added to the barrel. Node failure is just one problem in a system. Imagine a problem and you can imagine creating a monkey to test if your system is handling that problem properly. Yury Izrailevsky talks about just this approach in this very interesting post: The Netflix Simian Army . I know what you are thinking, if monkeys are so great then why has Netflix been down lately. Dmuino addressed this potential embarrassment, putting all fears of cloud inferiority to rest: Unfortunately we're not running 100% on the cloud today. We're working on it, and we could use more help. The latest outage was caused by a com
4 0.7676971 1071 high scalability-2011-07-01-Stuff The Internet Says On Scalability For July 1, 2011
Introduction: Submitted for your scaling pleasure: Twitterers tweet 200 million tweets a day. Popular topics are eclectic, ranging from Swine Flu to Rebecca Black. Twitter has a really cool video on the global flow of tweets in the world . Worth watching. It looks like a rainbow arcing across the northern hemisphere. Amazon Cloud Now Stores 339 Billion Objects , more than doubling last years volume. Quotable quotes for independence Alex: n8foo : My fav part about the new #AWS pricing announcement - 500TB is the level where they say 'contact us'. RoeyYaniv : Scalability guidelines - Technology can and will fail. The business should not. stevedekorte : Are the folks advocating FP for scalability unaware of the Von Neumann bottleneck? nivertech : #Hadoop is a guy with a machete in front of a jungle - it made a trail, but there are new better #BigData middleware offerings in the jungle lhazlewood : I don't think I've ever had a Love/Hate relationship like I've ha
5 0.76154363 1578 high scalability-2014-01-14-Ask HS: Design and Implementation of scalable services?
Introduction: We have written agents deployed/distributed across the network. Agents sends data every 15 Secs may be even 5 secs. Working on a service/system to which all agent can post data/tuples with marginal payload. Upto 5% drop rate is acceptable. Ultimately the data will be segregated and stored into DBMS System (currently we are using MSQL). Question(s) I am looking for answer 1. Client/Server Communication: Agent(s) can post data. Status of sending data is not that important. But there is a remote where Agent(s) to be notified if the server side system generates an event based on the data sent. - Lot of advices from internet suggests using Message Bus (ActiveMQ) for async communication. Multicast and UDP are the alternatives. 2. Persistence: After some evaluation data to be stored in DBMS System. - End of processing data is an aggregated record for which MySql looks scalable. But on the volume of data is exponential. Considering HBase as an option. Looking if there are any alter
6 0.72343475 230 high scalability-2008-01-29-Speed up (Oracle) database code with result caching
7 0.71731806 388 high scalability-2008-09-23-Event: CloudCamp Silicon Valley Unconference on 30th September
8 0.71233636 862 high scalability-2010-07-20-Strategy: Consider When a Service Starts Billing in Your Algorithm Cost
9 0.7087878 1640 high scalability-2014-04-30-10 Tips for Optimizing NGINX and PHP-fpm for High Traffic Sites
10 0.70476025 1266 high scalability-2012-06-18-Google on Latency Tolerant Systems: Making a Predictable Whole Out of Unpredictable Parts
11 0.70162129 252 high scalability-2008-02-18-limit on the number of databases open
12 0.70078605 236 high scalability-2008-02-03-Ideas on how to scale a shared inventory database???
13 0.70070499 620 high scalability-2009-06-05-SSL RPC API Scalability
14 0.69942343 1017 high scalability-2011-04-06-Netflix: Run Consistency Checkers All the time to Fixup Transactions
15 0.6991564 844 high scalability-2010-06-18-Paper: The Declarative Imperative: Experiences and Conjectures in Distributed Logic
16 0.69694018 533 high scalability-2009-03-11-The Implications of Punctuated Scalabilium for Website Architecture
17 0.69650733 1229 high scalability-2012-04-17-YouTube Strategy: Adding Jitter isn't a Bug
18 0.69618237 359 high scalability-2008-07-29-Ehcache - A Java Distributed Cache
19 0.69429135 188 high scalability-2007-12-19-How can I learn to scale my project?
20 0.69415647 595 high scalability-2009-05-08-Publish-subscribe model does not scale?