high_scalability high_scalability-2013 high_scalability-2013-1418 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: What good are problems without solutions? In 42 Monster Problems That Attack As Loads Increase we talked about problems. In this first post (OK, there was an earlier post, but I'm doing some reorganizing), we'll cover what I call aggregation strategies. Keep in mind these are low level architecture type suggestions of how to structure the components of your code and how they interact. We're not talking about massive scale-out clusters here, but about what your applications might look like internally, way below the service interface level. There's a lot more to the world than evented architectures. Aggregation simply means we aren't using stupid queues. Our queues will be smart. We are deeply aware of queues as containers of work that eventually dictate how the entire system performs. As work containers we know intimately what requests and data sit in our queues and we can use that intelligence to our great advantage. Prioritize Work The key idea to it all is an almost mi
sentIndex sentText sentNum sentScore
1 We are deeply aware of queues as containers of work that eventually dictate how the entire system performs. [sent-9, score-0.508]
2 As work containers we know intimately what requests and data sit in our queues and we can use that intelligence to our great advantage. [sent-10, score-0.668]
3 Prioritize Work The key idea to it all is an almost mindful approach to design that has programmers consider as a first class concept the priority of what work gets done, why it gets done, and when it gets done, in every aspect of their creation. [sent-11, score-0.748]
4 Preventing Cascading Failures The most treacherous example of why prioritizing work is important is in preventing cascading failures. [sent-12, score-0.469]
5 Naive systems, without an idea of priority, will let useless control or data plane traffic elbow out essential control traffic when failures occur. [sent-13, score-0.696]
6 If you need to send a request to a switch to reprogram a route to failover traffic and there's head-of-line blocking on a low priority work item, then you'll never be able to get control of your system without shutting the whole thing down. [sent-14, score-0.873]
7 Chatty programs that don't know what's going on will pump out endless low priority control and data traffic that keeps your system busy doing nothing useful at all. [sent-15, score-0.722]
8 A priority aware system will try to prevent useless work from being done while making sure high priority work gets done when it needs to get done. [sent-17, score-1.73]
9 You'll have a separate network for control and data so control messages always go through. [sent-18, score-0.407]
10 You'll have intelligent retry policies so useless work doesn't pile up in queues. [sent-19, score-0.386]
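As a rough sketch of such a retry policy (the limits, the WorkItem shape, and the names are my assumptions, not from the post), work that is too old or has failed too often is dropped instead of being requeued:

import time
from dataclasses import dataclass, field

MAX_ATTEMPTS = 3        # illustrative limits, not from the post
MAX_AGE_SECONDS = 30.0

@dataclass
class WorkItem:
    payload: object
    attempts: int = 0
    enqueued_at: float = field(default_factory=time.monotonic)

def should_retry(item: WorkItem) -> bool:
    # Requeue only work that is still fresh and hasn't exhausted its attempts;
    # everything else is discarded so useless work never piles up in the queue.
    too_old = time.monotonic() - item.enqueued_at > MAX_AGE_SECONDS
    return item.attempts < MAX_ATTEMPTS and not too_old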
11 When the newest save-the-day control message is available, current work is stopped so the higher priority item can be processed, and its messages will go out immediately as opposed to sitting back and waiting. [sent-22, score-0.885]
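A minimal sketch of that behavior, assuming a two-level priority scheme of my own (CONTROL above DATA): the dispatcher always pops control work before any previously queued data work, so the save-the-day message never waits.

import heapq
import itertools

CONTROL, DATA = 0, 1        # lower number = higher priority (assumed scheme)
_seq = itertools.count()    # tie-breaker preserves FIFO order within a priority

class PriorityWorkQueue:
    def __init__(self):
        self._heap = []

    def put(self, priority, task):
        heapq.heappush(self._heap, (priority, next(_seq), task))

    def get(self):
        # A just-arrived CONTROL item is popped before older DATA items, so a
        # failover reprogramming message never sits behind low priority work.
        _, _, task = heapq.heappop(self._heap)
        return task

q = PriorityWorkQueue()
q.put(DATA, "replicate shard 7")
q.put(CONTROL, "reprogram route after failover")
assert q.get() == "reprogram route after failover"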
12 When we talk about using priority we are also talking about how a system can react smartly and predictably under fully loaded conditions. [sent-28, score-0.502]
13 You have a high priority client that has paid for better service than the rabble. [sent-35, score-0.41]
14 For them to achieve their SLA all the components of your system must be conditioned to understand priority while still not starving out others and still giving them good service. [sent-36, score-0.563]
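One hedged way to express that (the two queues and the 4:1 share are illustrative assumptions, not from the post): give the paying client most of the dispatch slots while guaranteeing the other clients still make progress.

from collections import deque

premium, standard = deque(), deque()
PREMIUM_SHARE = 4           # assumed ratio: roughly 4 premium slots per standard slot

def next_task(tick):
    # Weighted round-robin: premium work gets most slots, but every fifth slot
    # is reserved for standard work so no one is starved.
    if tick % (PREMIUM_SHARE + 1) != 0 and premium:
        return premium.popleft()
    if standard:
        return standard.popleft()
    return premium.popleft() if premium else None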
15 Merge Aggregation In merge aggregation separate pieces of data and/or commands are merged into one piece. [sent-39, score-0.385]
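A sketch of one way to do that (the pending map keyed by target is my structure, not the post's): successive updates to the same key are merged into a single pending command before anything goes out.

pending = {}                  # key -> one merged update waiting to be sent

def enqueue_update(key, fields):
    # Fold new fields into any update already queued for this key instead of
    # appending another message behind it.
    if key in pending:
        pending[key].update(fields)
    else:
        pending[key] = dict(fields)

enqueue_update("user:42", {"name": "ada"})
enqueue_update("user:42", {"email": "ada@example.com"})
# One merged message goes out instead of two.
assert pending["user:42"] == {"name": "ada", "email": "ada@example.com"}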
16 Delete Aggregation In delete aggregation work is deleted whenever possible. [sent-48, score-0.507]
17 For example, take the following operations: create, delete. Delete aggregation would cause the create and delete operations to be dropped before any messages were ever sent out. [sent-49, score-0.39]
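A minimal sketch of that cancellation, reusing the same pending-map idea (my structure, with the operation names taken from the example above): a create that is still unsent annihilates with a later delete.

def enqueue(pending, key, op):
    # If the object was created but never announced, create + delete cancel out
    # and nothing is ever sent for this key.
    if op == "delete" and pending.get(key) == "create":
        del pending[key]
    else:
        pending[key] = op

pending = {}
enqueue(pending, "conn:9", "create")
enqueue(pending, "conn:9", "delete")
assert "conn:9" not in pending      # no messages go out at all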
18 In change aggregation we keep the state that something changed, so no matter how many times it has changed we only note that it has changed once. [sent-58, score-0.455]
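A sketch, assuming a simple dirty set of my own devising: however many times an item changes between flushes, it is reported exactly once.

dirty = set()

def mark_changed(item_id):
    dirty.add(item_id)            # a thousand changes collapse into one membership bit

def flush_changes(notify):
    # Send a single "changed" notification per item, however often it changed.
    for item_id in list(dirty):
        notify(item_id)
    dirty.clear()

for _ in range(1000):
    mark_changed("port-3")
flush_changes(print)              # prints "port-3" once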
19 Otherwise we can create alarm storms when hardware is defective or when wave after wave of a condition hits. [sent-64, score-0.318]
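One hedged sketch of suppressing such storms (the edge-triggered bookkeeping is my assumption): raise an alarm only on the transition into a bad state and stay silent until it clears, so a flapping or defective device cannot generate wave after wave of identical alerts.

raised = set()

def report(condition, active, alert):
    # Edge-triggered alarms: only state transitions produce messages.
    if active and condition not in raised:
        raised.add(condition)
        alert("RAISED " + condition)
    elif not active and condition in raised:
        raised.remove(condition)
        alert("CLEARED " + condition)

for _ in range(500):
    report("fan-2-failed", True, print)   # prints "RAISED fan-2-failed" exactly once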
20 This intelligence system produces a constant stream of signals that can be used to control how other parts of the system behave. [sent-69, score-0.52]
wordName wordTfidf (topN-words)
[('priority', 0.41), ('aggregationin', 0.26), ('intelligence', 0.18), ('useless', 0.167), ('control', 0.156), ('work', 0.151), ('delete', 0.146), ('aggregation', 0.134), ('cascading', 0.124), ('merged', 0.119), ('queues', 0.115), ('prioritizing', 0.112), ('changed', 0.107), ('alarm', 0.101), ('done', 0.1), ('messages', 0.095), ('system', 0.092), ('idea', 0.089), ('monster', 0.088), ('resources', 0.088), ('happening', 0.085), ('containers', 0.084), ('gets', 0.083), ('example', 0.082), ('conscious', 0.081), ('sit', 0.077), ('deleted', 0.076), ('wave', 0.076), ('item', 0.073), ('operations', 0.073), ('attack', 0.072), ('infinite', 0.071), ('merge', 0.069), ('policies', 0.068), ('aware', 0.066), ('aggregationfrom', 0.065), ('createdelete', 0.065), ('createupdateupdatethese', 0.065), ('defective', 0.065), ('everyting', 0.065), ('flapping', 0.065), ('traffic', 0.064), ('scheduling', 0.063), ('pieces', 0.063), ('reorganizing', 0.061), ('intimately', 0.061), ('commonality', 0.061), ('touse', 0.061), ('junk', 0.061), ('conditioned', 0.061)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000005 1418 high scalability-2013-03-06-Low Level Scalability Solutions - The Aggregation Collection
Introduction: What good are problems without solutions? In 42 Monster Problems That Attack As Loads Increase we talked about problems. In this first post (OK, there was an earlier post, but I'm doing some reorganizing), we'll cover what I call aggregation strategies. Keep in mind these are low level architecture type suggestions of how to structure the components of your code and how they interact. We're not talking about massive scale-out clusters here, but about what your applications might look like internally, way below the service interface level. There's a lot more to the world than evented architectures. Aggregation simply means we aren't using stupid queues. Our queues will be smart. We are deeply aware of queues as containers of work that eventually dictate how the entire system performs. As work containers we know intimately what requests and data sit in our queues and we can use that intelligence to our great advantage. Prioritize Work The key idea to it all is an almost mi
2 0.49108973 1415 high scalability-2013-03-04-7 Life Saving Scalability Defenses Against Load Monster Attacks
Introduction: We talked about 42 Monster Problems That Attack As Loads Increase . Here are a few ways you can defend yourself, secrets revealed by scaling masters across the ages. Note that these are low level programming level moves, not large architecture type strategies. Use Resources Proportional To a Fixed Limit This is probably the most important rule for achieving scalability within an application. What it means: Find the resource that has a fixed limit that you know you can support. For example, a guarantee to handle a certain number of objects in memory. So if we always use resources proportional to the number of objects it is likely we can prevent resource exhaustion. Devise ways of tying what you need to do to the individual resources. Some examples: Keep a list of purchase orders with line items over $20 (or whatever). Do not keep a list of the line items because the number of items can be much larger than the number of purchase orders. You have kept the resource usage
3 0.37499589 1425 high scalability-2013-03-18-Beyond Threads and Callbacks - Application Architecture Pros and Cons
Introduction: There's not a lot of talk about application architectures at the process level. You have your threads, pools of threads, and you have your callback models. That's about it. Languages/frameworks making a virtue out of simple models, like Go and Erlang, do so at the price of control. It's difficult to make a low latency, well conditioned application when a powerful tool, like work scheduling, is taken out of the hands of the programmer. But that's not all there is, my friend. We'll dive into different ways an application can be composed across threads of control. Your favorite language may not give you access to all the capabilities we are going to talk about, but lately there has been a sort of revival in considering performance important, especially for controlling latency variance, so I think it's time to talk about these kinds of issues. When it was do everything in the thread of a web server thread pool none of these issues really mattered. But now that developers are creating
Introduction: Hidden in every computer is a hardware backplane for moving signals around. Hidden in every application are ways of moving messages around and giving code CPU time to process them. Unhiding those capabilities and making them first class facilities for the programmer to control is the idea behind AppBackplane. This goes directly against the trend of hiding everything from the programmer and doing it all automagically. Which is great, until it doesn't work. Then it sucks. And the approach of giving the programmer all the power also sucks, until it's tuned to work together and performance is incredible even under increasing loads. Then it's great. These are two different curves going in opposite directions. You need to decide for your application which curve you need to be on. AppBackplane is an example framework supporting the multiple application architectures we talked about in Beyond Threads And Callbacks . It provides a scheduling system that supports continuous and high loa
5 0.32153851 1421 high scalability-2013-03-11-Low Level Scalability Solutions - The Conditioning Collection
Introduction: We talked about 42 Monster Problems That Attack As Loads Increase . And in The Aggregation Collection we talked about the value of prioritizing work and making smart queues as a way of absorbing and not reflecting traffic spikes. Now we move on to our next batch of strategies where the theme is conditioning , which is the idea of shaping and controlling flows of work within your application... Use Resources Proportional To a Fixed Limit This is probably the most important rule for achieving scalability within an application. What it means: Find the resource that has a fixed limit that you know you can support. For example, a guarantee to handle a certain number of objects in memory. So if we always use resources proportional to the number of objects it is likely we can prevent resource exhaustion. Devise ways of tying what you need to do to the individual resources. Some examples: Keep a list of purchase orders with line items over $20 (or whatever). Do not keep
6 0.30998394 1413 high scalability-2013-02-27-42 Monster Problems that Attack as Loads Increase
7 0.20581624 406 high scalability-2008-10-08-Strategy: Flickr - Do the Essential Work Up-front and Queue the Rest
10 0.13358168 920 high scalability-2010-10-15-Troubles with Sharding - What can we learn from the Foursquare Incident?
11 0.13282871 1568 high scalability-2013-12-23-What Happens While Your Brain Sleeps is Surprisingly Like How Computers Stay Sane
12 0.13159901 1622 high scalability-2014-03-31-How WhatsApp Grew to Nearly 500 Million Users, 11,000 cores, and 70 Million Messages a Second
13 0.12717409 464 high scalability-2008-12-13-Strategy: Facebook Tweaks to Handle 6 Time as Many Memcached Requests
14 0.12680736 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT
15 0.12606658 1266 high scalability-2012-06-18-Google on Latency Tolerant Systems: Making a Predictable Whole Out of Unpredictable Parts
16 0.12323773 960 high scalability-2010-12-20-Netflix: Use Less Chatty Protocols in the Cloud - Plus 26 Fixes
17 0.12187473 96 high scalability-2007-09-18-Amazon Architecture
18 0.12152289 1373 high scalability-2012-12-17-11 Uses For the Humble Presents Queue, er, Message Queue
19 0.11931133 910 high scalability-2010-09-30-Facebook and Site Failures Caused by Complex, Weakly Interacting, Layered Systems
20 0.11868001 21 high scalability-2007-07-23-GoogleTalk Architecture
topicId topicWeight
[(0, 0.216), (1, 0.131), (2, 0.002), (3, -0.004), (4, -0.014), (5, -0.03), (6, 0.099), (7, 0.102), (8, -0.147), (9, -0.118), (10, 0.011), (11, 0.137), (12, 0.007), (13, -0.063), (14, 0.067), (15, -0.047), (16, 0.02), (17, -0.029), (18, -0.012), (19, 0.047), (20, -0.008), (21, -0.074), (22, 0.058), (23, -0.001), (24, 0.045), (25, 0.007), (26, 0.092), (27, 0.128), (28, 0.08), (29, -0.015), (30, 0.008), (31, 0.011), (32, 0.12), (33, 0.05), (34, 0.069), (35, -0.002), (36, -0.046), (37, -0.044), (38, -0.007), (39, 0.006), (40, 0.006), (41, -0.028), (42, 0.022), (43, 0.049), (44, -0.061), (45, -0.091), (46, -0.039), (47, 0.011), (48, -0.011), (49, 0.033)]
simIndex simValue blogId blogTitle
same-blog 1 0.97188473 1418 high scalability-2013-03-06-Low Level Scalability Solutions - The Aggregation Collection
Introduction: What good are problems without solutions? In 42 Monster Problems That Attack As Loads Increase we talked about problems. In this first post (OK, there was an earlier post, but I'm doing some reorganizing), we'll cover what I call aggregation strategies. Keep in mind these are low level architecture type suggestions of how to structure the components of your code and how they interact. We're not talking about massive scale-out clusters here, but about what your applications might look like internally, way below the service interface level. There's a lot more to the world than evented architectures. Aggregation simply means we aren't using stupid queues. Our queues will be smart. We are deeply aware of queues as containers of work that eventually dictate how the entire system performs. As work containers we know intimately what requests and data sit in our queues and we can use that intelligence to our great advantage. Prioritize Work The key idea to it all is an almost mi
2 0.91905332 1429 high scalability-2013-03-25-AppBackplane - A Framework for Supporting Multiple Application Architectures
Introduction: Hidden in every computer is a hardware backplane for moving signals around. Hidden in every application are ways of moving messages around and giving code CPU time to process them. Unhiding those capabilities and making them first class facilities for the programmer to control is the idea behind AppBackplane. This goes directly against the trend of hiding everything from the programmer and doing it all automagically. Which is great, until it doesn't work. Then it sucks. And the approach of giving the programmer all the power also sucks, until it's tuned to work together and performance is incredible even under increasing loads. Then it's great. These are two different curves going in opposite directions. You need to decide for your application which curve you need to be on. AppBackplane is an example framework supporting the multiple application architectures we talked about in Beyond Threads And Callbacks . It provides a scheduling system that supports continuous and high loa
3 0.90091902 1415 high scalability-2013-03-04-7 Life Saving Scalability Defenses Against Load Monster Attacks
Introduction: We talked about 42 Monster Problems That Attack As Loads Increase . Here are a few ways you can defend yourself, secrets revealed by scaling masters across the ages. Note that these are low level programming level moves, not large architecture type strategies. Use Resources Proportional To a Fixed Limit This is probably the most important rule for achieving scalability within an application. What it means: Find the resource that has a fixed limit that you know you can support. For example, a guarantee to handle a certain number of objects in memory. So if we always use resources proportional to the number of objects it is likely we can prevent resource exhaustion. Devise ways of tying what you need to do to the individual resources. Some examples: Keep a list of purchase orders with line items over $20 (or whatever). Do not keep a list of the line items because the number of items can be much larger than the number of purchase orders. You have kept the resource usage
4 0.87876189 1421 high scalability-2013-03-11-Low Level Scalability Solutions - The Conditioning Collection
Introduction: We talked about 42 Monster Problems That Attack As Loads Increase . And in The Aggregation Collection we talked about the value of prioritizing work and making smart queues as a way of absorbing and not reflecting traffic spikes. Now we move on to our next batch of strategies where the theme is conditioning , which is the idea of shaping and controlling flows of work within your application... Use Resources Proportional To a Fixed Limit This is probably the most important rule for achieving scalability within an application. What it means: Find the resource that has a fixed limit that you know you can support. For example, a guarantee to handle a certain number of objects in memory. So if we always use resources proportional to the number of objects it is likely we can prevent resource exhaustion. Devise ways of tying what you need to do to the individual resources. Some examples: Keep a list of purchase orders with line items over $20 (or whatever). Do not keep
5 0.87750947 1425 high scalability-2013-03-18-Beyond Threads and Callbacks - Application Architecture Pros and Cons
Introduction: There's not a lot of talk about application architectures at the process level. You have your threads, pools of threads, and you have your callback models. That's about it. Languages/frameworks making a virtue out of simple models, like Go and Erlang, do so at the price of control. It's difficult to make a low latency, well conditioned application when a powerful tool, like work scheduling, is taken out of the hands of the programmer. But that's not all there is, my friend. We'll dive into different ways an application can be composed across threads of control. Your favorite language may not give you access to all the capabilities we are going to talk about, but lately there has been a sort of revival in considering performance important, especially for controlling latency variance, so I think it's time to talk about these kinds of issues. When it was do everything in the thread of a web server thread pool none of these issues really mattered. But now that developers are creating
6 0.83248293 406 high scalability-2008-10-08-Strategy: Flickr - Do the Essential Work Up-front and Queue the Rest
7 0.83071047 1413 high scalability-2013-02-27-42 Monster Problems that Attack as Loads Increase
8 0.79863691 1454 high scalability-2013-05-08-Typesafe Interview: Scala + Akka is an IaaS for Your Process Architecture
9 0.77374601 1373 high scalability-2012-12-17-11 Uses For the Humble Presents Queue, er, Message Queue
10 0.75963873 772 high scalability-2010-02-05-High Availability Principle : Concurrency Control
11 0.71634746 398 high scalability-2008-09-30-Scalability Worst Practices
12 0.70934421 1001 high scalability-2011-03-09-Google and Netflix Strategy: Use Partial Responses to Reduce Request Sizes
13 0.70470434 1591 high scalability-2014-02-05-Little’s Law, Scalability and Fault Tolerance: The OS is your bottleneck. What you can do?
14 0.68724811 960 high scalability-2010-12-20-Netflix: Use Less Chatty Protocols in the Cloud - Plus 26 Fixes
15 0.6862241 1568 high scalability-2013-12-23-What Happens While Your Brain Sleeps is Surprisingly Like How Computers Stay Sane
16 0.67905492 350 high scalability-2008-07-15-ZooKeeper - A Reliable, Scalable Distributed Coordination System
17 0.67585921 317 high scalability-2008-05-10-Hitting 300 SimbleDB Requests Per Second on a Small EC2 Instance
18 0.67361224 295 high scalability-2008-04-02-Product: Supervisor - Monitor and Control Your Processes
19 0.66247964 1622 high scalability-2014-03-31-How WhatsApp Grew to Nearly 500 Million Users, 11,000 cores, and 70 Million Messages a Second
20 0.65738273 1229 high scalability-2012-04-17-YouTube Strategy: Adding Jitter isn't a Bug
topicId topicWeight
[(1, 0.072), (2, 0.266), (10, 0.076), (30, 0.023), (37, 0.076), (40, 0.023), (61, 0.075), (77, 0.037), (79, 0.11), (85, 0.042), (94, 0.036), (96, 0.09)]
simIndex simValue blogId blogTitle
same-blog 1 0.95091206 1418 high scalability-2013-03-06-Low Level Scalability Solutions - The Aggregation Collection
Introduction: What good are problems without solutions? In 42 Monster Problems That Attack As Loads Increase we talked about problems. In this first post (OK, there was an earlier post, but I'm doing some reorganizing), we'll cover what I call aggregation strategies. Keep in mind these are low level architecture type suggestions of how to structure the components of your code and how they interact. We're not talking about massive scale-out clusters here, but about what your applications might look like internally, way below the service interface level. There's a lot more to the world than evented architectures. Aggregation simply means we aren't using stupid queues. Our queues will be smart. We are deeply aware of queues as containers of work that eventually dictate how the entire system performs. As work containers we know intimately what requests and data sit in our queues and we can use that intelligence to our great advantage. Prioritize Work The key idea to it all is an almost mi
2 0.94251275 1035 high scalability-2011-05-05-Paper: A Study of Practical Deduplication
Introduction: With BigData comes BigStorage costs. One way to store less is simply not to store the same data twice. That's the radically simple and powerful notion behind data deduplication. If you are one of those who got a good laugh out of the idea of eliminating SQL queries as a rather obvious scalability strategy, you'll love this one, but it is a powerful feature and one I don't hear talked about outside the enterprise. A parallel idea in programming is the once-and-only-once principle of never duplicating code. Using deduplication technology, for some upfront CPU usage, which is a plentiful resource in many systems that are IO bound anyway, it's possible to reduce storage requirements by up to 20:1, depending on your data, which saves both money and disk write overhead. This comes up because of a really good article Robin Harris of StorageMojo wrote, All de-dup works, on a paper, A Study of Practical Deduplication by Dutch Meyer and William Bolosky. For a great explanation o
3 0.94236952 1549 high scalability-2013-11-15-Stuff The Internet Says On Scalability For November 15th, 2013
Introduction: Hey, it's HighScalability time: Test your sense of scale. Is this image of something microscopic or macroscopic? Find out . Quotable Quotes: fidotron : It feels like we've gone in one big circle, where first we move the DB on to a separate machine for performance, yet now more computation will go back to being done nearer the data (like Hadoop) and we'll try to pretend it's all just one giant computer again. @pbailis : Building systems from the ground up with distribution, scale, and availability in mind is much easier than retrofitting single-node systems. @merv : #awsreinvent Jassy: Netflix has 10,000s of EC2 instances. They are the final deployment scenario: All In. And others are coming. Edward Capriolo : YARN... Either it is really complicated or I have brain damage @djspiewak : Eventually, Node.js will reinvent the “IO promise” and realize that flattening your callback effects is actually quite nice. @jimblomo : A Note on Dis
4 0.93849349 1415 high scalability-2013-03-04-7 Life Saving Scalability Defenses Against Load Monster Attacks
Introduction: We talked about 42 Monster Problems That Attack As Loads Increase . Here are a few ways you can defend yourself, secrets revealed by scaling masters across the ages. Note that these are low level programming level moves, not large architecture type strategies. Use Resources Proportional To a Fixed Limit This is probably the most important rule for achieving scalability within an application. What it means: Find the resource that has a fixed limit that you know you can support. For example, a guarantee to handle a certain number of objects in memory. So if we always use resources proportional to the number of objects it is likely we can prevent resource exhaustion. Devise ways of tying what you need to do to the individual resources. Some examples: Keep a list of purchase orders with line items over $20 (or whatever). Do not keep a list of the line items because the number of items can be much larger than the number of purchase orders. You have kept the resource usage
5 0.93799686 965 high scalability-2010-12-29-Pinboard.in Architecture - Pay to Play to Keep a System Small
Introduction: How do you keep a system small enough, while still being successful, that a simple scale-up strategy becomes the preferred architecture? StackOverflow, for example, could stick with a tool chain they were comfortable with because they had a natural brake on how fast they could grow: there are only so many programmers in the world. If this doesn't work for you, here's another natural braking strategy to consider: charge for your service. Paul Houle summarized this nicely as: avoid scaling problems by building a service that's profitable at a small scale. This interesting point, one I hadn't properly considered before, was brought up by Maciej Ceglowski, co-founder of Pinboard.in, in an interview with Leo Laporte and Amber MacArthur on their net@night show. Pinboard is a lean, mean, pay for bookmarking machine, a timely replacement for the nearly departed Delicious. And as a self professed anti-social bookmarking site, it emphasizes speed over socializing. Maciej
6 0.93485516 103 high scalability-2007-09-28-Kosmos File System (KFS) is a New High End Google File System Option
7 0.93421531 1379 high scalability-2012-12-31-Designing for Resiliency will be so 2013
9 0.93150502 620 high scalability-2009-06-05-SSL RPC API Scalability
10 0.93036765 117 high scalability-2007-10-08-Paper: Understanding and Building High Availability-Load Balanced Clusters
11 0.92839473 1221 high scalability-2012-04-03-Hazelcast 2.0: Big Data In-Memory
12 0.92805946 1207 high scalability-2012-03-12-Google: Taming the Long Latency Tail - When More Machines Equals Worse Results
13 0.92563754 1475 high scalability-2013-06-13-Busting 4 Modern Hardware Myths - Are Memory, HDDs, and SSDs Really Random Access?
14 0.92290533 1228 high scalability-2012-04-16-Instagram Architecture Update: What’s new with Instagram?
15 0.92237759 1204 high scalability-2012-03-06-Ask For Forgiveness Programming - Or How We'll Program 1000 Cores
16 0.92210633 306 high scalability-2008-04-21-The Search for the Source of Data - How SimpleDB Differs from a RDBMS
17 0.92049009 76 high scalability-2007-08-29-Skype Failed the Boot Scalability Test: Is P2P fundamentally flawed?
19 0.91926378 661 high scalability-2009-07-25-Latency is Everywhere and it Costs You Sales - How to Crush it
20 0.91924638 1438 high scalability-2013-04-10-Check Yourself Before You Wreck Yourself - Avocado's 5 Early Stages of Architecture Evolution