high_scalability high_scalability-2012 high_scalability-2012-1379 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: A big part of engineering for a quality experience is bringing in the long tail. An improbable severe failure can ruin your experience of a site, even if your average experience is quite good. That's where building for resilience comes in. Resiliency used to be outside the realm of possibility for the common system. It was simply too complex and too expensive. An evolution has been underway, making 2013 possibly the first time resiliency is truly on the table as a standard part of system architectures. We are getting the clouds, we are getting the tools, and prices are almost low enough. Even Netflix, real leaders in the resiliency architecture game, took some heat for relying completely on Amazon's ELB and not having a backup load balancing system, leading to a prolonged Christmas Eve failure. Adrian Cockcroft, Cloud Architect at Netflix, said they've investigated creating their own load balancing service, but that "we try not to invest in undifferentiated heavy lifting.
sentIndex sentText sentNum sentScore
1 An evolution has been underway, making 2013 possibly the first time resiliency is truly on the table as a standard part of system architectures. [sent-6, score-0.43]
2 Richard Cook, Professor of Healthcare Systems Safety and Chairman of the Department of Patient Safety at the Kungliga Tekniska Högskolan, has been thinking about resilience for a long time. [sent-19, score-0.416]
3 Are we going to take the system down to fix it or fix it on the fly? [sent-56, score-0.415]
4 You rarely see pictures of people doing work, sitting in front of screens and making the system run. [sent-62, score-0.425]
5 People look at the system to see what’s going on. [sent-78, score-0.309]
6 Not in a reacting sense, but understanding what is going on in the system to figure out where it’s going and trying to make changes to deal with that. [sent-80, score-0.58]
7 Find some way to make systems designed in the mindset of 4-7 years ago work for the next 3-5 years while we are building the next systems that will be obsolete when we install them. [sent-88, score-0.547]
8 Is it possible to engineer systems ahead of time so that it is possible in operational time to have the resilience you want them to have? [sent-94, score-0.544]
9 Reveal the controls of the system to the operators. [sent-100, score-0.44]
10 When it’s 3 AM, the only people who are going to fix the problem are the operators and control people sitting in front of the screens that actually control the system. [sent-101, score-0.736]
11 If resilience is the goal then we have to develop some way of trusting people because we need them to act in situations where they are the only people available to act. [sent-104, score-0.744]
12 We do that because we know the system is going to be moved. [sent-107, score-0.309]
13 We have to start showing lift points in the system itself. [sent-108, score-0.309]
14 Designers have to present that information to operators so they can see what’s going on. [sent-116, score-0.329]
15 You need to think about the kind of situations operators are going to confront and the kinds of work they are going to do and you have to support the mental simulation they are going to do as they are trying to figure out how to make the system work or recover. [sent-117, score-1.214]
16 It turns out that to be able to reason about how the system is working, we have to have knowledge of what’s inside the black box. [sent-121, score-0.324]
17 Make resilience engineering the first priority of design for next gen systems. [sent-128, score-0.489]
18 Commit resources to discovering, understanding and supporting resilience through the system life-cycle. [sent-129, score-0.674]
19 What Cook asks for is something developers can’t deliver: such a clear understanding of a complex system that you can hold it in the palm of your hand, turn it, twist it, interrogate it, and make it dance to your tune. [sent-133, score-0.447]
20 A system will always be in large part subconscious, just like how in the human brain the conscious mind is only the smallest window on a vast subconscious mind. [sent-135, score-0.338]
wordName wordTfidf (topN-words)
[('resilience', 0.416), ('imagined', 0.205), ('operators', 0.197), ('resiliency', 0.197), ('system', 0.177), ('lift', 0.132), ('going', 0.132), ('systems', 0.128), ('situations', 0.116), ('people', 0.106), ('subconscious', 0.105), ('pumps', 0.095), ('shallow', 0.091), ('black', 0.088), ('screens', 0.082), ('things', 0.082), ('understanding', 0.081), ('cook', 0.08), ('maintenance', 0.077), ('complex', 0.076), ('constantly', 0.074), ('design', 0.073), ('mental', 0.073), ('conflicts', 0.073), ('work', 0.072), ('year', 0.07), ('hiding', 0.069), ('communities', 0.068), ('controls', 0.066), ('deep', 0.063), ('safety', 0.062), ('netflix', 0.062), ('fail', 0.061), ('sitting', 0.06), ('seemed', 0.06), ('knowledge', 0.059), ('make', 0.058), ('touch', 0.057), ('years', 0.056), ('part', 0.056), ('designs', 0.055), ('hold', 0.055), ('contact', 0.054), ('fix', 0.053), ('underway', 0.053), ('improbable', 0.053), ('archetype', 0.053), ('articlesctlab', 0.053), ('boundarieslayers', 0.053), ('confront', 0.053)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000005 1379 high scalability-2012-12-31-Designing for Resiliency will be so 2013
2 0.62072068 1366 high scalability-2012-12-03-Resiliency is the New Normal - A Deep Look at What It Means and How to Build It
Introduction: Perhaps it is because the whole world feels as if it’s riding on the edge of a jagged knife that the idea of resilience is becoming a dominant theme across so many domains. Resilience in beings first developed when cells evolved a way of maintaining inner order through homeostatic (stability through constancy) mechanisms. After homeostasis was mastered, allostasis (stability through change) developed as a way of responding to a dynamic world of challenge. In economics we have the idea of Transition Towns, which emphasizes developing local economies as a way of being resilient to global failures. In agriculture we have the idea of permaculture, building a permanent agriculture by embracing diversity, sustainability, perennial systems, avoiding monocultures, and using edge thinking. There are many more examples, including psychological resilience and the legendary resilience of ecosystems. To explore the idea of resiliency we’ll look at a
Introduction: "But it is not complicated. [There's] just a lot of it." --Richard Feynman on how the immense variety of the world arises from simple rules. Contents: Have We Reached the End of Scaling?; Applications Become Black Boxes Using Markets to Scale and Control Costs; Let's Welcome our Neo-Feudal Overlords; The Economic Argument for the Ambient Cloud; What Will Kill the Cloud?; The Amazing Collective Compute Power of the Ambient Cloud; Using the Ambient Cloud as an Application Runtime; Applications as Virtual States; Conclusion. We have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications. Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but consider we have no planet-wide applications yet. None. Tomorrow the numbers foreshadow a new Cambrian explosion of connectivity that will look as
Introduction: All in all this is still my favorite post and I still think it's an accurate vision of a future. Not everyone agrees, but I guess we'll see... "But it is not complicated. [There's] just a lot of it." --Richard Feynman on how the immense variety of the world arises from simple rules. Contents: Have We Reached the End of Scaling?; Applications Become Black Boxes Using Markets to Scale and Control Costs; Let's Welcome our Neo-Feudal Overlords; The Economic Argument for the Ambient Cloud; What Will Kill the Cloud?; The Amazing Collective Compute Power of the Ambient Cloud; Using the Ambient Cloud as an Application Runtime; Applications as Virtual States; Conclusion. We have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications. Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but c
5 0.14313985 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT
Introduction: It remains that, from the same principles, I now demonstrate the frame of the System of the World. -- Isaac Newton The practice of IT reminds me a lot of the practice of science before Isaac Newton. Aristotelianism was dead, but there was nothing to replace it. Then Newton came along, created a scientific revolution with his System of the World. And everything changed. That was New System of the World number one. New System of the World number two was written about by the incomparable Neal Stephenson in his incredible Baroque Cycle series. It explores the singular creation of a new way of organizing society grounded in new modes of thought in business, religion, politics, and science. Our modern world emerged Enlightened as it could from this roiling cauldron of forces. In IT we may have had a Leonardo da Vinci or even a Galileo, but we’ve never had our Newton. Maybe we don't need a towering genius to make everything clear? For years startups, like the frenetically inventive
6 0.12908143 920 high scalability-2010-10-15-Troubles with Sharding - What can we learn from the Foursquare Incident?
7 0.12233485 1568 high scalability-2013-12-23-What Happens While Your Brain Sleeps is Surprisingly Like How Computers Stay Sane
8 0.12176485 96 high scalability-2007-09-18-Amazon Architecture
12 0.11443356 1564 high scalability-2013-12-13-Stuff The Internet Says On Scalability For December 13th, 2013
13 0.10938746 691 high scalability-2009-08-31-Squarespace Architecture - A Grid Handles Hundreds of Millions of Requests a Month
14 0.10804138 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
15 0.10759506 1654 high scalability-2014-06-05-Cloud Architecture Revolution
16 0.10718177 1019 high scalability-2011-04-08-Stuff The Internet Says On Scalability For April 8, 2011
17 0.1068882 1418 high scalability-2013-03-06-Low Level Scalability Solutions - The Aggregation Collection
18 0.10686678 259 high scalability-2008-02-25-Any Suggestions for the Architecture Template?
19 0.10686678 260 high scalability-2008-02-25-Architecture Template Advice Needed
20 0.10550182 1359 high scalability-2012-11-15-Gone Fishin': Justin.Tv's Live Video Broadcasting Architecture
topicId topicWeight
[(0, 0.222), (1, 0.094), (2, 0.005), (3, 0.086), (4, 0.04), (5, -0.068), (6, -0.015), (7, 0.054), (8, -0.021), (9, -0.049), (10, -0.055), (11, 0.055), (12, -0.022), (13, 0.006), (14, 0.078), (15, -0.034), (16, 0.068), (17, 0.022), (18, -0.028), (19, 0.064), (20, 0.019), (21, 0.004), (22, 0.016), (23, 0.017), (24, -0.104), (25, -0.013), (26, -0.007), (27, 0.007), (28, -0.028), (29, -0.005), (30, -0.009), (31, 0.041), (32, 0.03), (33, 0.024), (34, 0.005), (35, -0.038), (36, -0.068), (37, 0.021), (38, 0.044), (39, -0.016), (40, -0.032), (41, -0.027), (42, 0.071), (43, 0.034), (44, 0.021), (45, -0.033), (46, -0.053), (47, -0.025), (48, 0.071), (49, 0.02)]
simIndex simValue blogId blogTitle
same-blog 1 0.97635096 1379 high scalability-2012-12-31-Designing for Resiliency will be so 2013
2 0.96199745 1366 high scalability-2012-12-03-Resiliency is the New Normal - A Deep Look at What It Means and How to Build It
Introduction: Perhaps it is because the whole world feels as if it’s riding on the edge of a jagged knife that the idea of resilience is becoming a dominant theme across so many domains. Resilience in beings first developed when cells evolved a way of maintaining inner order through homeostatic (stability through constancy) mechanisms. After homeostasis was mastered, allostasis (stability through change) developed as a way of responding to a dynamic world of challenge. In economics we have the idea of Transition Towns , which emphasizes developing local economies as a way of being resilient to global failures. In agriculture we have the idea of permaculture , building a permanent agriculture by embracing diversity, sustainability, perennial systems, avoiding monocultures, and using edge thinking . There are many more examples, including psychological resilience and the legendary resilience of ecosystems . To explore the idea of resiliency we’ll look at a
3 0.83387589 1492 high scalability-2013-07-17-How do you create a 100th Monkey software development culture?
Introduction: Someone reading my now ancient C++ coding standard recommendation for using doxygen to automatically generate documentation from source code, asked a great question: I've often considered using doxygen, I always ask myself - is this really useful? Would I use it if I were new to a project? Would programmers working on the project use it? I'll rephrase their question to more conveniently express a point I've thought a lot about: Why do companies put so little effort into automating their own development process to make development easier? It's like the hair stylist whose own hair looks like someone cut it using a late night infomercial vacuum cleaner attachment. Or it's like the interior decorator whose own house looks like a monk's cell. Software organizations rarely build software to make developing software easier. Why is that? Because there are three ways changes are made in an organization: Top Dog - Someone so high up in the org chart you would need
4 0.82447135 1012 high scalability-2011-03-28-Aztec Empire Strategy: Use Dual Pipes in Your Aqueduct for High Availability
Introduction: With the Chapultepec aqueduct, also named the great aqueduct, the Aztecs built a novel uninterruptible water supply for providing fresh water to Tenochtitlan, their fast growing jewel of a capital city. A section of the aqueduct is still around today. It's fun to think about how, even 600 years ago, it was built with high availability in mind. We find engineers being engineers, no matter the age: It consisted of a twin pipe distribution system made in part of compacted soil and in part of wood for the crossings of the aqueduct over the bridges built to allow the passage of the canoes. It was finished around 1466 AD, and the main purpose was to supply fresh water to Mexico-Tenochtitlan, to mitigate its thirst. The main source for the aqueduct was the spring of Chapultepec and the purpose of the twin pipes was to ease the maintenance of the system, because the water was conveyed through one pipe, and when it got dirty, the water was diverted to the other pipe
Introduction: Update 17: Are Wireless Road Trains the Cure for Traffic Congestion? BY ADDY DUGDALE. The concept of road trains--up to eight vehicles zooming down the road together--has long been considered a faster, safer, and greener way of traveling long distances by car. Update 16: The first electric vehicle in the country powered completely by ultracapacitors. The minibus can be fully recharged in fifteen minutes, unlike battery vehicles, which typically take hours to recharge. Update 15: How to Make UAVs Fully Autonomous. The Sense-and-Avoid system uses a four-megapixel camera on a pan tilt to detect obstacles from the ground. It puts red boxes around planes and birds, and blue boxes around movement that it determines is not an obstacle (e.g., dust on the lens). Update 14: ATNMBL is a concept vehicle for 2040 that represents the end of driving and an alternative approach to car design. Upon entering ATNMBL, you are presented with a simple question: "Where can I take you
6 0.81842077 1568 high scalability-2013-12-23-What Happens While Your Brain Sleeps is Surprisingly Like How Computers Stay Sane
7 0.81458879 1503 high scalability-2013-08-19-What can the Amazing Race to the South Pole Teach us About Startups?
8 0.81447262 1500 high scalability-2013-08-12-100 Curse Free Lessons from Gordon Ramsay on Building Great Software
9 0.81279337 1422 high scalability-2013-03-12-If Your System was a Symphony it Might Sound Like This...
10 0.81263554 863 high scalability-2010-07-22-How can we spark the movement of research out of the Ivory Tower and into production?
11 0.80369693 352 high scalability-2008-07-18-Robert Scoble's Rules for Successfully Scaling Startups
13 0.7824083 919 high scalability-2010-10-14-I, Cloud
15 0.77303541 1410 high scalability-2013-02-20-Smart Companies Fail Because they Do Everything Right - Staying Alive to Scale
16 0.7716186 1377 high scalability-2012-12-26-Ask HS: What will programming and architecture look like in 2020?
18 0.75783247 1225 high scalability-2012-04-09-Why My Slime Mold is Better than Your Hadoop Cluster
19 0.75608915 500 high scalability-2009-01-22-Heterogeneous vs. Homogeneous System Architectures
20 0.75052488 429 high scalability-2008-10-25-Product: Puppet the Automated Administration System
topicId topicWeight
[(1, 0.099), (2, 0.227), (4, 0.012), (5, 0.015), (10, 0.036), (27, 0.016), (30, 0.013), (37, 0.231), (40, 0.021), (61, 0.093), (79, 0.1), (85, 0.034), (94, 0.029)]
simIndex simValue blogId blogTitle
1 0.94725037 1033 high scalability-2011-05-02-The Updated Big List of Articles on the Amazon Outage
Introduction: Since The Big List Of Articles On The Amazon Outage was published we've had a few updates that people might not have seen. Amazon of course released their Summary of the Amazon EC2 and Amazon RDS Service Disruption in the US East Region. Netflix shared their Lessons Learned from the AWS Outage as did Heroku (How Heroku Survived the Amazon Outage), SmugMug (How SmugMug survived the Amazonpocalypse), and SimpleGeo (How SimpleGeo Stayed Up During the AWS Downtime). The curious thing from my perspective is the general lack of response to Amazon's explanation. I expected more discussion. There's been almost none that I've seen. My guess is very few people understand what Amazon was talking about enough to comment whereas almost everyone feels qualified to talk about the event itself. Lesson for crisis handlers: deep dive post-mortems that are timely, long, honestish, and highly technical are the most effective means of staunching the downward spiral of media attention.
2 0.93182886 1029 high scalability-2011-04-25-The Big List of Articles on the Amazon Outage
Introduction: Please see The Updated Big List Of Articles On The Amazon Outage for a new improved list. So many great articles have been written on the Amazon Outage. Some aim at being helpful, some chastise developers for being so stupid, some chastise Amazon for being so incompetent, some talk about the pain they and their companies have experienced, and some even predict the downfall of the cloud. Still others say we have seen a sea change in the future of the cloud, a prediction that's hard to disagree with, though the shape of the change remains...cloudy. I'll try to keep this list updated as more information comes out. There will be a lot for developers to consider going forward. If there's a resource you think should be added, just let me know. Amazon's Explanation of What Happened: Summary of the Amazon EC2 and Amazon RDS Service Disruption in the US East Region; Hacker News thread on AWS Service Disruption Post Mortem; Quite Funny Commentary on the Summary. Experiences f
3 0.89361072 311 high scalability-2008-04-29-Strategy: Sample to Reduce Data Set
Introduction: Update: Arjen links to the video Supporting Scalable Online Statistical Processing, which shows "rather than doing complete aggregates, use statistical sampling to provide a reasonable estimate (unbiased guess) of the result." When you have a lot of data, sampling allows you to draw conclusions from a much smaller amount of data. That's why sampling is a scalability solution. If you don't have to process all your data to get the information you need then you've made the problem smaller and you'll need fewer resources and you'll get more timely results. Sampling is not useful when you need a complete list that matches specific criteria. If you need to know the exact set of people who bought a car in the last week then sampling won't help. But, if you want to know how many people bought a car then you could take a sample and then create an estimate of the full data-set. The difference is you won't really know the exact car count. You'll have a confidence interval saying how confident
4 0.8919968 965 high scalability-2010-12-29-Pinboard.in Architecture - Pay to Play to Keep a System Small
Introduction: How do you keep a system small enough, while still being successful, that a simple scale-up strategy becomes the preferred architecture? StackOverflow, for example, could stick with a tool chain they were comfortable with because they had a natural brake on how fast they could grow: there are only so many programmers in the world. If this doesn't work for you, here's another natural braking strategy to consider: charge for your service. Paul Houle summarized this nicely as: avoid scaling problems by building a service that's profitable at a small scale. This interesting point, one I hadn't properly considered before, was brought up by Maciej Ceglowski, co-founder of Pinboard.in, in an interview with Leo Laporte and Amber MacArthur on their net@night show. Pinboard is a lean, mean, pay for bookmarking machine, a timely replacement for the nearly departed Delicious. And as a self-professed anti-social bookmarking site, it emphasizes speed over socializing. Maciej
5 0.88620085 103 high scalability-2007-09-28-Kosmos File System (KFS) is a New High End Google File System Option
Introduction: There's a new clustered file system on the spindle: Kosmos File System (KFS). Thanks to Rich Skrenta for turning me on to KFS and I think his blog post says it all. KFS is an open source project written in C++ by search startup Kosmix. The team members have a good pedigree so there's a better than average chance this software will be worth considering. After you stop trying to turn KFS into "Kentucky Fried File System" in your mind, take a look at KFS' intriguing feature set: Incremental scalability: New chunkserver nodes can be added as storage needs increase; the system automatically adapts to the new nodes. Availability: Replication is used to provide availability due to chunk server failures. Typically, files are replicated 3-way. Per file degree of replication: The degree of replication is configurable on a per file basis, with a max. limit of 64. Re-replication: Whenever the degree of replication for a file drops below the configured amount (
6 0.88596773 891 high scalability-2010-09-01-Scale-out vs Scale-up
same-blog 7 0.8819254 1379 high scalability-2012-12-31-Designing for Resiliency will be so 2013
8 0.87438691 1133 high scalability-2011-10-27-Strategy: Survive a Comet Strike in the East With Reserved Instances in the West
9 0.85407609 314 high scalability-2008-05-03-Product: nginx
10 0.84410888 1415 high scalability-2013-03-04-7 Life Saving Scalability Defenses Against Load Monster Attacks
11 0.82822788 1366 high scalability-2012-12-03-Resiliency is the New Normal - A Deep Look at What It Means and How to Build It
12 0.82615978 620 high scalability-2009-06-05-SSL RPC API Scalability
13 0.81715137 113 high scalability-2007-10-07-Paper: Architecture of a Highly Scalable NIO-Based Server
14 0.81077337 1444 high scalability-2013-04-23-Facebook Secrets of Web Performance
15 0.79894066 1107 high scalability-2011-08-29-The Three Ages of Google - Batch, Warehouse, Instant
16 0.79325801 1418 high scalability-2013-03-06-Low Level Scalability Solutions - The Aggregation Collection
17 0.78790075 279 high scalability-2008-03-17-Microsoft's New Database Cloud Ready to Rumble with Amazon
18 0.779333 1070 high scalability-2011-06-29-Second Hand Seizure : A New Cause of Site Death
19 0.77490479 628 high scalability-2009-06-13-Neo4j - a Graph Database that Kicks Buttox
20 0.77437395 1509 high scalability-2013-08-30-Stuff The Internet Says On Scalability For August 30, 2013