high_scalability high_scalability-2009 high_scalability-2009-674 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Update 2: Elastic Load Balancer and EC2 instance bandwidth. It turns out we are limited by bandwidth and not by CPU. Solution: use DNS Round Robin for two to three HighCPU medium instances. Update: The Skinny Straw: Cloud Computing's Bottleneck and How to Address It. For cloud computing, bandwidth to and from the cloud provider is a bottleneck. Solution: Evaluate application architecture and consider application partitioning. I'm writing this post as a sort of penance. My sin was getting involved in another multi-threaded mess of a program that was rife with strange pauses and unexpected errors. I really should have known better. But when APIs choose to make callbacks from some mystery thread pool it's hard to keep things straight. I eventually sobered up and posted all events to a queue so I could make sure the program would work correctly. Doh. I may never know why the .Net console output stopped working, but I'll live with it. And that reminded me that I've been m
sentIndex sentText sentNum sentScore
1 For cloud computing, bandwidth to and from the cloud provider is a bottleneck. [sent-5, score-0.364]
2 I eventually sobered up and posted all events to a queue so I could make sure the program would work correctly. [sent-11, score-0.478]
3 The easiest way to create a scalable service is to compose the service from other scalable services. [sent-17, score-0.32]
4 The canonical cloud architecture that has evolved revolves around dynamically scalable CPUs consuming asynchronous, persistently queued events. [sent-19, score-0.496]
5 SmugMug's Cloud Architecture: AWS pioneer Don MacAskill of SmugMug details how they process high-resolution photos and high-definition video using a cloud-hosted queuing architecture in SkyNet Lives! [sent-27, score-0.375]
6 SkyNet, as you might expect, operates completely without human minders and automatically scales up and down in relation to the work load. [sent-29, score-0.291]
7 Their system has several components: Work Initiators - Work comes in from your website and/or other software subsystems and is queued up for processing in the Queue Service. [sent-30, score-0.263]
8 Provisioning Service - This is Amazon's infrastructure that allows instances to be automatically scaled up and down in relation to the work load. [sent-35, score-0.291]
9 Queuing Service - This is where work is queued for consumption by the workers. [sent-40, score-0.239]
10 Creating a scalable, distributed, performant, highly available queue service is not easy, so you may want to take a look at a number of different queue product suggestions in Flickr - Do the Essential Work Up-front and Queue the Rest. [sent-42, score-0.493]
11 Controller - This component monitors many variables related to the work flow and decides how many instances of EC2 are necessary based on optimizing a small set of goals. [sent-43, score-0.267]
12 Don shares a lot of practical details on how to use AWS efficiently, how their queue service works, and how their controller balances minimizing cost against staying responsive to users. [sent-45, score-0.283]
13 Achieving fairness and balance in a queue system can be difficult, but SmugMug appears to have done a good job of that. [sent-46, score-0.264]
14 If one component is producing events too fast the queue will buffer up events until they can be processed. [sent-50, score-0.48]
15 The idea is to take an unpredictable but possibly large number of search requests, apply the search expression to hundreds of terabytes of documents, and return the results in a reasonable period of time. [sent-63, score-0.268]
16 To coordinate and dispatch work you need a queuing service like SQS. [sent-67, score-0.35]
17 The paper makes several key architectural recommendations: Use Scalable Ingredients - Ensure that your application is scalable by designing each component to be scalable on its own. [sent-75, score-0.302]
18 If every component implements a service interface, responsible for its own scalability in all appropriate dimensions, then the overall system will have a scalable base. [sent-76, score-0.342]
19 If any component fails (and failures happen all the time), the system should automatically alert, failover, and re-sync back to the “last known state” as if nothing had failed. [sent-83, score-0.258]
20 The cloud makes all the necessary components standard, featureful, and relatively inexpensive. [sent-91, score-0.28]
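The sentences above describe work initiators, a queuing service, workers, and a controller that sizes the worker pool. A minimal in-process sketch of that shape follows. It is not SmugMug's or Amazon's implementation: Python's standard `queue.Queue` stands in for a service like SQS, threads stand in for EC2 instances, and `desired_workers`, `worker`, `run`, and the `jobs_per_worker` parameter are hypothetical names chosen for illustration.

```python
import math
import queue
import threading


def desired_workers(backlog, jobs_per_worker, min_workers=1, max_workers=10):
    """Hypothetical controller rule: size the pool to the queue backlog."""
    needed = math.ceil(backlog / jobs_per_worker) if backlog else 0
    return max(min_workers, min(max_workers, needed))


def worker(jobs, results):
    """Pull jobs until a None sentinel arrives; jobs are independent of each other."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        results.put(job * job)  # stand-in for real work, e.g. resizing a photo
        jobs.task_done()


def run(job_values, jobs_per_worker=5):
    jobs, results = queue.Queue(), queue.Queue()
    # Work initiators: enqueue everything; the queue buffers the burst.
    for value in job_values:
        jobs.put(value)
    # Controller: decide how many workers the current backlog justifies.
    count = desired_workers(jobs.qsize(), jobs_per_worker)
    threads = [
        threading.Thread(target=worker, args=(jobs, results)) for _ in range(count)
    ]
    for t in threads:
        t.start()
    for _ in threads:
        jobs.put(None)  # one shutdown sentinel per worker
    jobs.join()
    for t in threads:
        t.join()
    return count, sorted(results.get() for _ in range(results.qsize()))
```

Because producers and consumers only share the queue, the worker count can change without either side knowing about the other, which is the loose coupling the summary sentences point at. A real controller would presumably re-evaluate the backlog periodically rather than once.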
wordName wordTfidf (topN-words)
[('greptheweb', 0.352), ('smugmug', 0.253), ('queue', 0.21), ('cloud', 0.182), ('work', 0.139), ('queuing', 0.138), ('component', 0.128), ('simpledb', 0.126), ('pipelines', 0.124), ('aws', 0.115), ('processing', 0.109), ('inflickr', 0.106), ('queued', 0.1), ('components', 0.098), ('structuring', 0.089), ('scalable', 0.087), ('seasonal', 0.084), ('websites', 0.083), ('essential', 0.082), ('season', 0.081), ('nightly', 0.079), ('expression', 0.078), ('aka', 0.077), ('results', 0.076), ('automatically', 0.076), ('relation', 0.076), ('reboot', 0.075), ('service', 0.073), ('canonical', 0.072), ('events', 0.071), ('sleep', 0.071), ('parallelization', 0.071), ('tight', 0.069), ('sqs', 0.068), ('requests', 0.067), ('loosely', 0.066), ('intermediate', 0.065), ('though', 0.063), ('millions', 0.062), ('automated', 0.062), ('independently', 0.059), ('lives', 0.059), ('program', 0.058), ('removed', 0.057), ('search', 0.057), ('storing', 0.056), ('builds', 0.055), ('parts', 0.055), ('architecture', 0.055), ('system', 0.054)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000005 674 high scalability-2009-08-07-The Canonical Cloud Architecture
2 0.24262491 38 high scalability-2007-07-30-Build an Infinitely Scalable Infrastructure for $100 Using Amazon Services
Introduction: Can you really create an infinitely scalable infrastructure for less than $100 using Amazon's storage, grid, and queuing services platform? It appears so, at least for the right application. Amazon beams a spotlight on the future battle of the roll-your-own versus the connect-the-dots approach to building next generation websites using core external services. Their argument is strong. Using Amazon's platform you can quickly build an infrastructure that would otherwise take an eternity to make, a pile of money to create, and an unbounded mass of people to implement and maintain. Yet Amazon doesn't provide SLAs, so can you really trust them with your crown jewels? Facebook recently leapfrogged Amazon's vision with an even more comprehensive set of services. The battle for the future is on. Site: http://aws.amazon.com/ Information Sources Slides: Building Highly Scalable Web Applications Podcast: Technometria: Amazon Web Services Amazon Services Home. Platform
3 0.20059194 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT
Introduction: It remains that, from the same principles, I now demonstrate the frame of the System of the World. -- Isaac Newton The practice of IT reminds me a lot of the practice of science before Isaac Newton. Aristotelianism was dead, but there was nothing to replace it. Then Newton came along, created a scientific revolution with his System of the World. And everything changed. That was New System of the World number one. New System of the World number two was written about by the incomparable Neal Stephenson in his incredible Baroque Cycle series. It explores the singular creation of a new way of organizing society grounded in new modes of thought in business, religion, politics, and science. Our modern world emerged Enlightened as it could from this roiling cauldron of forces. In IT we may have had a Leonardo da Vinci or even a Galileo, but we’ve never had our Newton. Maybe we don't need a towering genius to make everything clear? For years startups, like the frenetically inventive
4 0.19865341 538 high scalability-2009-03-16-Are Cloud Based Memory Architectures the Next Big Thing?
Introduction: We are on the edge of two potent technological changes: Clouds and Memory Based Architectures. This evolution will rip open a chasm where new players can enter and prosper. Google is the master of disk. You can't beat them at a game they perfected. Disk based databases like SimpleDB and BigTable are complicated beasts, typical last gasp products of any aging technology before a change. The next era is the age of Memory and Cloud which will allow for new players to succeed. The tipping point will be soon. Let's take a short trip down web architecture lane: It's 1993: Yahoo runs on FreeBSD, Apache, Perl scripts and a SQL database It's 1995: Scale-up the database. It's 1998: LAMP It's 1999: Stateless + Load Balanced + Database + SAN It's 2001: In-memory data-grid. It's 2003: Add a caching layer. It's 2004: Add scale-out and partitioning. It's 2005: Add asynchronous job scheduling and maybe a distributed file system. It's 2007: Move it all into the cloud. It's 2008: C
Introduction: All in all this is still my favorite post and I still think it's an accurate vision of a future. Not everyone agrees, but I guess we'll see... "But it is not complicated. [There's] just a lot of it." -- Richard Feynman on how the immense variety of the world arises from simple rules. Contents: Have We Reached the End of Scaling?; Applications Become Black Boxes Using Markets to Scale and Control Costs; Let's Welcome our Neo-Feudal Overlords; The Economic Argument for the Ambient Cloud; What Will Kill the Cloud?; The Amazing Collective Compute Power of the Ambient Cloud; Using the Ambient Cloud as an Application Runtime; Applications as Virtual States; Conclusion. We have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications. Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but c
7 0.18729073 317 high scalability-2008-05-10-Hitting 300 SimbleDB Requests Per Second on a Small EC2 Instance
8 0.16948566 406 high scalability-2008-10-08-Strategy: Flickr - Do the Essential Work Up-front and Queue the Rest
9 0.16615252 96 high scalability-2007-09-18-Amazon Architecture
10 0.16356091 853 high scalability-2010-07-08-Cloud AWS Infrastructure vs. Physical Infrastructure
11 0.16349101 498 high scalability-2009-01-20-Product: Amazon's SimpleDB
12 0.15813409 1090 high scalability-2011-08-01-Peecho Architecture - scalability on a shoestring
13 0.15513475 1654 high scalability-2014-06-05-Cloud Architecture Revolution
14 0.14978918 1331 high scalability-2012-10-02-An Epic TripAdvisor Update: Why Not Run on the Cloud? The Grand Experiment.
15 0.14657319 184 high scalability-2007-12-13-Amazon SimpleDB - Scalable Cloud Database
16 0.14506565 1429 high scalability-2013-03-25-AppBackplane - A Framework for Supporting Multiple Application Architectures
17 0.14483927 1373 high scalability-2012-12-17-11 Uses For the Humble Presents Queue, er, Message Queue
18 0.14238583 761 high scalability-2010-01-17-Applications Become Black Boxes Using Markets to Scale and Control Costs
19 0.13651392 1112 high scalability-2011-09-07-What Google App Engine Price Changes Say About the Future of Web Architecture
topicId topicWeight
[(0, 0.291), (1, 0.113), (2, 0.03), (3, 0.071), (4, -0.067), (5, -0.068), (6, 0.072), (7, -0.083), (8, 0.0), (9, -0.047), (10, 0.028), (11, 0.052), (12, 0.02), (13, -0.085), (14, 0.02), (15, -0.055), (16, -0.044), (17, -0.023), (18, 0.04), (19, -0.013), (20, -0.02), (21, -0.047), (22, 0.018), (23, 0.02), (24, 0.009), (25, -0.064), (26, 0.046), (27, 0.111), (28, 0.087), (29, 0.048), (30, -0.015), (31, 0.031), (32, -0.01), (33, 0.004), (34, 0.059), (35, -0.033), (36, -0.025), (37, -0.023), (38, -0.025), (39, -0.053), (40, 0.053), (41, -0.014), (42, -0.038), (43, -0.003), (44, -0.028), (45, 0.022), (46, -0.014), (47, -0.047), (48, 0.003), (49, -0.048)]
simIndex simValue blogId blogTitle
same-blog 1 0.97713161 674 high scalability-2009-08-07-The Canonical Cloud Architecture
2 0.84889239 1090 high scalability-2011-08-01-Peecho Architecture - scalability on a shoestring
Introduction: This is a guest post by Marcel Panse and Sander Nagtegaal from Peecho. Although architecture descriptions are an interesting read, the problems that start-ups face are hardly ever addressed. We would like to change that, so here is our architecture story. Introducing a start-up: The Amsterdam-based company Peecho offers print-as-a-service. Our embeddable print button allows you to sell your digital content as professionally printed products, like photo books, magazines or canvases - straight from your own website. There is an API, too. Printcloud is the system that powers the print button. It exists in the cloud only, growing when needed and becoming smaller if it can. The system takes in print orders, magically transforms tough data into print-ready files and routes the orders to the production facility that is closest to the intended recipient. To preserve the environment, Peecho's philosophy is to facilitate global ordering, but to aim for local production on
3 0.81813127 38 high scalability-2007-07-30-Build an Infinitely Scalable Infrastructure for $100 Using Amazon Services
4 0.81136155 317 high scalability-2008-05-10-Hitting 300 SimbleDB Requests Per Second on a Small EC2 Instance
Introduction: High Performance Multithreaded Access to Amazon SimpleDB is a great follow-up to the idea in How SimpleDB Differs from a RDBMS that more programming is the price paid for performance in SimpleDB. It shows how much work and infrastructure is required to batter better performance out of SimpleDB. Remember, in SimpleDB you get keys to records from queries, so if you want to get all the fields for records you need to make separate requests. Since SimpleDB isn't exactly a speed demon, the obvious strategy is to parallelize. Even if a job takes 100 msecs you can get a lot done in a little time if you can execute enough jobs in parallel. Parallelization is the approach taken by Haakon@AWS in his Java code example of how to get the most out of SimpleDB. You can find the code at Indexing and Querying Amazon S3 Metadata with Amazon SimpleDB. We'll also consider how a back-end service architecture built on Erlang may be a better fit with cloud computing. Two general mechanisms
5 0.79855222 1452 high scalability-2013-05-06-7 Not So Sexy Tips for Saving Money On Amazon
Introduction: Harish Ganesan, CTO of 8KMiles, has a very helpful blog, Cloud, Big Data and Mobile, where he shows a nice analytical bent which leads to a lot of practical advice and cost saving tips: Use SQS Batch Requests to reduce the number of requests hitting SQS, which saves costs. Sending 10 messages in a single batch request saves $30/month in the example. Use SQS Long Polling to reduce extra polling requests, cutting down empty receives, which in the example saves ~$600 in empty receive leakage costs. Choose the right search technology to save costs in AWS by matching your activity pattern to the technology. For a small application with constant load, a heavily utilized search tier, or seasonal loads, Amazon Cloud Search looks like the cost efficient play. Use Amazon CloudFront Price Class to minimize costs by selecting the right Price Class for your audience to potentially reduce delivery costs by excluding Amazon CloudFront’s more expensive edge locatio
6 0.77702397 96 high scalability-2007-09-18-Amazon Architecture
7 0.76940244 798 high scalability-2010-03-22-7 Secrets to Successfully Scaling with Scalr (on Amazon) by Sebastian Stadil
9 0.75394082 1198 high scalability-2012-02-24-Stuff The Internet Says On Scalability For February 24, 2012
10 0.74804282 1482 high scalability-2013-06-26-Leveraging Cloud Computing at Yelp - 102 Million Monthly Vistors and 39 Million Reviews
11 0.74547994 964 high scalability-2010-12-28-Netflix: Continually Test by Failing Servers with Chaos Monkey
12 0.73554939 1113 high scalability-2011-09-09-Stuff The Internet Says On Scalability For September 9, 2011
13 0.73482037 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT
14 0.72420001 853 high scalability-2010-07-08-Cloud AWS Infrastructure vs. Physical Infrastructure
15 0.72282171 1654 high scalability-2014-06-05-Cloud Architecture Revolution
16 0.72214842 767 high scalability-2010-01-27-Hot Scalability Links for January 28 2010
17 0.72061175 953 high scalability-2010-12-03-GPU vs CPU Smackdown : The Rise of Throughput-Oriented Architectures
18 0.71868289 406 high scalability-2008-10-08-Strategy: Flickr - Do the Essential Work Up-front and Queue the Rest
19 0.7181071 1381 high scalability-2013-01-04-Stuff The Internet Says On Scalability For January 4, 2013
20 0.71708858 1133 high scalability-2011-10-27-Strategy: Survive a Comet Strike in the East With Reserved Instances in the West
topicId topicWeight
[(1, 0.141), (2, 0.211), (10, 0.069), (26, 0.011), (27, 0.011), (30, 0.032), (40, 0.014), (43, 0.011), (47, 0.011), (53, 0.074), (61, 0.079), (77, 0.027), (79, 0.112), (85, 0.062), (94, 0.058)]
simIndex simValue blogId blogTitle
1 0.97447443 1224 high scalability-2012-04-09-The Instagram Architecture Facebook Bought for a Cool Billion Dollars
Introduction: It's been a well kept secret, but you may have heard Facebook will Buy Photo-Sharing Service Instagram for $1 Billion. Just what is Facebook buying? Here's a quick gloss I did a little over a year ago on a presentation Instagram gave on their architecture. In that article I called Instagram's architecture the "canonical description of an early stage startup in this era." Little did we know how true that would turn out to be. If you want to learn how they did it then don't take a picture, just keep on reading... Instagram is a free photo sharing and social networking service for your iPhone that has been an instant success. Growing to 14 million users in just over a year (now 30 million users), they reached 150 million photos in August while amassing several terabytes of photos, and they did this with just 3 Instaneers, all on the Amazon stack. The Instagram team has written up what can be considered the canonical description of an early stage startup in this era: Wh
Introduction: Instagram is a free photo sharing and social networking service for your iPhone that has been an instant success. Growing to 14 million users in just over a year, they reached 150 million photos in August while amassing several terabytes of photos, and they did this with just 3 Instaneers, all on the Amazon stack. The Instagram team has written up what can be considered the canonical description of an early stage startup in this era: What Powers Instagram: Hundreds of Instances, Dozens of Technologies. Instagram uses a pastiche of different technologies and strategies. The team is small yet has experienced rapid growth riding the crest of a rising social and mobile wave, it uses a hybrid of SQL and NoSQL, it uses a ton of open source projects, they chose the cloud over colo, Amazon services are highly leveraged rather than building their own, reliability is through availability zones, async work scheduling links components together, the system is composed as much as possible
3 0.96791339 1522 high scalability-2013-09-25-Great Open Source Solution for Boring HA and Scalability Problems
Introduction: This is a guest post about how boring and repetitive HA and scalability problems can be solved via Open Source so you can focus on the interesting tasks. The post was written by Maarten Ectors, responsible for Cloud Strategy, and Frank Mueller, a Juju Core developer, at Ubuntu / Canonical. High-availability and scalability are exciting in general but there are certain problems that experts see over and over again. The list is long but examples are setting up MySQL clustering, sharding Mongo, adding data nodes to a Hadoop cluster, monitoring with Ganglia, building continuous deployment solutions, integrating Memcached / Varnish / Nginx,… Why are we reinventing the wheel? At Ubuntu we made it our goal to have the community solve these repetitive and often boring tasks. How often have you had to set up MySQL replication and scale it? What if the next time you just simply do: juju deploy mysql juju deploy mysql mysql-slave juju add-relation mysql:master mysql-slave:slav
4 0.9662295 345 high scalability-2008-06-11-Pyshards aspires to build sharding toolkit for Python
Introduction: I've been interested in sharding concepts since first hearing the term "shard" a few years back. My interest had been piqued earlier, the first time I read about Google's original approach to distributed search. It was described as a hashtable-like system in which independent physical machines play the role of the buckets. More recently, I needed the capacity and performance of a sharded system, but did not find helpful libraries or toolkits which would assist with the configuration for my language of preference these days, which is Python. And, since I had a few weeks on my hands, I decided I would begin the work of creating these tools. The result of my initial work is the Pyshards project, a still-incomplete Python and MySQL based horizontal partitioning and sharding toolkit. HighScalability.com readers will already know that horizontal partitioning is a data segmenting pattern in which distinct groups of physical row-based datasets are distributed across multiple partitions. Whe
same-blog 5 0.96101028 674 high scalability-2009-08-07-The Canonical Cloud Architecture
6 0.95691305 1600 high scalability-2014-02-21-Stuff The Internet Says On Scalability For February 21st, 2014
7 0.95605755 1389 high scalability-2013-01-18-Stuff The Internet Says On Scalability For January 18, 2013
8 0.95544791 72 high scalability-2007-08-22-Wikimedia architecture
9 0.95522231 1109 high scalability-2011-09-02-Stuff The Internet Says On Scalability For September 2, 2011
10 0.95452434 716 high scalability-2009-10-06-Building a Unique Data Warehouse
11 0.95437717 1431 high scalability-2013-03-29-Stuff The Internet Says On Scalability For March 29, 2013
12 0.95414376 498 high scalability-2009-01-20-Product: Amazon's SimpleDB
13 0.95390511 1637 high scalability-2014-04-25-Stuff The Internet Says On Scalability For April 25th, 2014
14 0.95220792 1302 high scalability-2012-08-10-Stuff The Internet Says On Scalability For August 10, 2012
15 0.95195371 1148 high scalability-2011-11-29-DataSift Architecture: Realtime Datamining at 120,000 Tweets Per Second
16 0.95171994 671 high scalability-2009-08-05-Stack Overflow Architecture
17 0.95117193 1186 high scalability-2012-02-02-The Data-Scope Project - 6PB storage, 500GBytes-sec sequential IO, 20M IOPS, 130TFlops
18 0.95094109 1382 high scalability-2013-01-07-Analyzing billions of credit card transactions and serving low-latency insights in the cloud
20 0.95024747 1602 high scalability-2014-02-26-The WhatsApp Architecture Facebook Bought For $19 Billion