high_scalability high_scalability-2010 high_scalability-2010-831 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Cloud computing promises a number of advantages for the deployment of data-intensive applications. Most prominently, these include reducing cost with a pay-as-you-go pricing model and (virtually) unlimited throughput by adding servers if the workload increases. At the Systems Group, ETH Zurich, we did an extensive end-to-end performance study to compare the major cloud offerings regarding their ability to fulfill these promises and their implied cost. The focus of the work is on transaction processing (i.e., read and update workloads), rather than analytics workloads. We used TPC-W, a standardized benchmark simulating a Web shop, as the baseline for our comparison. In TPC-W, users are simulated by emulated browsers (EBs) that issue page requests, called web-interactions (WIs), against the system. As a major modification to the benchmark, we continuously increase the load from 1 to 9000 simultaneous users to measure the scalability and cost variance of the syst
sentIndex sentText sentNum sentScore
1 Most prominently, these include reducing cost with a pay-as-you-go pricing model and (virtually) unlimited throughput by adding servers if the workload increases. [sent-2, score-0.245]
2 As a major modification to the benchmark, we continuously increase the load from 1 to 9000 simultaneous users to measure the scalability and cost variance of the system. [sent-9, score-0.19]
3 As a result, the cost and performance of the services vary significantly depending on the workload. [sent-15, score-0.297]
4 Furthermore, only two architectures, the one implemented on top of Amazon S3 and the one on MS Azure using SQL Azure as the database, were able to scale and sustain our maximum workload of 9000 EBs, resulting in over 1200 web-interactions per second (WIPS). [sent-17, score-0.34]
5 MySQL installed on EC2 and Amazon RDS are able to sustain a maximum load of approximately 3500 EBs. [sent-18, score-0.265]
6 Figure 2: Comparison of Architectures [WIPS]. Table 1 shows the total cost per web-interaction in milli-dollars for the alternative approaches under a varying load (EBs). [sent-29, score-0.59]
7 Google AE is cheapest for low workloads (below 100 EBs) whereas Azure is cheapest for medium to large workloads (more than 100 EBs). [sent-30, score-0.788]
8 The three MySQL variants (MySQL, MySQL/R, and RDS) have (almost) the same cost as Azure for medium workloads (EB=100 and EB=3000), but they are not able to sustain large workloads. [sent-31, score-1.012]
9 Azure and the MySQL variants win for medium and large workloads because all these approaches can amortize their fixed cost for these workloads. [sent-37, score-0.935]
10 Azure SQL server has a fixed cost per month of USD 100 for a database of up to 10 GB, independent of the number of requests that need to be processed by the database. [sent-38, score-0.414]
11 Likewise, RDS involves a fixed hourly fee, so the cost per WIPS decreases as the load increases. [sent-40, score-0.453]
12 Table 2 shows the total cost per day for the alternative approaches and a varying load (EBs). [sent-42, score-0.59]
13 (A "-" indicates that the variant was not able to sustain the load.) [sent-43, score-0.352]
14 These results confirm the observations made previously: Google wins for small workloads; Azure wins for medium and large workloads. [sent-44, score-0.315]
15 The three MySQL variants come close to Azure in the range of workloads that they sustain. [sent-46, score-0.431]
16 Azure and the three MySQL variants roughly share the same architectural principles (replication with master copy architectures). [sent-47, score-0.209]
17 For a large number of EBs, the high cost of SimpleDB is particularly annoying because users must pay even though SimpleDB drops many requests and is not able to sustain the workload. [sent-50, score-0.511]
18 Table 2: Total Cost per Day [$], Vary EB. Turning to the S3 cost in Table 2, the total cost grows linearly with the workload. [sent-51, score-0.533]
19 For S3, the high cost is matched by high throughputs so that the high cost for S3 at high workloads is tolerable. [sent-53, score-0.602]
20 In addition to the results presented here, the paper also compares the overload behavior and presents the different cost factors behind these numbers. [sent-59, score-0.227]
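The amortization argument in sentences 9 to 11 (a fixed fee is spread over more web-interactions as the load grows) can be made concrete with a small calculation. The following is a minimal sketch, not the method used in the paper: it assumes a 30-day billing month and a constant throughput over the whole month, and the function name and the sample WIPS values other than 1200 are hypothetical. Only the USD 100 monthly fee and the figure of roughly 1200 WIPS come from the text above.

```python
# Assumption for illustration: a 30-day billing month.
SECONDS_PER_MONTH = 30 * 24 * 3600


def fixed_cost_per_wi_milli_usd(monthly_fee_usd: float, wips: float) -> float:
    """Share of a fixed monthly fee attributed to one web-interaction.

    The result is in milli-dollars (1/1000 USD). It covers only the
    fixed fee, not any per-request or traffic charges.
    """
    if wips <= 0:
        raise ValueError("wips must be positive")
    # Assumption: the throughput is sustained for the whole month.
    interactions_per_month = wips * SECONDS_PER_MONTH
    return monthly_fee_usd / interactions_per_month * 1000.0


if __name__ == "__main__":
    # USD 100 per month is the fixed database fee quoted in the text;
    # the WIPS values below 1200 are hypothetical sample loads.
    for wips in (1, 10, 100, 1200):
        share = fixed_cost_per_wi_milli_usd(100.0, wips)
        print(f"{wips:>5} WIPS -> {share:.6f} milli-dollars per WI")
```

Under these assumptions the fixed-fee share per web-interaction falls in inverse proportion to the sustained throughput, which is the reason a flat fee favors medium and large workloads.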
wordName wordTfidf (topN-words)
[('ebs', 0.325), ('wips', 0.304), ('azure', 0.234), ('simpledb', 0.228), ('workloads', 0.222), ('variants', 0.209), ('sustain', 0.206), ('cost', 0.19), ('ae', 0.182), ('eb', 0.165), ('medium', 0.126), ('wi', 0.122), ('mysql', 0.12), ('vary', 0.107), ('emulated', 0.104), ('fee', 0.095), ('table', 0.095), ('approaches', 0.095), ('google', 0.093), ('fixed', 0.093), ('rds', 0.089), ('variant', 0.087), ('transaction', 0.086), ('quota', 0.086), ('presented', 0.084), ('total', 0.078), ('combinations', 0.078), ('cheapest', 0.077), ('shows', 0.077), ('benchmark', 0.076), ('per', 0.075), ('varying', 0.075), ('appengine', 0.075), ('architectures', 0.074), ('browsers', 0.073), ('amazon', 0.071), ('promises', 0.069), ('whereas', 0.064), ('results', 0.063), ('figure', 0.063), ('wins', 0.063), ('offered', 0.062), ('policy', 0.059), ('able', 0.059), ('behavior', 0.059), ('guarantees', 0.058), ('requests', 0.056), ('pricing', 0.055), ('eth', 0.055), ('zurich', 0.055)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000002 831 high scalability-2010-05-26-End-To-End Performance Study of Cloud Services
2 0.24393182 1631 high scalability-2014-04-14-How do you even do anything without using EBS?
Introduction: In a recent thread on Hacker News discussing recent AWS price changes , seldo mentioned they use AWS for business, they just never use EBS on AWS. A good question was asked: How do you even do anything without using EBS? Amazon certainly makes using EBS the easiest path. And EBS has a better reliability record as of late, but it's still often recommended to not use EBS. This avoids a single point of failure at the cost of a lot of complexity, though as AWS uses EBS internally, not using EBS may not save you if you use other AWS services like RDS or ELB. If you don't want to use EBS, it's hard to know where to even start. A dilemma to which Kevin Nuckolls gives a great answer : Well, you break your services out onto stateless and stateful machines. After that, you make sure that each of your stateful services is resilient to individual node failure. I prefer to believe that if you can't roll your entire infrastructure over to new nodes monthly then you're unprepared fo
3 0.21042244 498 high scalability-2009-01-20-Product: Amazon's SimpleDB
Introduction: Update 35 : How and Why Glue is Using Amazon SimpleDB instead of a Relational Database . Discusses a key design decision that required duplicating data in order to mimic RDBMS joins: Given the trade off between potential inconsistencies and scalability, social services have to choose the latter. Update 34 : Apparently Amazon pulled this article. I'm not sure what that means. Maybe time went backwards or something? Amazon dramatically drops SimpleDB pricing to $0.25 per GB per month from $1.50 per GB . This puts SimpleDB on par with Google App Engine . They also announced a few new features: a SQL-like SELECT API as well as a Batch Put operation to streamline uploading of multiple items or attributes . One of the complaints against SimpleDB is that programmers end up writing too much code to do simple things. These features and a much cheaper price should help considerably. And you can store lots of data now. GAE is still capped. Update 33 : Amazon announces
4 0.18504642 1121 high scalability-2011-09-21-5 Scalability Poisons and 3 Cloud Scalability Antidotes
Introduction: Sean Hull with two helpful posts: 5 Things That are Toxic to Scalability: Object Relational Mappers. Create complex queries that are hard to optimize and tweak. Synchronous, Serial, Coupled or Locking Processes. Locks are like stop signs, traffic circles keep the traffic flowing. Row level locking is better than table level locking. Use async replication. Use eventual consistency for clusters. One Copy of Your Database. A single database server is a choke point. Create parallel databases and let a driver select between them. Having No Metrics. Visualize what's happening to your system using one of the many monitoring packages. Lack of Feature Flags. Be able to turn off features via a flag so when a spike hits features can be turned off to reduce load. 3 Ways to Boost Cloud Scalability: Use Auto-scaling. Spin up new instances when a threshold is passed and back down again when traffic drops. Horizontally Scale the Database Tier. MySQL in a master
5 0.18488854 881 high scalability-2010-08-16-Scaling an AWS infrastructure - Tools and Patterns
Introduction: This is a guest post by Frédéric Faure (architect at Ysance ), you can follow him on twitter . How do you scale an AWS (Amazon Web Services) infrastructure? This article will give you a detailed reply in two parts: the tools you can use to make the most of Amazon’s dynamic approach, and the architectural model you should adopt for a scalable infrastructure. I base my report on my experience gained in several AWS production projects in casual gaming (Facebook), e-commerce infrastructures and within the mainstream GIS (Geographic Information System). It’s true that my experience in gaming ( IsCool, The Game ) is currently the most representative in terms of scalability, due to the number of users (over 800 thousand DAU – daily active users – at peak usage and over 20 million page views every day), however my experiences in e-commerce and GIS (currently underway) provide a different view of scalability, taking into account the various problems of availability and da
6 0.16164914 306 high scalability-2008-04-21-The Search for the Source of Data - How SimpleDB Differs from a RDBMS
7 0.1471514 925 high scalability-2010-10-22-Paper: Netflix’s Transition to High-Availability Storage Systems
9 0.14622879 853 high scalability-2010-07-08-Cloud AWS Infrastructure vs. Physical Infrastructure
10 0.1462149 1348 high scalability-2012-10-26-Stuff The Internet Says On Scalability For October 26, 2012
11 0.14084226 619 high scalability-2009-06-05-HotPads Shows the True Cost of Hosting on Amazon
13 0.13651735 184 high scalability-2007-12-13-Amazon SimpleDB - Scalable Cloud Database
14 0.12303169 1065 high scalability-2011-06-21-Running TPC-C on MySQL-RDS
15 0.1198224 1029 high scalability-2011-04-25-The Big List of Articles on the Amazon Outage
16 0.11738774 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters
17 0.11600111 974 high scalability-2011-01-18-Paper: Relational Cloud: A Database-as-a-Service for the Cloud
18 0.11416349 1331 high scalability-2012-10-02-An Epic TripAdvisor Update: Why Not Run on the Cloud? The Grand Experiment.
19 0.11127427 1098 high scalability-2011-08-15-Should any cloud be considered one availability zone? The Amazon experience says yes.
20 0.10706455 1353 high scalability-2012-11-01-Cost Analysis: TripAdvisor and Pinterest costs on the AWS cloud
topicId topicWeight
[(0, 0.169), (1, 0.091), (2, -0.017), (3, 0.046), (4, -0.072), (5, 0.037), (6, -0.024), (7, -0.128), (8, 0.06), (9, -0.136), (10, 0.021), (11, -0.1), (12, 0.053), (13, -0.027), (14, -0.026), (15, 0.008), (16, -0.056), (17, -0.009), (18, 0.069), (19, -0.03), (20, 0.102), (21, 0.025), (22, -0.016), (23, -0.037), (24, 0.004), (25, -0.038), (26, -0.031), (27, -0.008), (28, 0.109), (29, -0.042), (30, -0.004), (31, -0.029), (32, 0.01), (33, 0.03), (34, -0.014), (35, -0.01), (36, 0.066), (37, 0.009), (38, 0.009), (39, -0.023), (40, 0.059), (41, -0.005), (42, 0.002), (43, -0.06), (44, 0.005), (45, -0.03), (46, -0.006), (47, -0.009), (48, 0.095), (49, 0.04)]
simIndex simValue blogId blogTitle
same-blog 1 0.96599412 831 high scalability-2010-05-26-End-To-End Performance Study of Cloud Services
Introduction: Cloud computing promises a number of advantages for the deployment of data-intensive applications. Most prominently, these include reducing cost with a pay-as-you-go pricing model and (virtually) unlimited throughput by adding servers if the workload increases. At the Systems Group, ETH Zurich, we did an extensive end-to-end performance study to compare the major cloud offerings regarding their ability to fulfill these promises and their implied cost. The focus of the work is on transaction processing (i.e., read and update workloads), rather than analytics workloads. We used TPC-W, a standardized benchmark simulating a Web shop, as the baseline for our comparison. In TPC-W, users are simulated by emulated browsers (EBs) that issue page requests, called web-interactions (WIs), against the system. As a major modification to the benchmark, we continuously increase the load from 1 to 9000 simultaneous users to measure the scalability and cost variance of the syst
Introduction: Amazon created a whole new class of service with their Provisioned IOPS for RDS, EBS, and DynamoDB. The idea is simple. If you want more performance, you turn a dial up. If you want less, you turn a dial down. A beautifully simple model. You pay for the performance you want, which is different than their previous cloud model, where performance varied, but you paid only for what you used. The question: Do these higher priced services really work better? Rodrigo Campos put this question to the test (only for EBS) by running a benchmark he describes in IOMelt Provisioned IOPS EBS Benchmark Results - December 2012 . The result? Yes, AWS Provisioned IOPS Volumes Really Deliver More Consistent and Higher Performance IO : It is clear that the provisioned IOPS EBS volumes offer a huge performance upgrade when compared to the non-optimized EBS volumes, but as data has to be spread among more underlying disks or systems, it seems that the volume is increasingly more susceptibl
3 0.66486502 498 high scalability-2009-01-20-Product: Amazon's SimpleDB
Introduction: Update 35 : How and Why Glue is Using Amazon SimpleDB instead of a Relational Database . Discusses a key design decision that required duplicating data in order to mimic RDBMS joins: Given the trade off between potential inconsistencies and scalability, social services have to choose the latter. Update 34 : Apparently Amazon pulled this article. I'm not sure what that means. Maybe time went backwards or something? Amazon dramatically drops SimpleDB pricing to $0.25 per GB per month from $1.50 per GB . This puts SimpleDB on par with Google App Engine . They also announced a few new features: a SQL-like SELECT API as well as a Batch Put operation to streamline uploading of multiple items or attributes . One of the complaints against SimpleDB is that programmers end up writing too much code to do simple things. These features and a much cheaper price should help considerably. And you can store lots of data now. GAE is still capped. Update 33 : Amazon announces
4 0.6428985 1121 high scalability-2011-09-21-5 Scalability Poisons and 3 Cloud Scalability Antidotes
Introduction: Sean Hull with two helpful posts: 5 Things That are Toxic to Scalability: Object Relational Mappers. Create complex queries that are hard to optimize and tweak. Synchronous, Serial, Coupled or Locking Processes. Locks are like stop signs, traffic circles keep the traffic flowing. Row level locking is better than table level locking. Use async replication. Use eventual consistency for clusters. One Copy of Your Database. A single database server is a choke point. Create parallel databases and let a driver select between them. Having No Metrics. Visualize what's happening to your system using one of the many monitoring packages. Lack of Feature Flags. Be able to turn off features via a flag so when a spike hits features can be turned off to reduce load. 3 Ways to Boost Cloud Scalability: Use Auto-scaling. Spin up new instances when a threshold is passed and back down again when traffic drops. Horizontally Scale the Database Tier. MySQL in a master
5 0.63266742 1543 high scalability-2013-11-05-10 Things You Should Know About AWS
Introduction: Authored by Chris Fregly : Former Netflix Streaming Platform Engineer, AWS Certified Solution Architect and Purveyor of fluxcapacitor.com. Ahead of the upcoming 2nd annual re:Invent conference, inspired by Simone Brunozzi’s recent presentation at an AWS Meetup in San Francisco, and collected from a few of my recent Fluxcapacitor.com consulting engagements, I’ve compiled a list of 10 useful time and clock-tick saving tips about AWS. 1) Query AWS resource metadata Can’t remember the EBS-Optimized IO throughput of your c1.xlarge cluster? How about the size limit of an S3 object on a single PUT? awsnow.info is the answer to all of your AWS-resource metadata questions. Interested in integrating awsnow.info with your application? You’re in luck. There’s now a REST API , as well! Note: These are default soft limits and will vary by account. 2) Tame your S3 buckets Delete an entire S3 bucket with a single CLI command:
6 0.62725407 1348 high scalability-2012-10-26-Stuff The Internet Says On Scalability For October 26, 2012
7 0.61631435 1347 high scalability-2012-10-25-Not All Regions are Created Equal - South America Es Bueno
8 0.60644335 1343 high scalability-2012-10-18-Save up to 30% by Selecting Better Performing Amazon Instances
9 0.60535276 1631 high scalability-2014-04-14-How do you even do anything without using EBS?
10 0.58432841 1452 high scalability-2013-05-06-7 Not So Sexy Tips for Saving Money On Amazon
11 0.58118492 798 high scalability-2010-03-22-7 Secrets to Successfully Scaling with Scalr (on Amazon) by Sebastian Stadil
12 0.58055782 1278 high scalability-2012-07-06-Stuff The Internet Says On Scalability For July 6, 2012
13 0.57787132 1557 high scalability-2013-12-02-Evolution of Bazaarvoice’s Architecture to 500M Unique Users Per Month
14 0.57554287 1098 high scalability-2011-08-15-Should any cloud be considered one availability zone? The Amazon experience says yes.
15 0.57369488 487 high scalability-2009-01-08-Paper: Sharding with Oracle Database
16 0.57316536 1448 high scalability-2013-04-29-AWS v GCE Face-off and Why Innovation Needs Lower Cost Infrastructures
17 0.57287735 925 high scalability-2010-10-22-Paper: Netflix’s Transition to High-Availability Storage Systems
18 0.5711841 619 high scalability-2009-06-05-HotPads Shows the True Cost of Hosting on Amazon
19 0.56930125 336 high scalability-2008-05-31-Biggest Under Reported Story: Google's BigTable Costs 10 Times Less than Amazon's SimpleDB
20 0.56911737 853 high scalability-2010-07-08-Cloud AWS Infrastructure vs. Physical Infrastructure
topicId topicWeight
[(1, 0.076), (2, 0.153), (10, 0.078), (30, 0.342), (56, 0.022), (61, 0.072), (73, 0.011), (79, 0.096), (85, 0.012), (94, 0.038)]
simIndex simValue blogId blogTitle
1 0.96206987 1459 high scalability-2013-05-16-Paper: Warp: Multi-Key Transactions for Key-Value Stores
Introduction: Looks like an interesting take on "a completely asynchronous, low-latency transaction management protocol, in line with the fully distributed NoSQL architecture." Warp: Multi-Key Transactions for Key-Value Stores  overview: Implementing ACID transactions has been a longstanding challenge for NoSQL systems. Because these systems are based on a sharded architecture, transactions necessarily require coordination across multiple servers. Past work in this space has relied either on heavyweight protocols such as Paxos or clock synchronization for this coordination. This paper presents a novel protocol for coordinating distributed transactions with ACID semantics on top of a sharded data store. Called linear transactions, this protocol achieves scalability by distributing the coordination task to only those servers that hold relevant data for each transaction. It achieves high performance by serializing only those transactions whose concurrent execution could potentially yield a vio
2 0.95737267 991 high scalability-2011-02-16-Paper: An Experimental Investigation of the Akamai Adaptive Video Streaming
Introduction: Video is hot on the Internet and people are really interested in knowing how to make it work. Dan Rayburn has a post pointing to a fascinating paper: An Experimental Investigation of the Akamai Adaptive Video Streaming, which talks in some detail about the protocols big players like YouTube, Skype and Akamai use to serve video over an inherently video-unfriendly medium like the Internet. For Akamai they found: Each video is encoded in five versions at different bit rates and stored in separate files. The client sends commands to the server with an average inter-departure time of about 2 s, i.e. the control algorithm is executed on average every 2 seconds. Akamai uses only the video level to adapt the video source to the available bandwidth, whereas the frame rate of the video is kept constant. When a sudden drop in the available bandwidth occurs, short interruptions of the video playback can occur due to a large actuation delay. For a sudden increase of the avai
3 0.95359343 131 high scalability-2007-10-25-Should JSPs be avoided for high scalability?
Introduction: I just heard about some web sites where Velocity templates are used to render HTML instead of JSPs, and all the processing is performed in servlets. Can JSPs cause issues with scalability? Thanks, Unmesh
4 0.94104207 1016 high scalability-2011-04-04-Scaling Social Ecommerce Architecture Case study
Introduction: A recent study showed that over 92 percent of executives from leading retailers are focusing their marketing efforts on Facebook and subsequent applications. Furthermore, over 71 percent of users have confirmed they are more likely to make a purchase after “liking” a brand they find online. (source) Sears Architect Tomer Gabel provides an insightful overview on how they built a Social Ecommerce solution for Sears.com that can handle complex relationship queries in real time. Tomer goes through: the architectural considerations behind their solution why they chose memory over disk how they partitioned the data to gain scalability why they chose to execute code with the data using GigaSpaces Map/Reduce execution framework how they integrated with Facebook why they chose GigaSpaces over Coherence and Terracotta for in-memory caching and scale In this post I tried to summarize the main takeaway from the interview. You can also watch the full interview (highly reco
5 0.89797211 16 high scalability-2007-07-16-Book: High Performance MySQL
Introduction: As users come to depend on MySQL, they find that they have to deal with issues of reliability, scalability, and performance--issues that are not well documented but are critical to a smoothly functioning site. This book is an insider's guide to these little understood topics. Author Jeremy Zawodny has managed large numbers of MySQL servers for mission-critical work at Yahoo!, maintained years of contacts with the MySQL AB team, and presents regularly at conferences. Jeremy and Derek have spent months experimenting, interviewing major users of MySQL, talking to MySQL AB, benchmarking, and writing some of their own tools in order to produce the information in this book. In High Performance MySQL you will learn about MySQL indexing and optimization in depth so you can make better use of these key features. You will learn practical replication, backup, and load-balancing strategies with information that goes beyond available tools to discuss their effects in real-life environments. And you
same-blog 6 0.89702708 831 high scalability-2010-05-26-End-To-End Performance Study of Cloud Services
7 0.87691575 261 high scalability-2008-02-25-Make Your Site Run 10 Times Faster
8 0.87396109 500 high scalability-2009-01-22-Heterogeneous vs. Homogeneous System Architectures
9 0.86809278 308 high scalability-2008-04-22-Simple NFS failover solution with symbolic link?
11 0.83303344 14 high scalability-2007-07-15-Web Analytics: An Hour a Day
12 0.83079517 783 high scalability-2010-02-24-Hot Scalability Links for February 24, 2010
13 0.82562119 182 high scalability-2007-12-12-Oracle Can Do Read-Write Splitting Too
14 0.81669867 263 high scalability-2008-02-27-Product: System Imager - Automate Deployment and Installs
15 0.79193389 917 high scalability-2010-10-08-4 Scalability Themes from Surgecon
16 0.7792418 1284 high scalability-2012-07-16-Cinchcast Architecture - Producing 1,500 Hours of Audio Every Day
17 0.7569117 334 high scalability-2008-05-29-Amazon Improves Diagonal Scaling Support with High-CPU Instances
18 0.74742466 464 high scalability-2008-12-13-Strategy: Facebook Tweaks to Handle 6 Time as Many Memcached Requests
19 0.74686939 291 high scalability-2008-03-29-20 New Rules for Faster Web Pages
20 0.74611229 312 high scalability-2008-04-30-Rather small site architecture.