high_scalability high_scalability-2010 high_scalability-2010-767 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Google's Research Areas of Interest: Building scalable, robust cluster applications. At Google we see distributed systems as a technology in its infancy, with huge gaps in the supporting research that represent some of the most important problems in the space. Here are some examples: Resource sharing, Balancing cost, performance, and reliability, Self-maintaining systems. Amazon SimpleDB: A Simple Way to Store Complex Data by Paul Tremblett. The most effective way I have found to understand SimpleDB is to think about it in terms of something else we all use and understand -- a spreadsheet. Rackspace Cloud Servers versus Amazon EC2: Performance Analysis. The Bitsource conducted a review of the two cloud computing platforms, Rackspace Cloud Servers and Amazon Elastic Compute Cloud (EC2), to get a general idea of overall system performance. Private Clouds Are Not The Future by James Hamilton. Private clouds are better than nothing but an investment in
sentIndex sentText sentNum sentScore
1 Google's Research Areas of Interest: Building scalable, robust cluster applications. [sent-1, score-0.082]
2 At Google we see distributed systems as a technology in its infancy, with huge gaps in the supporting research that represent some of the most important problems in the space. [sent-2, score-0.25]
3 The most effective way I have found to understand SimpleDB is to think about it in terms of something else we all use and understand -- a spreadsheet. [sent-5, score-0.399]
4 Rackspace Cloud Servers versus Amazon EC2: Performance Analysis. [sent-6, score-0.093]
5 The Bitsource conducted a review of the two cloud computing platforms, Rackspace Cloud Servers and Amazon Elastic Compute Cloud (EC2), to get a general idea of overall system performance. [sent-7, score-0.386]
6 Private clouds are better than nothing but an investment in a private cloud is an investment in a temporary fix that will only slow the path to the final destination: shared clouds. [sent-9, score-1.122]
7 What is the right way to measure scale? [sent-10, score-0.083]
8 Is using the number of nodes a better proxy than size of data? [sent-13, score-0.291]
9 It becomes much more complicated to handle fail-over between multiple data centers. [sent-18, score-0.08]
10 As an example, if data center 1 fails entirely, we need to ensure that VIPs are routed to the correct data center or that DNS is changed. [sent-19, score-0.519]
11 I was recently tasked with fork-lifting ~1 billion rows from Oracle into SimpleDB. [sent-21, score-0.224]
wordName wordTfidf (topN-words)
[('simpledb', 0.219), ('investment', 0.201), ('multihoming', 0.185), ('infancy', 0.185), ('jame', 0.185), ('throughputby', 0.185), ('futureby', 0.174), ('greenplum', 0.174), ('vips', 0.174), ('clouds', 0.17), ('siddharth', 0.155), ('cloud', 0.151), ('research', 0.144), ('gaps', 0.141), ('royans', 0.138), ('tasked', 0.138), ('conducted', 0.134), ('jan', 0.134), ('destination', 0.126), ('andrew', 0.125), ('amazon', 0.12), ('understand', 0.119), ('daniel', 0.119), ('temporary', 0.116), ('center', 0.113), ('routed', 0.109), ('represent', 0.109), ('final', 0.107), ('achieving', 0.106), ('nodes', 0.103), ('review', 0.101), ('better', 0.098), ('entirely', 0.095), ('rackspace', 0.094), ('versus', 0.093), ('fails', 0.093), ('correct', 0.091), ('proxy', 0.09), ('steps', 0.089), ('areas', 0.088), ('rows', 0.086), ('way', 0.083), ('robust', 0.082), ('handle', 0.08), ('elastic', 0.08), ('interest', 0.079), ('google', 0.078), ('private', 0.078), ('scale', 0.078), ('effective', 0.078)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999982 767 high scalability-2010-01-27-Hot Scalability Links for January 28 2010
2 0.19261542 184 high scalability-2007-12-13-Amazon SimpleDB - Scalable Cloud Database
Introduction: Amazon has announced the limited beta of Amazon SimpleDB - a simple web services interface to create and store multiple data sets, query your data easily, and return the results. Together with the Simple Storage Service (S3), Elastic Compute Cloud (EC2) and other web services Amazon offers a complete utility computing platform. SimpleDB was the missing piece of AWS - the scalable structured database. Check out my blog entry: http://innowave.blogspot.com/2007/12/amazon-simpledb-scalable-cloud-database.html I was waiting for this one :-) Geekr
3 0.19244137 498 high scalability-2009-01-20-Product: Amazon's SimpleDB
Introduction: Update 35 : How and Why Glue is Using Amazon SimpleDB instead of a Relational Database . Discusses a key design decision that required duplicating data in order to mimic RDBMS joins: Given the trade off between potential inconsistencies and scalability, social services have to choose the latter. Update 34 : Apparently Amazon pulled this article. I'm not sure what that means. Maybe time went backwards or something? Amazon dramatically drops SimpleDB pricing to $0.25 per GB per month from $1.50 per GB . This puts SimpleDB on par with Google App Engine . They also announced a few new features: a SQL-like SELECT API as well as a Batch Put operation to streamline uploading of multiple items or attributes . One of the complaints against SimpleDB is that programmers end up writing too much code to do simple things. These features and a much cheaper price should help considerably. And you can store lots of data now. GAE is still capped. Update 33 : Amazon announces
4 0.16962376 925 high scalability-2010-10-22-Paper: Netflix’s Transition to High-Availability Storage Systems
Introduction: In an audacious move for such an established property, Netflix is moving their website out of the comfort of their own datacenter and into the wilds of the Amazon cloud. This paper by Netflix's Siddharth “Sid” Anand, Netflix’s Transition to High-Availability Storage Systems , gives a detailed look at this transition and does a deep dive on SimpleDB best practices, focussing especially on techniques useful to those who are making the move from a RDBMS. Sid is going to give a talk at QCon based on this paper and he would appreciate your feedback. So if you have any comments or thoughts please comment here or email Sid at r39132@hotmail.com or Twitter at @r39132 Here's the introduction from the paper: Circa late 2008, Netflix had a single data center. This single data center raised a few concerns. As a single-point-of-failure (a.k.a. SPOF), it represented a liability – data center outages meant interruptions to service and negative customer impact. Additionally, with growth in both
5 0.14132953 538 high scalability-2009-03-16-Are Cloud Based Memory Architectures the Next Big Thing?
Introduction: We are on the edge of two potent technological changes: Clouds and Memory Based Architectures. This evolution will rip open a chasm where new players can enter and prosper. Google is the master of disk. You can't beat them at a game they perfected. Disk based databases like SimpleDB and BigTable are complicated beasts, typical last gasp products of any aging technology before a change. The next era is the age of Memory and Cloud which will allow for new players to succeed. The tipping point will be soon. Let's take a short trip down web architecture lane: It's 1993: Yahoo runs on FreeBSD, Apache, Perl scripts and a SQL database It's 1995: Scale-up the database. It's 1998: LAMP It's 1999: Stateless + Load Balanced + Database + SAN It's 2001: In-memory data-grid. It's 2003: Add a caching layer. It's 2004: Add scale-out and partitioning. It's 2005: Add asynchronous job scheduling and maybe a distributed file system. It's 2007: Move it all into the cloud. It's 2008: C
6 0.14002061 306 high scalability-2008-04-21-The Search for the Source of Data - How SimpleDB Differs from a RDBMS
9 0.12393294 743 high scalability-2009-11-23-Big Data on Grids or on Clouds?
10 0.12103818 445 high scalability-2008-11-14-Useful Cloud Computing Blogs
11 0.11883536 674 high scalability-2009-08-07-The Canonical Cloud Architecture
12 0.11351198 888 high scalability-2010-08-27-OpenStack - The Answer to: How do We Compete with Amazon?
13 0.11079526 187 high scalability-2007-12-14-The Current Pros and Cons List for SimpleDB
14 0.10926154 38 high scalability-2007-07-30-Build an Infinitely Scalable Infrastructure for $100 Using Amazon Services
15 0.10786492 40 high scalability-2007-07-30-Product: Amazon Elastic Compute Cloud
16 0.10714801 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT
17 0.1059695 688 high scalability-2009-08-26-Hot Links for 2009-8-26
18 0.10513479 589 high scalability-2009-05-05-Drop ACID and Think About Data
19 0.10388143 336 high scalability-2008-05-31-Biggest Under Reported Story: Google's BigTable Costs 10 Times Less than Amazon's SimpleDB
20 0.10381565 96 high scalability-2007-09-18-Amazon Architecture
topicId topicWeight
[(0, 0.186), (1, 0.064), (2, 0.069), (3, 0.096), (4, -0.093), (5, -0.001), (6, 0.013), (7, -0.128), (8, 0.048), (9, 0.013), (10, 0.006), (11, -0.002), (12, 0.009), (13, -0.017), (14, 0.028), (15, -0.032), (16, -0.053), (17, -0.026), (18, 0.023), (19, -0.001), (20, 0.009), (21, 0.044), (22, 0.044), (23, 0.018), (24, 0.002), (25, -0.021), (26, 0.028), (27, 0.081), (28, 0.058), (29, 0.056), (30, 0.053), (31, -0.023), (32, -0.078), (33, -0.031), (34, 0.076), (35, 0.08), (36, 0.02), (37, 0.066), (38, -0.009), (39, -0.011), (40, -0.022), (41, -0.01), (42, 0.0), (43, -0.049), (44, -0.007), (45, 0.022), (46, -0.015), (47, -0.052), (48, 0.042), (49, 0.021)]
simIndex simValue blogId blogTitle
same-blog 1 0.95752275 767 high scalability-2010-01-27-Hot Scalability Links for January 28 2010
2 0.81143069 610 high scalability-2009-05-29-Is Eucalyptus ready to be your private cloud?
Introduction: Update: Eucalyptus Goes Commercial with $5.5M Funding Round. This removes my objection that it's an academic project only. Go team go! Rich Wolski, professor of Computer Science at the University of California, Santa Barbara, gave a spirited talk on Eucalyptus to a large group of very interested cloudsters at the Eucalyptus Cloud Meetup. If Rich could teach computer science at every school, the state of the computer science industry would be stratospheric. Rich is dynamic, smart, passionate, and visionary. It's that vision that prompted him to create Eucalyptus in the first place. Rich and his group are experts in grid and distributed computing, having a long and glorious history in that space. When he saw cloud computing on the rise, he decided the best way to explore it was to implement what everyone accepted as a real cloud, Amazon's API. In a remarkably short time they implemented Eucalyptus and have been improving it and tracking Amazon's changes ever since. The question
3 0.78640926 688 high scalability-2009-08-26-Hot Links for 2009-8-26
Introduction: I'm Going To Scale My Foot Up Your Ass - Shut up about scalability, no one is using your app anyway. Multi-Tenant Data Architecture - Microsoft's take on different approaches to multitenancy. Cloud computing rides on spiraling Energy costs - A report by US researchers has shown the increasing cost of power and cooling in the data centre is a driver towards cloud computing. Interview: Apple’s Gigantic New Data Center Hints at Cloud Computing - Companies building centers this big are getting into cloud computing. Running apps in the cloud requires massive infrastructure: Google-size infrastructure. What Does Cloud Computing Actually Cost? An Analysis of the Top Vendors - Amazon is currently the lowest cost cloud computing option overall. At least for production applications that need more than 6.5 hours of CPU/day, otherwise GAE is technically cheaper because it's free until this usage level. no:sql(east) - October 28–30, 2009, Atlanta, GA. Very cute pa
4 0.78309911 480 high scalability-2008-12-30-Scalability Perspectives #5: Werner Vogels – The Amazon Technology Platform
Introduction: Scalability Perspectives is a series of posts that highlights the ideas that will shape the next decade of IT architecture. Each post is dedicated to a thought leader of the information age and his vision of the future. Be warned though – the journey into the minds and perspectives of these people requires an open mind. Werner Vogels Dr. Werner Vogels is Vice President & Chief Technology Officer at Amazon.com where he is responsible for driving the company’s technology vision, which is to continuously enhance the innovation on behalf of Amazon’s customers at a global scale. Prior to joining Amazon, he worked as a researcher at Cornell University where he was a principal investigator in several research projects that target the scalability and robustness of mission-critical enterprise computing systems. He is regarded as one of the world's top experts on ultra-scalable systems and he uses his weblog to educate the community about issues such as eventual consistency. Information
5 0.77199578 888 high scalability-2010-08-27-OpenStack - The Answer to: How do We Compete with Amazon?
Introduction: The Silicon Valley Cloud Computing Group had a meetup Wednesday on OpenStack , whose tag line is the open source, open standards cloud . I was shocked at the large turnout. 287 people registered and it looked like a large percentage of them actually showed up. I wonder, was it the gourmet pizza, the free t-shirts, or are people really that interested in OpenStack? And if they are really interested, why are they that interested? On the surface an open cloud doesn't seem all that sexy a topic, but with contributions from NASA, from Rackspace, and from a very avid user community, a lot of interest there seems to be. The brief intro blurb to OpenStack is: OpenStack allows any organization to create and offer cloud computing capabilities using open source software running on standard hardware. OpenStack Compute is software for automatically creating and managing large groups of virtual private servers. OpenStack Storage is software for creating redundant, scalable object storage
6 0.76491582 184 high scalability-2007-12-13-Amazon SimpleDB - Scalable Cloud Database
7 0.75298285 355 high scalability-2008-07-21-Eucalyptus - Build Your Own Private EC2 Cloud
8 0.74755114 40 high scalability-2007-07-30-Product: Amazon Elastic Compute Cloud
9 0.73502451 1575 high scalability-2014-01-08-Under Snowden's Light Software Architecture Choices Become Murky
10 0.72558415 498 high scalability-2009-01-20-Product: Amazon's SimpleDB
11 0.71985704 803 high scalability-2010-04-05-Intercloud: How Will We Scale Across Multiple Clouds?
12 0.71625406 674 high scalability-2009-08-07-The Canonical Cloud Architecture
13 0.7072807 325 high scalability-2008-05-25-How do you explain cloud computing to your grandma?
14 0.69689757 760 high scalability-2010-01-13-10 Hot Scalability Links for January 13, 2010
15 0.68232113 536 high scalability-2009-03-12-Product: Building Next-generation Collaborative cloud-ready applications with Optimus Cloud™
16 0.67992711 743 high scalability-2009-11-23-Big Data on Grids or on Clouds?
18 0.67641658 918 high scalability-2010-10-12-The CIO’s Problem: Cloud “Mess” or Cloud “Mash”
19 0.66808355 798 high scalability-2010-03-22-7 Secrets to Successfully Scaling with Scalr (on Amazon) by Sebastian Stadil
20 0.66678804 38 high scalability-2007-07-30-Build an Infinitely Scalable Infrastructure for $100 Using Amazon Services
topicId topicWeight
[(1, 0.082), (2, 0.136), (10, 0.565), (79, 0.126)]
simIndex simValue blogId blogTitle
1 0.94628823 430 high scalability-2008-10-26-Should you use a SAN to scale your architecture?
Introduction: This is a question everyone must struggle with when building out their datacenter. Storage choices are always the ones I have the least confidence in. David Marks in his blog You Can Change It Later! asks the question Should I get a SAN to scale my site architecture? and answers no. A better solution is to use commodity hardware, directly attach storage on servers, and partition across servers to scale and for greater availability. David's reasoning is interesting: A SAN creates a SPOF (single point of failure) that is dependent on a vendor to fly in and fix when there's a problem. This can lead to long downtimes, and during the outage you have no access to your data at all. Using easily available commodity hardware minimizes risks to your company; it's not just about saving money. Zooming over to Fry's to buy emergency equipment provides the kind of agility startups need in order to respond quickly to ever-changing situations. It's hard to beat the power and flexibility (backup
2 0.93394709 178 high scalability-2007-12-10-1 Master, N Slaves
Introduction: Hello all, Reading the site you can see that the "1 Master for writes, N Slaves for reads" scheme is used often. How is this implemented? Who decides where writes and reads go? Something at the application level, or specific database proxies, like Slony-I? Thanks.
3 0.93285471 1066 high scalability-2011-06-22-It's the Fraking IOPS - 1 SSD is 44,000 IOPS, Hard Drive is 180
Introduction: Planning your next buildout and thinking SSDs are still far in the future? Still too expensive, too low density. Hard disks are cheap, familiar, and store lots of stuff. In this short and entertaining video Wikia's Artur Bergman wants to change your mind about SSDs. SSDs are for today, get with the math already. Here's Artur's logic: Wikia is all SSD in production. The new Wikia file servers have a theoretical read rate of ~10GB/sec sequential, 6GB/sec random and 1.2 million IOPS. If you can't do math or love the past, you love spinning rust. If you are awesome you love SSDs. SSDs are cheaper than drives using the most relevant metric: $/GB/IOPS. 1 SSD is 44,000 IOPS and one hard drive is 180 IOPS. Need 1 SSD instead of 50 hard drives. With 8 million files there's a 9 minute fsck. Full backup in 12 minutes (X-25M based). 4 GB/sec random read average latency 1 msec. 2.2 GB/sec random write average latency 1 msec. 50TBs of SSDs in one machine for $80,000. With the densi
Introduction: If you stayed up all night watching the life-reaffirming Curiosity landing on Mars, then this paper, High-Performance Concurrency Control Mechanisms for Main-Memory Databases, has nothing to do with that at all, but it is an excellent look at how to use optimistic MVCC schemes to reduce lock overhead on in-memory data structures: A database system optimized for in-memory storage can support much higher transaction rates than current systems. However, standard concurrency control methods used today do not scale to the high transaction rates achievable by such systems. In this paper we introduce two efficient concurrency control methods specifically designed for main-memory databases. Both use multiversioning to isolate read-only transactions from updates but differ in how atomicity is ensured: one is optimistic and one is pessimistic. To avoid expensive context switching, transactions never block during normal processing but they may have to wait before commit to ensure corr
Introduction: This is a guest post by Ali Khajeh-Hosseini, Technical Lead at PlanForCloud. The original article was published on their site. With 29 cloud price reductions I thought it would be interesting to see how the bottom line would change compared to an article we published last year. The result is surprisingly little for TripAdvisor because prices for On Demand instances have not dropped as fast as for other instance types. Over the last year and a half, we counted 29 price reductions in cloud services provided by AWS, Google Compute Engine, Windows Azure, and Rackspace Cloud. Price reductions have a direct effect on cloud users, but given the usual tiny reductions, how significant is that effect on the bottom line? Last year I wrote about cloud cost forecasts for TripAdvisor and Pinterest. TripAdvisor was experimenting with AWS and attempted to process 700K HTTP requests per minute on a replica of its live site, and Pinterest was growing massively on AWS. In th
6 0.91299248 874 high scalability-2010-08-07-ArchCamp: Scalable Databases (NoSQL)
7 0.89338577 171 high scalability-2007-12-02-a8cjdbc - update verision 1.3
8 0.89337754 170 high scalability-2007-12-02-Database-Clustering: a8cjdbc - update: version 1.3
same-blog 9 0.89213479 767 high scalability-2010-01-27-Hot Scalability Links for January 28 2010
10 0.87686223 1635 high scalability-2014-04-21-This is why Microsoft won. And why they lost.
11 0.8323226 1046 high scalability-2011-05-23-Evernote Architecture - 9 Million Users and 150 Million Requests a Day
12 0.82272047 1631 high scalability-2014-04-14-How do you even do anything without using EBS?
13 0.79368836 584 high scalability-2009-04-27-Some Questions from a newbie
14 0.79332769 792 high scalability-2010-03-10-How FarmVille Scales - The Follow-up
15 0.77837789 1585 high scalability-2014-01-24-Stuff The Internet Says On Scalability For January 24th, 2014
16 0.76423234 20 high scalability-2007-07-16-Paper: The Clustered Storage Revolution
17 0.76368231 689 high scalability-2009-08-28-Strategy: Solve Only 80 Percent of the Problem
18 0.73774403 1353 high scalability-2012-11-01-Cost Analysis: TripAdvisor and Pinterest costs on the AWS cloud
19 0.72350174 142 high scalability-2007-11-05-Strategy: Diagonal Scaling - Don't Forget to Scale Out AND Up
20 0.70338333 257 high scalability-2008-02-22-Kevin's Great Adventures in SSDland