high_scalability high_scalability-2010 high_scalability-2010-879 knowledge-graph by maker-knowledge-mining

879 high scalability-2010-08-12-Think of Latency as a Pseudo-permanent Network Partition


meta info for this blog

Source: html

Introduction: The title of this post is a quote from Ilya Grigorik's post Weak Consistency and CAP Implications. Besides the article being excellent, I thought this idea had something to add to the great NoSQL versus RDBMS debate, where Mike Stonebraker makes the argument that network partitions are rare, so designing eventually consistent systems for such a rare occurrence is not worth losing ACID semantics over. Even if network partitions are rare, latency between datacenters is not rare, so the game is still on. The rare-partition argument seems to flow from a centralized-distributed view of systems. Such systems are scale-out in that they grow by adding distributed nodes, but the nodes generally do not cross datacenter boundaries. The assumption is the network is fast enough that distributed operations are roughly homogeneous between nodes. In a fully-distributed system the nodes can be dispersed across datacenters, which gives operations a widely variable performance profile. Because everyt…


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Even if network partitions are rare, latency between datacenters is not rare, so the game is still on. [sent-3, score-0.616]

2 The rare-partition argument seems to flow from a centralized-distributed view of systems. [sent-4, score-0.122]

3 Such systems are scale-out in that they grow by adding distributed nodes, but the nodes generally do not cross datacenter boundaries. [sent-5, score-0.328]

4 The assumption is the network is fast enough that distributed operations are roughly homogeneous between nodes. [sent-6, score-0.493]

5 In a fully-distributed system the nodes can be dispersed across datacenters, which gives operations a widely variable performance profile. [sent-7, score-0.26]

6 A datacenter in Europe, for example, wants to store data locally rather than ship it synchronously across the pond to a US datacenter. [sent-10, score-0.495]

7 Fully distributed systems are very complex because they must manage replication, availability, transactions, fail-over, etc. [sent-11, score-0.146]

8 The notion is that latency is a kind of partition that requires eventual consistency to mask. [sent-13, score-0.431]

9-10 The roundtrip time for a packet in a datacenter, for example, might be 0.5ms, and the roundtrip time from California to Europe back to California might be 150ms. [sent-14, sent-15, score-0.199]

11 So roughly 300 times more messages can be exchanged in a datacenter versus between datacenters. [sent-16, score-0.486] (see the first sketch after this list)

12 This implies that reads from different datacenters are unlikely to be consistent. [sent-17, score-0.142]

13 Imagine a write coursing down two network paths. [sent-18, score-0.236]

14 Without a strong consistency guarantee, as in a potentially performance/scaling/availability-killing two-phase commit, a read will be inconsistent for those latency windows. [sent-21, score-0.431] (see the second sketch after this list)

15 Ilya writes: Interestingly enough, dealing with network partitions is not the only case for adopting “weak consistency”. [sent-22, score-0.375]

16 The PNUTS system deployed at Yahoo must deal with WAN replication of data between different continents, and unfortunately, the speed of light imposes some strict latency limits on the performance of such a system. [sent-23, score-0.433]

17 In Yahoo’s case, the communications latency is enough of a performance barrier that their system is configured, by default, to operate under the “choose availability, under weak consistency” model - think of latency as a pseudo-permanent network partition. [sent-24, score-0.839] (see the third sketch after this list)

18 Does this mean geographically dispersed systems by their very nature must be eventually consistent? [sent-25, score-0.259]

19 It's at least interesting to think about as we slowly make our way towards truly distributed architectures. [sent-26, score-0.068]

20 There are three solutions for providing consistency in a data service that operates across a wide area network (WAN). [sent-30, score-0.46] (see the final sketch after this list)
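
A quick back-of-the-envelope check of the ratio in sentences 9-11, using the post's illustrative 0.5ms and 150ms round-trip figures (a minimal sketch, not a benchmark):

```python
# Back-of-the-envelope for the ~300x figure: round trips per second inside
# a datacenter versus across the Atlantic and back. RTT values are the
# post's illustrative numbers, not measurements.

INTRA_DC_RTT_S = 0.0005   # ~0.5 ms round trip within a datacenter
CROSS_WAN_RTT_S = 0.150   # ~150 ms California -> Europe -> California

intra_dc_round_trips = 1.0 / INTRA_DC_RTT_S    # 2000 per second
cross_wan_round_trips = 1.0 / CROSS_WAN_RTT_S  # ~6.7 per second

print(f"intra-DC round trips/sec:  {intra_dc_round_trips:.0f}")
print(f"cross-WAN round trips/sec: {cross_wan_round_trips:.1f}")
print(f"ratio: {intra_dc_round_trips / cross_wan_round_trips:.0f}x")  # 300x
```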
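Sentence 14's latency window can be made concrete with a minimal sketch: one write fans out to two replicas over paths with very different latencies, and a read that lands between the two apply times sees inconsistent values. The replica names and delays below are hypothetical, purely for illustration; this is not any particular system's replication protocol.

```python
# Sketch: a single write fans out to two replicas over paths with very
# different latencies. Reads issued inside the window between the two
# apply times see inconsistent values; once the slow path catches up,
# the replicas converge. Names and delays are hypothetical.

class Replica:
    def __init__(self, name, apply_delay_ms):
        self.name = name
        self.apply_delay_ms = apply_delay_ms  # propagation + apply latency
        self.value = "old"
        self.pending_value = None
        self.applied_at_ms = None

    def receive_write(self, value, sent_at_ms):
        self.pending_value = value
        self.applied_at_ms = sent_at_ms + self.apply_delay_ms

    def read(self, now_ms):
        if self.applied_at_ms is not None and now_ms >= self.applied_at_ms:
            return self.pending_value
        return self.value

local = Replica("us-west", apply_delay_ms=1)     # same-datacenter path
remote = Replica("eu-west", apply_delay_ms=150)  # cross-WAN path

for replica in (local, remote):
    replica.receive_write("new", sent_at_ms=0)

# A read at t=75ms lands inside the latency window: the replicas disagree.
print(local.read(75), remote.read(75))    # new old  (inconsistent)
print(local.read(200), remote.read(200))  # new new  (eventually consistent)
```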
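Sentences 16-17 describe PNUTS defaulting to availability under weak consistency. Here is a rough sketch of that trade from a client's point of view: answer from the nearest replica immediately, or consult every replica and pay the worst WAN round trip. The functions and latency figures are assumptions for illustration, not PNUTS's actual API.

```python
# Sketch of "choose availability, under weak consistency" from the client
# side: read the nearest replica at local cost (possibly stale), or consult
# every replica and pay the worst WAN round trip. Latencies are illustrative
# and the functions are hypothetical, not PNUTS's actual API.

replicas = {
    "us-west": {"rtt_ms": 0.5, "value": "v2"},  # local, up to date
    "eu-west": {"rtt_ms": 150, "value": "v1"},  # lagging cross-WAN copy
}

def read_available(local="us-west"):
    """Weak consistency: return the local copy at local-RTT cost."""
    replica = replicas[local]
    return replica["value"], replica["rtt_ms"]

def read_consistent():
    """Strong consistency: consult all replicas; pay the worst RTT.
    'Newest wins' here is a crude stand-in for real version resolution."""
    newest = max(replicas.values(), key=lambda r: r["value"])
    cost = max(r["rtt_ms"] for r in replicas.values())
    return newest["value"], cost

print(read_available())   # ('v2', 0.5) -- fast, may be stale elsewhere
print(read_consistent())  # ('v2', 150) -- agreed on, pays the WAN latency
```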
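The excerpt cuts off before sentence 20's three WAN-consistency solutions are enumerated, so this final sketch doesn't try to reproduce them; it only illustrates the basic cost split the whole section turns on, using the post's illustrative latency figures: synchronous WAN replication pays the cross-ocean round trip on every write, while asynchronous replication commits locally but opens a staleness window.

```python
# Illustration of the basic WAN trade-off: a synchronous write waits for
# the remote datacenter's acknowledgment, an asynchronous write commits
# locally and replicates in the background. Figures are the post's
# illustrative latencies; the function names are hypothetical.

LOCAL_COMMIT_MS = 0.5  # commit inside the local datacenter
WAN_RTT_MS = 150       # round trip to the remote datacenter

def write_synchronous():
    """Consistent: every write pays the WAN round trip."""
    return LOCAL_COMMIT_MS + WAN_RTT_MS

def write_asynchronous():
    """Available/fast: commits locally; remote reads may be stale
    for up to roughly one WAN round trip."""
    return LOCAL_COMMIT_MS

print(f"sync write latency:  {write_synchronous():.1f} ms")
print(f"async write latency: {write_asynchronous():.1f} ms "
      f"(+ up to ~{WAN_RTT_MS} ms staleness window)")
```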


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('rare', 0.257), ('consistency', 0.255), ('wan', 0.212), ('weak', 0.212), ('roundtrip', 0.199), ('latency', 0.176), ('partitions', 0.173), ('datacenter', 0.168), ('ilya', 0.144), ('datacenters', 0.142), ('california', 0.133), ('network', 0.125), ('europe', 0.125), ('argument', 0.122), ('versus', 0.112), ('coursing', 0.111), ('homogenous', 0.111), ('wanby', 0.111), ('roughly', 0.11), ('lightening', 0.104), ('continents', 0.104), ('imposes', 0.104), ('pond', 0.1), ('yahoo', 0.099), ('disperse', 0.096), ('exchanged', 0.096), ('nodes', 0.092), ('occurrence', 0.09), ('dispersed', 0.088), ('stonebraker', 0.086), ('pnuts', 0.086), ('eventually', 0.085), ('besides', 0.083), ('crush', 0.081), ('across', 0.08), ('consistent', 0.079), ('enough', 0.079), ('inconsistent', 0.079), ('must', 0.078), ('synchronously', 0.078), ('adopting', 0.077), ('permanent', 0.077), ('strict', 0.075), ('grigorik', 0.075), ('interestingly', 0.074), ('barrier', 0.071), ('debate', 0.071), ('transfers', 0.071), ('ship', 0.069), ('distributed', 0.068)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000002 879 high scalability-2010-08-12-Think of Latency as a Pseudo-permanent Network Partition


2 0.2857793 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters

Introduction: Update: Streamy Explains CAP and HBase's Approach to CAP . We plan to employ inter-cluster replication, with each cluster located in a single DC. Remote replication will introduce some eventual consistency into the system, but each cluster will continue to be strongly consistent. Ryan Barrett, Google App Engine datastore lead, gave this talk Transactions Across Datacenters (and Other Weekend Projects) at the Google I/O 2009 conference. While the talk doesn't necessarily break new technical ground, Ryan does an excellent job explaining and evaluating the different options you have when architecting a system to work across multiple datacenters. This is called multihoming , operating from multiple datacenters simultaneously. As multihoming is one of the most challenging tasks in all computing, Ryan's clear and thoughtful style comfortably leads you through the various options. On the trip you learn: The different multi-homing options are: Backups, Master-Slave, Multi-M

3 0.20581616 676 high scalability-2009-08-08-Yahoo!'s PNUTS Database: Too Hot, Too Cold or Just Right?

Introduction: So far every massively scalable database is a bundle of compromises. For some the weak guarantees of Amazon's eventual consistency model are too cold. For many the strong guarantees of standard RDBMS distributed transactions are too hot. Google App Engine tries to get it just right with entity groups. Yahoo! is also trying to get it just right by offering per-record timeline consistency, which hopes to serve up a heaping bowl of rich database functionality and low latency at massive scale: We describe PNUTS [Platform for Nimble Universal Table Storage], a massively parallel and geographically distributed database system for Yahoo!’s web applications. PNUTS provides data storage organized as hashed or ordered tables, low latency for large numbers of concurrent requests including updates and queries, and novel per-record consistency guarantees. It is a hosted, centrally managed, and geographically distributed service, and utilizes automated load-balancing and failover to redu

4 0.20064175 1374 high scalability-2012-12-18-Georeplication: When Bad Things Happen to Good Systems

Introduction: Georeplication is one of the standard techniques for dealing with the bad things--failure and latency--that happen to good systems. The problem is always: how do you do that? Murat Demirbas, Associate Professor at SUNY Buffalo, has a couple of really good posts that can help: MDCC: Multi-Data Center Consistency and Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary. In MDCC: Multi-Data Center Consistency Murat discusses a paper that says synchronous wide-area replication can be feasible. There's a quick and clear explanation of Paxos and various optimizations that is worth the price of admission. We find that strong consistency doesn't have to be lost across a WAN: The good thing about using Paxos over the WAN is you /almost/ get the full CAP (all three properties: consistency, availability, and partition-freedom). As we discussed earlier (Paxos taught), Paxos is CP, that is, in the presence of a partition, Paxos keeps consistency over availability. But, P

5 0.19522001 661 high scalability-2009-07-25-Latency is Everywhere and it Costs You Sales - How to Crush it

Introduction: Update 8: The Cost of Latency by James Hamilton. James summarizes some latency info from Steve Souder, Greg Linden, and Marissa Mayer. Speed [is] an undervalued and under-discussed asset on the web. Update 7: How do you know when you need more memcache servers?. Dathan Pattishall talks about using memcache not to scale, but to reduce latency and reduce I/O spikes, and how to use stats to know when more servers are needed. Update 6: Stock Traders Find Speed Pays, in Milliseconds. Goldman Sachs is making record profits off a 500 millisecond trading advantage. Yes, latency matters. As an interesting aside, Libet found 500 msecs is about the time it takes the brain to weave together an experience of consciousness from all our sensor inputs. Update 5: Shopzilla's Site Redo - You Get What You Measure. At the Velocity conference Phil Dixon, from Shopzilla, presented data showing a 5 second speed up resulted in a 25% increase in page views, a 10% increas

6 0.19493389 1146 high scalability-2011-11-23-Paper: Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS

7 0.17806961 1450 high scalability-2013-05-01-Myth: Eric Brewer on Why Banks are BASE Not ACID - Availability Is Revenue

8 0.1678776 119 high scalability-2007-10-10-WAN Accelerate Your Way to Lightening Fast Transfers Between Data Centers

9 0.16411042 645 high scalability-2009-06-30-Hot New Trend: Linking Clouds Through Cheap IP VPNs Instead of Private Lines

10 0.15992816 1153 high scalability-2011-12-08-Update on Scalable Causal Consistency For Wide-Area Storage With COPS

11 0.15919854 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?

12 0.15256375 1116 high scalability-2011-09-15-Paper: It's Time for Low Latency - Inventing the 1 Microsecond Datacenter

13 0.14748158 1017 high scalability-2011-04-06-Netflix: Run Consistency Checkers All the time to Fixup Transactions

14 0.14368019 1345 high scalability-2012-10-22-Spanner - It's About Programmers Building Apps Using SQL Semantics at NoSQL Scale

15 0.13007279 589 high scalability-2009-05-05-Drop ACID and Think About Data

16 0.12871249 972 high scalability-2011-01-11-Google Megastore - 3 Billion Writes and 20 Billion Read Transactions Daily

17 0.12432747 1392 high scalability-2013-01-23-Building Redundant Datacenter Networks is Not For Sissies - Use an Outside WAN Backbone

18 0.122013 849 high scalability-2010-06-28-VoltDB Decapitates Six SQL Urban Myths and Delivers Internet Scale OLTP in the Process

19 0.1217669 1307 high scalability-2012-08-20-The Performance of Distributed Data-Structures Running on a "Cache-Coherent" In-Memory Data Grid

20 0.12015919 890 high scalability-2010-09-01-Paper: The Case for Determinism in Database Systems


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.174), (1, 0.105), (2, 0.013), (3, 0.087), (4, -0.028), (5, 0.089), (6, 0.028), (7, 0.024), (8, -0.066), (9, -0.046), (10, 0.007), (11, 0.036), (12, -0.118), (13, -0.027), (14, 0.086), (15, 0.108), (16, 0.082), (17, 0.033), (18, -0.018), (19, -0.085), (20, 0.1), (21, 0.153), (22, 0.004), (23, -0.04), (24, -0.089), (25, 0.019), (26, 0.0), (27, 0.003), (28, -0.017), (29, -0.145), (30, 0.047), (31, -0.048), (32, -0.058), (33, 0.031), (34, 0.029), (35, 0.025), (36, -0.032), (37, 0.065), (38, -0.082), (39, -0.007), (40, 0.004), (41, 0.056), (42, -0.005), (43, -0.022), (44, 0.011), (45, -0.043), (46, 0.073), (47, -0.029), (48, -0.044), (49, -0.02)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97197336 879 high scalability-2010-08-12-Think of Latency as a Pseudo-permanent Network Partition


2 0.86472368 1450 high scalability-2013-05-01-Myth: Eric Brewer on Why Banks are BASE Not ACID - Availability Is Revenue

Introduction: In NoSQL: Past, Present, Future Eric Brewer has a particularly fine section explaining the often hard-to-understand ideas of BASE (Basically Available, Soft State, Eventually Consistent), ACID (Atomicity, Consistency, Isolation, Durability), and CAP (Consistency, Availability, Partition Tolerance) in terms of a pernicious, long-standing myth about the sanctity of consistency in banking. Myth: Money is important, so banks must use transactions to keep money safe and consistent, right? Reality: Banking transactions are inconsistent, particularly for ATMs. ATMs are designed to have a normal case behaviour and a partition mode behaviour. In partition mode Availability is chosen over Consistency. Why? 1) Availability correlates with revenue and consistency generally does not. 2) Historically there was never an idea of perfect communication so everything was partitioned. Your ATM transaction must go through so Availability is more important than

3 0.86230838 1146 high scalability-2011-11-23-Paper: Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS

Introduction: Teams from Princeton and CMU are working together to solve one of the most difficult problems in the repertoire: scalable geo-distributed data stores. Major companies like Google and Facebook have been working on multiple datacenter database functionality for some time, but there's still a general lack of available systems that work for complex data scenarios. The ideas in this paper-- Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS --are different. It's not another eventually consistent system, or a traditional transaction oriented system, or a replication based system, or a system that punts on the issue. It's something new, a causally consistent system that achieves ALPS system properties. Move over CAP, NoSQL, etc, we have another acronym: ALPS - Available (operations always complete successfully), Low-latency (operations complete quickly (single digit milliseconds)), Partition-tolerant (operates with a partition), and Scalable (just a

4 0.86097598 1374 high scalability-2012-12-18-Georeplication: When Bad Things Happen to Good Systems

Introduction: Georeplication is one of the standard techniques for dealing with the bad things--failure and latency--that happen to good systems. The problem is always: how do you do that? Murat Demirbas, Associate Professor at SUNY Buffalo, has a couple of really good posts that can help: MDCC: Multi-Data Center Consistency and Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary. In MDCC: Multi-Data Center Consistency Murat discusses a paper that says synchronous wide-area replication can be feasible. There's a quick and clear explanation of Paxos and various optimizations that is worth the price of admission. We find that strong consistency doesn't have to be lost across a WAN: The good thing about using Paxos over the WAN is you /almost/ get the full CAP (all three properties: consistency, availability, and partition-freedom). As we discussed earlier (Paxos taught), Paxos is CP, that is, in the presence of a partition, Paxos keeps consistency over availability. But, P

5 0.85852736 1153 high scalability-2011-12-08-Update on Scalable Causal Consistency For Wide-Area Storage With COPS

Introduction: Here are a few updates on the article Paper: Don’t Settle For Eventual: Scalable Causal Consistency For Wide-Area Storage With COPS from Mike Freedman and Wyatt Lloyd. Q: How might software architectures change in response to causal+ consistency? A: I don't really think they would much. Somebody would still run a two-tier architecture in their datacenter: a front-tier of webservers running both (say) PHP and our client library, and a back tier of storage nodes running COPS. (I'm not sure if it was obvious given the discussion of our "thick" client -- you should think of the COPS client dropping in where a memcache client library does...albeit ours has per-session state.) Q: Why not just use vector clocks? A: The problem with vector clocks and scalability has always been that the size of vector clocks is O(N), where N is the number of nodes. So if we want to scale to a datacenter with 10K nodes, each piece of metadata must have size O(10K). And in fact, vector

6 0.8106063 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters

7 0.77936864 676 high scalability-2009-08-08-Yahoo!'s PNUTS Database: Too Hot, Too Cold or Just Right?

8 0.75845069 890 high scalability-2010-09-01-Paper: The Case for Determinism in Database Systems

9 0.71536839 1459 high scalability-2013-05-16-Paper: Warp: Multi-Key Transactions for Key-Value Stores

10 0.69015414 1017 high scalability-2011-04-06-Netflix: Run Consistency Checkers All the time to Fixup Transactions

11 0.68414688 108 high scalability-2007-10-03-Paper: Brewer's Conjecture and the Feasibility of Consistent Available Partition-Tolerant Web Services

12 0.68381298 1116 high scalability-2011-09-15-Paper: It's Time for Low Latency - Inventing the 1 Microsecond Datacenter

13 0.66273266 529 high scalability-2009-03-10-Paper: Consensus Protocols: Paxos

14 0.65590227 1392 high scalability-2013-01-23-Building Redundant Datacenter Networks is Not For Sissies - Use an Outside WAN Backbone

15 0.65414757 507 high scalability-2009-02-03-Paper: Optimistic Replication

16 0.65279663 963 high scalability-2010-12-23-Paper: CRDTs: Consistency without concurrency control

17 0.64703524 1345 high scalability-2012-10-22-Spanner - It's About Programmers Building Apps Using SQL Semantics at NoSQL Scale

18 0.63796365 972 high scalability-2011-01-11-Google Megastore - 3 Billion Writes and 20 Billion Read Transactions Daily

19 0.62450975 1273 high scalability-2012-06-27-Paper: Logic and Lattices for Distributed Programming

20 0.62304348 1648 high scalability-2014-05-15-Paper: SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.13), (2, 0.191), (30, 0.019), (40, 0.303), (47, 0.055), (61, 0.027), (77, 0.01), (79, 0.158), (94, 0.033)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.94979602 1419 high scalability-2013-03-07-It's a VM Wasteland - A Near Optimal Packing of VMs to Machines Reduces TCO by 22%

Introduction: In Algorithm Design for Performance Aware VM Consolidation we learn some shocking facts (gambling in Casablanca?): Average server utilization in many data centers is low, estimated between 5% and 15%. This is wasteful because an idle server often consumes more than 50% of peak power. Surely that's just for old style datacenters? Nope. In Google data centers, workloads that are consolidated use only 50% of the processor cores. Every other processor core is left unused simply to ensure that performance does not degrade. It's a VM wasteland. The goal is to reduce waste by packing VMs onto machines without hurting performance or wasting resources. The idea is to select VMs that interfere the least with each other and place them together on the same server. It's an NP-complete problem, but this paper describes a practical method that performs provably close to optimal. Interestingly, they can optimize for performance or power efficiency, so you can use different algorithms

2 0.94246417 1466 high scalability-2013-05-29-Amazon: Creating a Customer Utopia One Culture Hack at a Time

Introduction: If you don’t cannibalize yourself, someone else will. -- Steve Jobs     America as the New World has a long history of inspiring  Utopian communities . Early experiments were famously religious. But there have been many others as new waves of thought have inspired people to organize and try something different. In the 1840s Transcendentalists, believing the true path lay in the perfection of the individual, created intentional communities like Brook Farm . We've also seen socialist ,  anarchist , hippy , and virtually every other kind of Utopia in-between. Psychologist B.F. Skinner wrote an infamous book, Walden Two , with a more "scientific" take on creating a Utopian community and Ayn Rand in Atlas Shrugged  had her free market version. I believe in startup organizations we see the modern version of Utopian energy in action. We now call it by names like "culture hacking", but the goal is the same: to create a new form of human community to achieve a profound go

3 0.9323163 338 high scalability-2008-06-02-Total Cost of Ownership for different web development frameworks

Introduction: I would like to compile a comparison matrix on the total cost of ownership for .NET, Java, LAMP & Rails. Where should I start? Has anyone seen, or does anyone know of, a recent study on this subject?

4 0.92709184 402 high scalability-2008-10-05-Paper: Scalability Design Patterns

Introduction: I have introduced pattern languages in my earlier post on The Pattern Bible for Distributed Computing. Achieving the highest possible scalability is a complex combination of many factors. This PLoP 2007 paper presents a pattern language that can be used to make a system highly scalable. The Scalability Pattern Language introduced by Kanwardeep Singh Ahluwalia includes patterns to: Introduce Scalability, Optimize Algorithm, Add Hardware, Add Parallelism, Add Intra-Process Parallelism, Add Inter-Process Parallelism, Add Hybrid Parallelism, Optimize Decentralization, Control Shared Resources, and Automate Scalability.

5 0.92671454 778 high scalability-2010-02-15-The Amazing Collective Compute Power of the Ambient Cloud

Introduction: This is an excerpt from my article Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud. Earlier we talked about how a single botnet could harness more compute power than our largest super computers. Well, that's just the start of it. The amount of computer power available to the Ambient Cloud will be truly astounding. 2 Billion Personal Computers The number of personal computers is still growing. By 2014 one estimate is there will be 2 billion PCs . That's a giant reservoir of power to exploit, especially considering these new boxes are stuffed with multiple powerful processors and gigabytes of memory. 7 Billion Smartphones By now it's common wisdom smartphones are the computing platform of the future. It's plausible to assume the total number of mobile phones in use will roughly equal the number of people on earth. That's 7 billion smartphones. Smartphones aren't just tiny little wannabe computers anymore either. They are rea

6 0.91835451 860 high scalability-2010-07-17-Hot Scalability Links for July 17, 2010

7 0.90886724 27 high scalability-2007-07-25-Product: 3 PAR REMOTE COPY

8 0.89843726 330 high scalability-2008-05-27-Should Twitter be an All-You-Can-Eat Buffet or a Vending Machine?

9 0.89807385 1471 high scalability-2013-06-06-Paper: Memory Barriers: a Hardware View for Software Hackers

same-blog 10 0.88887018 879 high scalability-2010-08-12-Think of Latency as a Pseudo-permanent Network Partition

11 0.88641787 482 high scalability-2009-01-04-Alternative Memcache Usage: A Highly Scalable, Highly Available, In-Memory Shard Index

12 0.86807227 1414 high scalability-2013-03-01-Stuff The Internet Says On Scalability For February 29, 2013

13 0.86319262 768 high scalability-2010-02-01-What Will Kill the Cloud?

14 0.85773104 848 high scalability-2010-06-25-Hot Scalability Links for June 25, 2010

15 0.85499024 300 high scalability-2008-04-07-Scalr - Open Source Auto-scaling Hosting on Amazon EC2

16 0.84031868 280 high scalability-2008-03-17-Paper: Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web

17 0.81776398 1492 high scalability-2013-07-17-How do you create a 100th Monkey software development culture?

18 0.81643289 757 high scalability-2010-01-04-11 Strategies to Rock Your Startup’s Scalability in 2010

19 0.7609731 97 high scalability-2007-09-18-Session management in highly scalable web sites

20 0.75604743 19 high scalability-2007-07-16-Paper: Replication Under Scalable Hashing