high_scalability high_scalability-2009 high_scalability-2009-510 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Henry Robinson has created an excellent series of articles on consensus protocols. Henry starts with a very useful discussion of what all this talk about consensus really means: The consensus problem is the problem of getting a set of nodes in a distributed system to agree on something - it might be a value, a course of action or a decision. Achieving consensus allows a distributed system to act as a single entity, with every individual node aware of and in agreement with the actions of the whole of the network. In this article Henry tackles Two-Phase Commit, the protocol most databases use to arrive at a consensus for database writes. The article is very well written with lots of pretty and informative pictures. He did a really good job. In conclusion we learn that 2PC is very efficient: a minimal number of messages are exchanged and latency is low. The problem is that when a co-ordinator fails, availability is dramatically reduced. This is why 2PC isn't generally used on highly distributed systems. To solve that problem we have to move on to different algorithms and that is the subject of other articles.
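To make the two phases concrete, here is a minimal sketch of the protocol the article describes; it is not Henry's code, and the Coordinator/Participant names, the in-memory staging, and the print statements are invented for illustration. Phase one collects votes, phase two broadcasts the decision, and the comments mark the point where a co-ordinator crash blocks everyone.

# Minimal two-phase commit sketch. All names are illustrative; a real
# implementation needs durable logging and timeout handling.

class Participant:
    def __init__(self, name):
        self.name = name
        self.staged = None

    def prepare(self, value):
        # Phase 1: stage the write and vote. A real participant would
        # force-write a PREPARED record to its log before voting yes.
        self.staged = value
        return True  # vote "yes"

    def commit(self):
        print(f"{self.name}: committed {self.staged}")

    def abort(self):
        self.staged = None
        print(f"{self.name}: aborted")


def two_phase_commit(value, participants):
    # Phase 1: collect votes. One "no" (or a timeout) aborts everyone.
    votes = [p.prepare(value) for p in participants]
    # Phase 2: broadcast the decision. If the co-ordinator crashes here,
    # participants that voted yes are blocked -- the availability problem
    # the article describes.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"


print(two_phase_commit("x=42", [Participant("db1"), Participant("db2")]))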
sentIndex sentText sentNum sentScore
1 Henry Robinson has created an excellent series of articles on consensus protocols. [sent-1, score-0.92]
2 Henry starts with a very useful discussion of what all this talk about consensus really means: The consensus problem is the problem of getting a set of nodes in a distributed system to agree on something - it might be a value, a course of action or a decision. [sent-2, score-2.165]
3 Achieving consensus allows a distributed system to act as a single entity, with every individual node aware of and in agreement with the actions of the whole of the network. [sent-3, score-1.345]
4 In this article Henry tackles Two-Phase Commit, the protocol most databases use to arrive at a consensus for database writes. [sent-4, score-1.05]
5 The article is very well written with lots of pretty and informative pictures. [sent-5, score-0.343]
6 In conclusion we learn that 2PC is very efficient: a minimal number of messages are exchanged and latency is low. [sent-7, score-0.436]
7 The problem is that when a co-ordinator fails, availability is dramatically reduced. [sent-8, score-0.335]
8 This is why 2PC isn't generally used on highly distributed systems. [sent-9, score-0.165]
9 To solve that problem we have to move on to different algorithms and that is the subject of other articles. [sent-10, score-0.401]
wordName wordTfidf (topN-words)
[('consensus', 0.601), ('henry', 0.466), ('robinson', 0.171), ('exchanged', 0.165), ('articles', 0.138), ('agreement', 0.138), ('tackles', 0.136), ('problem', 0.135), ('arrive', 0.128), ('achieving', 0.11), ('article', 0.105), ('entity', 0.105), ('commit', 0.104), ('dramatically', 0.104), ('actions', 0.102), ('conclusion', 0.102), ('agree', 0.102), ('minimal', 0.099), ('aware', 0.097), ('fails', 0.096), ('subject', 0.093), ('act', 0.089), ('action', 0.088), ('distributed', 0.087), ('protocol', 0.08), ('generally', 0.078), ('informative', 0.073), ('messages', 0.07), ('series', 0.069), ('individual', 0.067), ('algorithms', 0.065), ('discussion', 0.065), ('starts', 0.064), ('course', 0.064), ('really', 0.063), ('efficient', 0.061), ('pretty', 0.06), ('value', 0.059), ('node', 0.058), ('solve', 0.057), ('useful', 0.057), ('whole', 0.057), ('created', 0.056), ('excellent', 0.056), ('lots', 0.054), ('nodes', 0.053), ('move', 0.051), ('written', 0.051), ('talk', 0.05), ('allows', 0.049)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 510 high scalability-2009-02-09-Paper: Consensus Protocols: Two-Phase Commit
2 0.28863993 529 high scalability-2009-03-10-Paper: Consensus Protocols: Paxos
Introduction: Update: Barbara Liskov’s Turing Award, and Byzantine Fault Tolerance. Henry Robinson has created an excellent series of articles on consensus protocols. We already covered his 2 Phase Commit article and he also has a 3 Phase Commit article showing how to handle 2PC under single node failures. But that is not enough! 3PC works well under node failures, but fails for network failures. So another consensus mechanism is needed that handles both network and node failures. And that's Paxos. Paxos correctly handles both types of failures, but it does this by becoming inaccessible if too many components fail. This is the "liveness" property of protocols. Paxos waits until the faults are fixed. Read queries can be handled, but updates will be blocked until the protocol thinks it can make forward progress. The liveness of Paxos is primarily dependent on network stability. In a distributed heterogeneous environment you are at risk of losing the ability to make updates. Users hate t
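A rough sketch of why a Paxos round blocks rather than proceeds without a majority may help; this is an illustrative single-decree toy, not Henry's or Google's implementation, and it omits retries, persistence, and failure detection.

# Single-decree Paxos sketch showing the liveness property: no majority,
# no progress. All names are invented for illustration.

class Acceptor:
    def __init__(self):
        self.promised = 0      # highest ballot promised so far
        self.accepted = None   # (ballot, value) last accepted, if any

    def prepare(self, ballot):
        if ballot > self.promised:
            self.promised = ballot
            return ("promise", self.accepted)
        return ("nack", None)

    def accept(self, ballot, value):
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return "accepted"
        return "nack"


def propose(acceptors, ballot, value):
    majority = len(acceptors) // 2 + 1
    # Phase 1: gather promises from a majority of acceptors.
    granted = [r for r in (a.prepare(ballot) for a in acceptors)
               if r[0] == "promise"]
    if len(granted) < majority:
        return "blocked: no majority reachable (liveness lost)"
    # If any acceptor already accepted a value, that value must be reused.
    prior = [r[1] for r in granted if r[1] is not None]
    if prior:
        value = max(prior)[1]  # value from the highest-ballot acceptance
    # Phase 2: ask the acceptors to accept; again a majority is required.
    acks = [a.accept(ballot, value) for a in acceptors]
    if acks.count("accepted") >= majority:
        return f"chosen: {value}"
    return "blocked: accept quorum lost"


print(propose([Acceptor() for _ in range(5)], ballot=1, value="x=1"))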
3 0.19214293 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters
Introduction: Update: Streamy Explains CAP and HBase's Approach to CAP. We plan to employ inter-cluster replication, with each cluster located in a single DC. Remote replication will introduce some eventual consistency into the system, but each cluster will continue to be strongly consistent. Ryan Barrett, Google App Engine datastore lead, gave this talk Transactions Across Datacenters (and Other Weekend Projects) at the Google I/O 2009 conference. While the talk doesn't necessarily break new technical ground, Ryan does an excellent job explaining and evaluating the different options you have when architecting a system to work across multiple datacenters. This is called multihoming, operating from multiple datacenters simultaneously. As multihoming is one of the most challenging tasks in all computing, Ryan's clear and thoughtful style comfortably leads you through the various options. On the trip you learn: The different multi-homing options are: Backups, Master-Slave, Multi-Master
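As a hedged illustration of the trade-off between two of those options, the toy below contrasts synchronous and asynchronous replication of a single write; the 50 ms WAN delay and all names are assumptions, not figures from Ryan's talk.

import time

WAN_ROUND_TRIP = 0.05  # assumed 50 ms between datacenters

def write(value, synchronous):
    remote_copy = None
    start = time.time()
    if synchronous:
        # Master-slave, synchronous: don't acknowledge until the remote
        # datacenter has the write. Consistent across datacenters, but
        # every write pays the WAN round trip.
        time.sleep(WAN_ROUND_TRIP)
        remote_copy = value
    # Asynchronous: acknowledge immediately. Low latency, but a failover
    # inside the replication-lag window can lose this write.
    latency_ms = (time.time() - start) * 1000
    return latency_ms, remote_copy

for sync in (True, False):
    ms, seen = write("x=1", synchronous=sync)
    print(f"synchronous={sync}: acked in {ms:.0f} ms, remote copy: {seen!r}")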
4 0.17992176 357 high scalability-2008-07-26-Google's Paxos Made Live – An Engineering Perspective
Introduction: This is an unusually well written and useful paper. It talks in detail about experiences implementing a complex project, something we don't see very often. They shockingly even admit that creating a working implementation of Paxos was more difficult than just translating the pseudo code. Imagine that, programmers aren't merely typists! I particularly like the explanation of the Paxos algorithm and why anyone would care about it, working with disk corruption, using leases to support simultaneous reads, using epoch numbers to indicate a new master election, using snapshots to prevent unbounded logs, using MultiOp to implement database transactions, how they tested the system, and their openness with the various problems they had. A lot to learn here. From the paper: We describe our experience building a fault-tolerant database using the Paxos consensus algorithm. Despite the existing literature in the field, building such a database proved to be non-trivial. We describe selected alg
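Two of the techniques named above, master leases and epoch numbers, can be sketched in a few lines; this is a loose illustration of the idea, not the paper's actual design, and the Replica class and the 5 second lease are invented.

import time

class Replica:
    def __init__(self):
        self.epoch = 0           # bumped on every master election
        self.lease_expiry = 0.0

    def elect_master(self):
        self.epoch += 1
        self.lease_expiry = time.time() + 5.0  # assumed 5 s lease
        return self.epoch

    def handle_write(self, master_epoch, value):
        # A request stamped with a stale epoch comes from an old master
        # that doesn't yet know it was replaced; fence it off.
        if master_epoch < self.epoch:
            return "rejected: stale epoch"
        return f"applied {value} in epoch {master_epoch}"

    def can_serve_local_read(self):
        # While the lease holds, no other master can exist, so reads can
        # be answered locally without running a full Paxos round.
        return time.time() < self.lease_expiry

r = Replica()
old = r.elect_master()
new = r.elect_master()             # a new election bumps the epoch
print(r.handle_write(old, "x=1"))  # rejected: stale epoch
print(r.handle_write(new, "x=2"))
print("local reads ok:", r.can_serve_local_read())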
5 0.12300745 1498 high scalability-2013-08-07-RAFT - In Search of an Understandable Consensus Algorithm
Introduction: If like many humans you've found even Paxos Made Simple a bit difficult to understand, you might enjoy RAFT as described in In Search of an Understandable Consensus Algorithm by Stanford's Diego Ongaro and John Ousterhout. The video presentation of the paper is given by John Ousterhout. Both the paper and the video are delightfully accessible. mcherm has a good summary of the paper: A consensus algorithm is: a cluster of servers should record a series of records ("log entries") in response to requests from clients of the cluster. (It may also take action based on those entries.) It does so in a way that guarantees that the responses seen by clients of the cluster will be consistent EVEN in the face of servers crashing in unpredictable ways (but not losing data that was synced to disk), and networks introducing unpredictable delays or communication blockages. Here's what Raft does. First, it elects a leader, then the leader records the master version of the log, t
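Here is a minimal sketch of the replication rule in that summary: the leader stores the entry, replicates it, and commits once a majority of servers has it. It is illustrative only and leaves out elections, term changes, and the AppendEntries consistency check.

# Raft replication sketch; class names and the toy Follower are invented.

class Follower:
    def __init__(self, up=True):
        self.up = up
        self.log = []

    def append_entry(self, entry):
        if self.up:
            self.log.append(entry)
            return True
        return False  # crashed or partitioned; the leader would retry

class RaftLeader:
    def __init__(self, followers):
        self.term = 1
        self.log = []            # entries: (term, command)
        self.commit_index = -1
        self.followers = followers

    def client_request(self, command):
        entry = (self.term, command)
        self.log.append(entry)
        acks = 1  # the leader itself has stored the entry
        for f in self.followers:
            if f.append_entry(entry):
                acks += 1
        cluster_size = len(self.followers) + 1
        # Committed once a majority has the entry, even with crashed nodes.
        if acks > cluster_size // 2:
            self.commit_index = len(self.log) - 1
            return f"committed at index {self.commit_index}"
        return "pending: no majority yet"

leader = RaftLeader([Follower(), Follower(up=False)])  # 3-server cluster
print(leader.client_request("x=1"))  # 2 of 3 have it -> committed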
6 0.10083438 1527 high scalability-2013-10-04-Stuff The Internet Says On Scalability For October 4th, 2013
7 0.096718252 1142 high scalability-2011-11-14-Using Gossip Protocols for Failure Detection, Monitoring, Messaging and Other Good Things
8 0.083237499 1645 high scalability-2014-05-09-Stuff The Internet Says On Scalability For May 9th, 2014
9 0.081948407 744 high scalability-2009-11-24-Hot Scalability Links for Nov 24 2009
10 0.077659316 919 high scalability-2010-10-14-I, Cloud
11 0.076708183 931 high scalability-2010-10-28-Notes from A NOSQL Evening in Palo Alto
12 0.075871788 1607 high scalability-2014-03-07-Stuff The Internet Says On Scalability For March 7th, 2014
13 0.075271666 1451 high scalability-2013-05-03-Stuff The Internet Says On Scalability For May 3, 2013
14 0.07228744 1261 high scalability-2012-06-08-Stuff The Internet Says On Scalability For June 8, 2012
15 0.070906304 1516 high scalability-2013-09-13-Stuff The Internet Says On Scalability For September 13, 2013
16 0.069985829 958 high scalability-2010-12-16-7 Design Patterns for Almost-infinite Scalability
17 0.069194309 1183 high scalability-2012-01-30-37signals Still Happily Scaling on Moore RAM and SSDs
18 0.068101414 972 high scalability-2011-01-11-Google Megastore - 3 Billion Writes and 20 Billion Read Transactions Daily
19 0.067295291 1479 high scalability-2013-06-21-Stuff The Internet Says On Scalability For June 21, 2013
20 0.066910423 1513 high scalability-2013-09-06-Stuff The Internet Says On Scalability For September 6, 2013
topicId topicWeight
[(0, 0.094), (1, 0.067), (2, -0.005), (3, 0.033), (4, 0.016), (5, 0.049), (6, 0.008), (7, 0.018), (8, -0.027), (9, -0.015), (10, 0.01), (11, 0.023), (12, -0.038), (13, -0.03), (14, 0.041), (15, 0.002), (16, 0.037), (17, -0.016), (18, -0.01), (19, -0.01), (20, 0.051), (21, 0.012), (22, -0.025), (23, 0.012), (24, -0.064), (25, 0.006), (26, 0.067), (27, 0.039), (28, -0.021), (29, -0.025), (30, -0.012), (31, -0.016), (32, -0.02), (33, 0.006), (34, -0.018), (35, -0.049), (36, 0.021), (37, 0.013), (38, 0.004), (39, -0.02), (40, -0.023), (41, -0.005), (42, -0.024), (43, 0.02), (44, -0.031), (45, 0.003), (46, 0.031), (47, 0.055), (48, -0.04), (49, -0.036)]
simIndex simValue blogId blogTitle
same-blog 1 0.95273292 510 high scalability-2009-02-09-Paper: Consensus Protocols: Two-Phase Commit
2 0.86687905 529 high scalability-2009-03-10-Paper: Consensus Protocols: Paxos
3 0.74262482 890 high scalability-2010-09-01-Paper: The Case for Determinism in Database Systems
Introduction: Can you have your ACID cake and eat your distributed database too? Yes, explains Daniel Abadi, Assistant Professor of Computer Science at Yale University, in an epic post, The problems with ACID, and how to fix them without going NoSQL, coauthored with Alexander Thomson, on their paper The Case for Determinism in Database Systems. We've already seen VoltDB offer the best of both worlds; this sounds like a completely different approach. The solution, they propose, is: ...an architecture and execution model that avoids deadlock, copes with failures without aborting transactions, and achieves high concurrency. The paper contains full details, but the basic idea is to use ordered locking coupled with optimistic lock location prediction, while exploiting deterministic systems' nice replication properties in the case of failures. The problem they are trying to solve is: In our opinion, the NoSQL decision to give up on ACID is the lazy solution to these scala
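The ordered-locking idea can be illustrated with a short sketch: if every transaction acquires its locks in one global order fixed before execution, no deadlock cycle can form. This shows only the simplest piece of the paper's design; the lock-location prediction and replication machinery are not shown, and all names are invented.

import threading

# One lock per record; the global order here is simply sorted key order.
locks = {key: threading.Lock() for key in ("a", "b", "c")}

def run_transaction(keys, work):
    # Acquire every lock the transaction needs, in one canonical order
    # decided before execution. No transaction ever waits on a lock while
    # holding a "later" one, so no deadlock cycle can form.
    ordered = sorted(set(keys))
    for k in ordered:
        locks[k].acquire()
    try:
        return work()
    finally:
        for k in reversed(ordered):
            locks[k].release()

# These two touch the same keys in opposite program order; with ad hoc
# locking they could deadlock, with ordered locking they cannot.
t1 = threading.Thread(target=run_transaction,
                      args=(["b", "a"], lambda: print("t1 done")))
t2 = threading.Thread(target=run_transaction,
                      args=(["a", "b"], lambda: print("t2 done")))
t1.start(); t2.start(); t1.join(); t2.join()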
Introduction: When building a system on top of a set of wildly uncooperative and unruly computers you have knowledge problems: knowing when other nodes are dead; knowing when nodes become alive; getting information about other nodes so you can make local decisions, like knowing which node should handle a request based on a scheme for assigning nodes to a certain range of users; learning about new configuration data; agreeing on data values; and so on. How do you solve these problems? A common centralized approach is to use a database and have all nodes query it for information. Obvious availability and performance issues for large distributed clusters. Another approach is to use Paxos, a protocol for solving consensus in a network to maintain strict consistency requirements for small groups of unreliable processes. Not practical when larger numbers of nodes are involved. So what's the super cool decentralized way to bring order to large clusters? Gossip protocols, which maintain relaxed consi
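A minimal gossip round might look like the sketch below, where each node periodically merges heartbeat counters with a randomly chosen peer so liveness information spreads with no central coordinator; the Node class and heartbeat scheme are illustrative, not from any particular system.

import random

class Node:
    def __init__(self, name):
        self.name = name
        self.heartbeats = {name: 0}  # node name -> highest heartbeat seen

    def tick(self):
        # A node that stops incrementing its own counter will eventually
        # be suspected dead by everyone else's failure detector.
        self.heartbeats[self.name] += 1

    def gossip_with(self, peer):
        # Exchange views, keeping the highest heartbeat seen per node.
        for name in set(self.heartbeats) | set(peer.heartbeats):
            best = max(self.heartbeats.get(name, 0),
                       peer.heartbeats.get(name, 0))
            self.heartbeats[name] = peer.heartbeats[name] = best

nodes = [Node(f"n{i}") for i in range(4)]
for _ in range(3):  # a few rounds spread state widely
    for n in nodes:
        n.tick()
        n.gossip_with(random.choice([p for p in nodes if p is not n]))
print(nodes[0].heartbeats)  # n0 now knows about every node, no master needed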
Introduction: Teams from Princeton and CMU are working together to solve one of the most difficult problems in the repertoire: scalable geo-distributed data stores. Major companies like Google and Facebook have been working on multiple datacenter database functionality for some time, but there's still a general lack of available systems that work for complex data scenarios. The ideas in this paper -- Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS -- are different. It's not another eventually consistent system, or a traditional transaction oriented system, or a replication based system, or a system that punts on the issue. It's something new, a causally consistent system that achieves ALPS system properties. Move over CAP, NoSQL, etc., we have another acronym: ALPS - Available (operations always complete successfully), Low-latency (operations complete quickly (single digit milliseconds)), Partition-tolerant (operates with a partition), and Scalable (just a
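The causal consistency idea can be sketched as dependency-checked replication: a remote datacenter holds a write until everything the write causally depends on is already visible. The Datacenter class and versioning scheme below are invented for illustration and are far simpler than COPS itself.

class Datacenter:
    def __init__(self):
        self.store = {}    # key -> (version, value)
        self.pending = []  # writes whose dependencies aren't visible yet

    def _deps_met(self, deps):
        # deps: list of (key, version) this write causally depends on
        return all(self.store.get(k, (0, None))[0] >= v for k, v in deps)

    def apply_remote(self, key, version, value, deps):
        if self._deps_met(deps):
            self.store[key] = (version, value)
            self._retry_pending()
        else:
            self.pending.append((key, version, value, deps))

    def _retry_pending(self):
        # Applying one write may unblock others, so loop to a fixed point.
        progress = True
        while progress:
            progress = False
            for w in list(self.pending):
                key, version, value, deps = w
                if self._deps_met(deps):
                    self.store[key] = (version, value)
                    self.pending.remove(w)
                    progress = True

dc = Datacenter()
# The reply arrives before the post it answers; it waits for its dependency.
dc.apply_remote("reply", 1, "me too!", deps=[("post", 1)])
print("reply visible?", "reply" in dc.store)   # False: dependency missing
dc.apply_remote("post", 1, "I like consensus", deps=[])
print("reply visible?", "reply" in dc.store)   # True: causal order preserved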
6 0.7049576 1450 high scalability-2013-05-01-Myth: Eric Brewer on Why Banks are BASE Not ACID - Availability Is Revenue
7 0.70222694 705 high scalability-2009-09-16-Paper: A practical scalable distributed B-tree
8 0.69950449 1374 high scalability-2012-12-18-Georeplication: When Bad Things Happen to Good Systems
9 0.6951316 1153 high scalability-2011-12-08-Update on Scalable Causal Consistency For Wide-Area Storage With COPS
10 0.68143493 676 high scalability-2009-08-08-Yahoo!'s PNUTS Database: Too Hot, Too Cold or Just Right?
11 0.66406494 1243 high scalability-2012-05-10-Paper: Paxos Made Moderately Complex
12 0.66266328 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters
13 0.66168392 1273 high scalability-2012-06-27-Paper: Logic and Lattices for Distributed Programming
14 0.66094017 1459 high scalability-2013-05-16-Paper: Warp: Multi-Key Transactions for Key-Value Stores
15 0.6548149 357 high scalability-2008-07-26-Google's Paxos Made Live – An Engineering Perspective
16 0.6544528 844 high scalability-2010-06-18-Paper: The Declarative Imperative: Experiences and Conjectures in Distributed Logic
17 0.6531502 958 high scalability-2010-12-16-7 Design Patterns for Almost-infinite Scalability
18 0.65000391 1017 high scalability-2011-04-06-Netflix: Run Consistency Checkers All the time to Fixup Transactions
19 0.64828396 963 high scalability-2010-12-23-Paper: CRDTs: Consistency without concurrency control
20 0.64462996 1463 high scalability-2013-05-23-Paper: Calvin: Fast Distributed Transactions for Partitioned Database Systems
topicId topicWeight
[(1, 0.075), (2, 0.31), (10, 0.064), (40, 0.016), (51, 0.278), (61, 0.103), (79, 0.018)]
simIndex simValue blogId blogTitle
same-blog 1 0.91616136 510 high scalability-2009-02-09-Paper: Consensus Protocols: Two-Phase Commit
2 0.89660227 98 high scalability-2007-09-18-Sync data on all servers
Introduction: I have a few apache servers (around 11 atm) serving a small amount of data (around 44 gigs right now). For some time I have been using rsync to keep all the content equal on all servers, but the amount of data has been growing, and rsync takes too much time to "compare" all data from source to destination, and creates a lot of I/O. I have been taking a look at MogileFS; it seems a good and reliable option, but as the fuse module is not finished, we would have to rewrite all our apps, and that's not an option atm. Any ideas? I just want a "real time, non resource-hungry" alternative to rsync. If I get more features on the way, then they are welcome :) Why do I prefer to use a Distributed File System instead of NAS + NFS? - I need 2 NAS if I don't want a point of failure, and NAS hardware is expensive. - Non-shared hardware; every server has its own local disks. - As files are replicated, I can save a lot of money; RAID is not a MUST. Thn
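One hedged sketch of an answer, assuming the goal is mainly to cut the compare cost: keep a manifest of file sizes and mtimes between runs and hand rsync an explicit list of changed files instead of letting it walk all 44 gigs. The paths and the web02 host below are placeholders, and deleted files are not handled.

import json, os, subprocess

MANIFEST = "/var/tmp/sync-manifest.json"   # placeholder path
SRC = "/var/www/content"                   # placeholder content root

def scan(root):
    # Record (size, mtime) per file; cheap compared to checksumming 44 GB.
    state = {}
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            state[path] = [st.st_size, st.st_mtime]
    return state

old = {}
if os.path.exists(MANIFEST):
    with open(MANIFEST) as f:
        old = json.load(f)

new = scan(SRC)
changed = [p for p, meta in new.items() if old.get(p) != meta]

if changed:
    # --files-from makes rsync sync only the listed paths instead of
    # walking the whole tree; "web02" is a placeholder destination host.
    subprocess.run(["rsync", "-a", "--files-from=-", "/", "web02:/"],
                   input="\n".join(changed).encode(), check=True)

with open(MANIFEST, "w") as f:
    json.dump(new, f)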
3 0.88650513 1644 high scalability-2014-05-07-Update on Disqus: It's Still About Realtime, But Go Demolishes Python
Introduction: Our last article on Disqus, How Disqus Went Realtime With 165K Messages Per Second And Less Than .2 Seconds Latency, was a little out of date, but the folks at Disqus have been busy implementing, not talking, so we don't know a lot about what they are doing now. We do have a short update in C1MM and NGINX by John Watson and an article Trying out this Go thing. So Disqus has grown a bit: 1.3 billion unique visitors; 10 billion page views; 500 million users engaged in discussions; 3 million communities; 25 million comments. They are still all about realtime, but Go replaced Python in their Realtime system: The original Realtime backend was written in pretty lightweight Python + gevent. The realtime service is a hybrid of CPU intensive tasks + lots of network IO. Gevent was handling the network IO without an issue, but at higher contention, the CPU was choking everything. Switching over to Go removed that contention, which was the primary issue that was being se
4 0.8734532 481 high scalability-2009-01-02-Strategy: Understanding Your Data Leads to the Best Scalability Solutions
Introduction: In the article Building Super-Scalable Web Systems with REST, Udi Dahan tells an interesting story of how they made a weather reporting system scale for over 10 million users. Having so many users hit their weather database directly didn't scale. Caching in a straightforward way wouldn't work because weather is obviously local. Caching all local reports would bring the entire database into memory, which would work for some companies, but wasn't cost efficient for them. So in typical REST fashion they turned locations into URIs. For example: http://weather.myclient.com/UK/London. This allows the weather information to be cached by intermediaries instead of hitting their servers. Hopefully for each location their servers will be hit a few times and then the caches will be hit until expiry. In order to send users directly to the correct location an IP location check is performed on login and the result stored in a cookie. The lookup is done once and from then on out a GET is performed directly on the r
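A sketch of that pattern, assuming a Flask-style handler purely for illustration (the post doesn't name a framework): each location gets its own URI and a Cache-Control header so intermediaries can absorb repeat hits.

from flask import Flask, make_response

app = Flask(__name__)

def lookup_weather(country, city):
    # Stand-in for the real database hit the caches are protecting.
    return {"country": country, "city": city, "forecast": "rain"}

@app.route("/<country>/<city>")
def weather(country, city):
    resp = make_response(lookup_weather(country, city))
    # public + max-age lets proxies and CDNs answer repeat requests for
    # a URI like /UK/London without touching the origin servers.
    resp.headers["Cache-Control"] = "public, max-age=3600"
    return resp

if __name__ == "__main__":
    app.run()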
5 0.87017506 1168 high scalability-2012-01-04-How Facebook Handled the New Year's Eve Onslaught
Introduction: How does Facebook handle the massive New Year's Eve traffic spike? Thanks to Mike Swift, in Facebook gets ready for New Year's Eve, we get a little insight into their method for the madness, nothing really detailed, but still interesting. Problem Setup: Facebook expects that one billion+ photos will be shared on New Year's Eve. Facebook's 800 million users are scattered around the world. Three quarters live outside the US. Each user is linked to an average of 130 friends. Photos and posts must appear in less than a second. Opening a homepage requires executing requests on 100 different servers, and those requests have to be ranked, sorted, and privacy-checked, and then rendered. Different events put different stresses on different parts of Facebook. Photo and Video Uploads - Holidays require hundreds of terabytes of capacity. News Feed - News events like big sports events and the death of Steve Jobs drive user status updates. Coping Strategies: Try
6 0.84675127 1134 high scalability-2011-10-28-Stuff The Internet Says On Scalability For October 28, 2011
8 0.83412087 741 high scalability-2009-11-16-Building Scalable Systems Using Data as a Composite Material
9 0.82302243 818 high scalability-2010-04-30-Behind the scenes of an online marketplace
10 0.81871516 138 high scalability-2007-10-30-Feedblendr Architecture - Using EC2 to Scale
11 0.79914963 1629 high scalability-2014-04-10-Paper: Scalable Atomic Visibility with RAMP Transactions - Scale Linearly to 100 Servers
12 0.77210677 1638 high scalability-2014-04-28-How Disqus Went Realtime with 165K Messages Per Second and Less than .2 Seconds Latency
13 0.76371437 626 high scalability-2009-06-10-Paper: Graph Databases and the Future of Large-Scale Knowledge Management
14 0.76162881 109 high scalability-2007-10-03-Save on a Load Balancer By Using Client Side Load Balancing
15 0.76039779 332 high scalability-2008-05-28-Job queue and search engine
16 0.75934529 772 high scalability-2010-02-05-High Availability Principle : Concurrency Control
17 0.75902784 1199 high scalability-2012-02-27-Zen and the Art of Scaling - A Koan and Epigram Approach
19 0.75432152 953 high scalability-2010-12-03-GPU vs CPU Smackdown : The Rise of Throughput-Oriented Architectures
20 0.75368643 1126 high scalability-2011-09-27-Use Instance Caches to Save Money: Latency == $$$