high_scalability high_scalability-2009 high_scalability-2009-710 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: PaxosLease is a distributed algorithm for lease negotiation. It is based on Paxos, but does not require disk writes or clock synchrony. PaxosLease is used for master lease negotiation in the open-source Keyspace replicated key-value store.
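To make the idea concrete, here is a toy, single-process sketch of the message flow a PaxosLease-style proposer runs: a prepare round to a majority of acceptors, then a propose round that installs a lease which expires by local timer alone, with no disk writes. This is an illustration of the concept, not the actual algorithm or the Keyspace implementation; all names (`Acceptor`, `acquire_lease`) are made up for this sketch, and restarts, message loss, and competing proposers racing in real time are left out.

```python
import time

class Acceptor:
    def __init__(self):
        self.promised = 0          # highest ballot number promised
        self.accepted = None       # (ballot, owner, expiry) of current lease

    def prepare(self, ballot):
        if ballot <= self.promised:
            return None            # reject stale ballots
        self.promised = ballot
        # report any lease that has not yet timed out locally
        if self.accepted and self.accepted[2] > time.monotonic():
            return ("promised", self.accepted)
        return ("promised", None)

    def propose(self, ballot, owner, duration):
        if ballot < self.promised:
            return False
        # start a local countdown; the lease expires with no disk write
        self.accepted = (ballot, owner, time.monotonic() + duration)
        return True

def acquire_lease(owner, acceptors, ballot, duration=2.0):
    start = time.monotonic()
    replies = [a.prepare(ballot) for a in acceptors]
    granted = [r for r in replies if r is not None]
    if len(granted) <= len(acceptors) // 2:
        return False               # no majority promised
    if any(lease is not None for _, lease in granted):
        return False               # someone may still hold the lease; back off
    acks = sum(a.propose(ballot, owner, duration) for a in acceptors)
    # the lease is only safe for the time remaining on the proposer's own clock
    return acks > len(acceptors) // 2 and time.monotonic() - start < duration

acceptors = [Acceptor() for _ in range(3)]
print(acquire_lease("node-1", acceptors, ballot=1))   # True: lease acquired
print(acquire_lease("node-2", acceptors, ballot=2))   # False: lease still held
```

Note how the proposer subtracts its own elapsed time before trusting the lease: that is what lets the scheme avoid clock synchrony, since only locally measured durations matter, never absolute timestamps.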
sentIndex sentText sentNum sentScore
1 PaxosLease is a distributed algorithm for lease negotiation. [sent-1, score-1.067]
2 It is based on Paxos, but does not require disk writes or clock synchrony. [sent-2, score-0.791]
3 PaxosLease is used for master lease negotiation in the open-source Keyspace replicated key-value store. [sent-3, score-1.237]
wordName wordTfidf (topN-words)
[('lease', 0.79), ('paxos', 0.3), ('clock', 0.285), ('algorithm', 0.203), ('replicated', 0.202), ('master', 0.176), ('require', 0.153), ('writes', 0.147), ('disk', 0.125), ('store', 0.116), ('based', 0.081), ('distributed', 0.074), ('used', 0.069)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 710 high scalability-2009-09-20-PaxosLease: Diskless Paxos for Leases
Introduction: PaxosLease is a distributed algorithm for lease negotiation. It is based on Paxos, but does not require disk writes or clock synchrony. PaxosLease is used for master lease negotiation in the open-source Keyspace replicated key-value store.
2 0.1733937 357 high scalability-2008-07-26-Google's Paxos Made Live – An Engineering Perspective
Introduction: This is an unusually well-written and useful paper. It talks in detail about experiences implementing a complex project, something we don't see very often. They shockingly even admit that creating a working implementation of Paxos was more difficult than just translating the pseudo code. Imagine that, programmers aren't merely typists! I particularly like the explanation of the Paxos algorithm and why anyone would care about it, working with disk corruption, using leases to support simultaneous reads, using epoch numbers to indicate a new master election, using snapshots to prevent unbounded logs, using MultiOp to implement database transactions, how they tested the system, and their openness with the various problems they had. A lot to learn here. From the paper: We describe our experience building a fault-tolerant database using the Paxos consensus algorithm. Despite the existing literature in the field, building such a database proved to be non-trivial. We describe selected alg
3 0.13309243 1243 high scalability-2012-05-10-Paper: Paxos Made Moderately Complex
Introduction: If you are a normal human being and find the Paxos protocol confusing, then this paper, Paxos Made Moderately Complex, is a great find. Robbert van Renesse from Cornell University has written a clear and well-written paper with excellent explanations. The Abstract: For anybody who has ever tried to implement it, Paxos is by no means a simple protocol, even though it is based on relatively simple invariants. This paper provides imperative pseudo-code for the full Paxos (or Multi-Paxos) protocol without shying away from discussing various implementation details. The initial description avoids optimizations that complicate comprehension. Next we discuss liveness, and list various optimizations that make the protocol practical. Related Articles Paxos on HighScalability.com
4 0.11849879 1374 high scalability-2012-12-18-Georeplication: When Bad Things Happen to Good Systems
Introduction: Georeplication is one of the standard techniques for dealing with bad things--failure and latency--when they happen to good systems. The problem is always: how do you do that? Murat Demirbas, Associate Professor at SUNY Buffalo, has a couple of really good posts that can help: MDCC: Multi-Data Center Consistency and Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary. In MDCC: Multi-Data Center Consistency Murat discusses a paper that says synchronous wide-area replication can be feasible. There's a quick and clear explanation of Paxos and various optimizations that is worth the price of admission. We find that strong consistency doesn't have to be lost across a WAN: The good thing about using Paxos over the WAN is you /almost/ get the full CAP (all three properties: consistency, availability, and partition-freedom). As we discussed earlier (Paxos taught), Paxos is CP, that is, in the presence of a partition, Paxos keeps consistency over availability. But, P
5 0.086175628 1305 high scalability-2012-08-16-Paper: A Provably Correct Scalable Concurrent Skip List
Introduction: In MemSQL Architecture we learned one of the core strategies MemSQL uses to achieve their need for speed is lock-free skip lists. Skip lists are used to efficiently handle range queries. Making the skip-lists lock-free helps eliminate contention and make writes fast. If this all sounds a little pie-in-the-sky then here's a very good paper on the subject that might help make it clearer: A Provably Correct Scalable Concurrent Skip List. From the abstract: We propose a new concurrent skip list algorithm distinguished by a combination of simplicity and scalability. The algorithm employs optimistic synchronization, searching without acquiring locks, followed by short lock-based validation before adding or removing nodes. It also logically removes an item before physically unlinking it. Unlike some other concurrent skip list algorithms, this algorithm preserves the skiplist properties at all times, which facilitates reasoning about its correctness. Experimental evidence shows that
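The optimistic pattern the abstract describes--search without locks, then take short locks and validate before linking--can be sketched on a simpler structure. The sketch below applies it to a sorted linked list rather than a full skip list; it is an illustration of the pattern only, and the `marked` flag shows the "logically remove before physically unlinking" idea from the paper. All class and method names here are invented for the example.

```python
import threading

class Node:
    def __init__(self, key):
        self.key = key
        self.next = None
        self.marked = False        # logically removed before physical unlink
        self.lock = threading.Lock()

class OptimisticList:
    def __init__(self):
        # sentinel head and tail so every insert has two real neighbours
        self.head = Node(float("-inf"))
        self.head.next = Node(float("inf"))

    def _find(self, key):          # lock-free traversal
        pred = self.head
        curr = pred.next
        while curr.key < key:
            pred, curr = curr, curr.next
        return pred, curr

    def _validate(self, pred, curr):
        # neither neighbour removed, and they are still adjacent
        return not pred.marked and not curr.marked and pred.next is curr

    def insert(self, key):
        while True:
            pred, curr = self._find(key)
            with pred.lock, curr.lock:
                if self._validate(pred, curr):
                    if curr.key == key:
                        return False       # already present
                    node = Node(key)
                    node.next = curr       # fully initialize, then publish
                    pred.next = node
                    return True
            # validation failed: a concurrent update raced us; retry
```

The payoff is that the expensive part (traversal) runs with no locks at all; locks are held only for the two-node validate-and-link step, which is what keeps contention low under concurrent writes.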
6 0.086119659 529 high scalability-2009-03-10-Paper: Consensus Protocols: Paxos
7 0.084740289 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters
8 0.08282648 178 high scalability-2007-12-10-1 Master, N Slaves
9 0.080181353 1345 high scalability-2012-10-22-Spanner - It's About Programmers Building Apps Using SQL Semantics at NoSQL Scale
10 0.074257299 1459 high scalability-2013-05-16-Paper: Warp: Multi-Key Transactions for Key-Value Stores
11 0.073324963 1451 high scalability-2013-05-03-Stuff The Internet Says On Scalability For May 3, 2013
12 0.067402676 1529 high scalability-2013-10-08-F1 and Spanner Holistically Compared
13 0.067289934 1475 high scalability-2013-06-13-Busting 4 Modern Hardware Myths - Are Memory, HDDs, and SSDs Really Random Access?
14 0.066502549 575 high scalability-2009-04-21-Thread Pool Engine in MS CLR 4, and Work-Stealing scheduling algorithm
15 0.05963422 852 high scalability-2010-07-07-Strategy: Recompute Instead of Remember Big Data
16 0.058338888 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
17 0.057576425 670 high scalability-2009-08-05-Anti-RDBMS: A list of distributed key-value stores
18 0.056950286 163 high scalability-2007-11-21-n-phase commit for FS writes, reads stay local
19 0.056823421 672 high scalability-2009-08-06-An Unorthodox Approach to Database Design : The Coming of the Shard
20 0.056811489 188 high scalability-2007-12-19-How can I learn to scale my project?
topicId topicWeight
[(0, 0.039), (1, 0.044), (2, -0.012), (3, 0.01), (4, 0.005), (5, 0.064), (6, 0.02), (7, -0.015), (8, -0.031), (9, 0.002), (10, 0.007), (11, -0.014), (12, -0.037), (13, 0.009), (14, 0.034), (15, 0.031), (16, -0.027), (17, -0.002), (18, -0.01), (19, -0.026), (20, 0.029), (21, 0.023), (22, -0.034), (23, 0.028), (24, -0.056), (25, -0.009), (26, 0.039), (27, 0.013), (28, -0.011), (29, -0.022), (30, -0.007), (31, -0.041), (32, -0.053), (33, -0.011), (34, -0.007), (35, -0.057), (36, 0.021), (37, -0.029), (38, 0.017), (39, -0.027), (40, -0.062), (41, 0.006), (42, -0.021), (43, 0.027), (44, 0.053), (45, -0.005), (46, -0.01), (47, -0.003), (48, 0.024), (49, 0.003)]
simIndex simValue blogId blogTitle
same-blog 1 0.9623282 710 high scalability-2009-09-20-PaxosLease: Diskless Paxos for Leases
Introduction: PaxosLease is a distributed algorithm for lease negotiation. It is based on Paxos, but does not require disk writes or clock synchrony. PaxosLease is used for master lease negotiation in the open-source Keyspace replicated key-value store.
2 0.60223228 1611 high scalability-2014-03-12-Paper: Scalable Eventually Consistent Counters over Unreliable Networks
Introduction: Counting at scale in a distributed environment is surprisingly hard. And it's a subject we've covered before in various ways: Big Data Counting: How to count a billion distinct objects using only 1.5KB of Memory, How to update video views count effectively?, Numbers Everyone Should Know (sharded counters). Kellabyte (which is an excellent blog) in Scalable Eventually Consistent Counters talks about how the Cassandra counter implementation scores well on the scalability and high availability front, but in so doing has an "over and under counting problem in partitioned environments." Which is often fine. But if you want more accuracy there's the PN-counter, a CRDT (convergent replicated data type) that stores "all the changes made to a counter on each node rather than storing and modifying a single value so that you can merge all the values into the proper final value. Of course the trade-off here is additional storage and processing but there are ways to optimize this."
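A minimal PN-counter sketch shows the trade-off directly: each replica keeps per-node increment (P) and decrement (N) tallies instead of one value, and merging takes element-wise maxima, so replicas converge to the same count regardless of merge order. This is a generic textbook PN-counter, not the Cassandra or Kellabyte implementation, and the class name is made up for the example.

```python
class PNCounter:
    def __init__(self, node_id, n_nodes):
        self.node_id = node_id
        self.p = [0] * n_nodes     # increments performed, per node
        self.n = [0] * n_nodes     # decrements performed, per node

    def incr(self): self.p[self.node_id] += 1
    def decr(self): self.n[self.node_id] += 1

    def value(self):
        return sum(self.p) - sum(self.n)

    def merge(self, other):
        # element-wise max is commutative, associative, and idempotent,
        # which is what makes the counter a convergent replicated data type
        self.p = [max(a, b) for a, b in zip(self.p, other.p)]
        self.n = [max(a, b) for a, b in zip(self.n, other.n)]

a, b = PNCounter(0, 2), PNCounter(1, 2)
a.incr(); a.incr(); b.incr(); b.decr()
a.merge(b); b.merge(a)
print(a.value(), b.value())        # both replicas converge to 2
```

The extra storage is exactly the "additional storage and processing" the quote mentions: two integers per participating node instead of a single counter value.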
Introduction: Teams from Princeton and CMU are working together to solve one of the most difficult problems in the repertoire: scalable geo-distributed data stores. Major companies like Google and Facebook have been working on multiple datacenter database functionality for some time, but there's still a general lack of available systems that work for complex data scenarios. The ideas in this paper-- Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS --are different. It's not another eventually consistent system, or a traditional transaction oriented system, or a replication based system, or a system that punts on the issue. It's something new, a causally consistent system that achieves ALPS system properties. Move over CAP, NoSQL, etc, we have another acronym: ALPS - Available (operations always complete successfully), Low-latency (operations complete quickly (single digit milliseconds)), Partition-tolerant (operates with a partition), and Scalable (just a
4 0.58612889 705 high scalability-2009-09-16-Paper: A practical scalable distributed B-tree
Introduction: We've seen a lot of NoSQL action lately built around distributed hash tables. Btrees are getting jealous. Btrees, once the king of the database world, want their throne back. Paul Buchheit surfaced a paper, A practical scalable distributed B-tree by Marcos K. Aguilera and Wojciech Golab, that might help spark a revolution. From the Abstract: We propose a new algorithm for a practical, fault tolerant, and scalable B-tree distributed over a set of servers. Our algorithm supports practical features not present in prior work: transactions that allow atomic execution of multiple operations over multiple B-trees, online migration of B-tree nodes between servers, and dynamic addition and removal of servers. Moreover, our algorithm is conceptually simple: we use transactions to manipulate B-tree nodes so that clients need not use complicated concurrency and locking protocols used in prior work. To execute these transactions quickly, we rely on three techniques: (1) We use optimistic
5 0.57739305 1463 high scalability-2013-05-23-Paper: Calvin: Fast Distributed Transactions for Partitioned Database Systems
Introduction: Distributed transactions are costly because they use agreement protocols . Calvin says, surprisingly, that using a deterministic database allows you to avoid the use of agreement protocols. The approach is to use a deterministic transaction layer that does all the hard work before acquiring locks and the beginning of transaction execution. Overview: Many distributed storage systems achieve high data access throughput via partitioning and replication, each system with its own advantages and tradeoffs. In order to achieve high scalability, however, today’s systems generally reduce transactional support, disallowing single transactions from spanning multiple partitions. Calvin is a practical transaction scheduling and data replication layer that uses a deterministic ordering guarantee to significantly reduce the normally prohibitive contention costs associated with distributed transactions. Unlike previous deterministic database system prototypes, Calvin supports disk-based storage
6 0.57609797 529 high scalability-2009-03-10-Paper: Consensus Protocols: Paxos
7 0.56327951 1243 high scalability-2012-05-10-Paper: Paxos Made Moderately Complex
8 0.55744195 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters
10 0.5559724 1273 high scalability-2012-06-27-Paper: Logic and Lattices for Distributed Programming
11 0.55041391 357 high scalability-2008-07-26-Google's Paxos Made Live – An Engineering Perspective
12 0.54817456 963 high scalability-2010-12-23-Paper: CRDTs: Consistency without concurrency control
13 0.54701948 510 high scalability-2009-02-09-Paper: Consensus Protocols: Two-Phase Commit
14 0.54487783 507 high scalability-2009-02-03-Paper: Optimistic Replication
15 0.53382593 1153 high scalability-2011-12-08-Update on Scalable Causal Consistency For Wide-Area Storage With COPS
16 0.53037101 1305 high scalability-2012-08-16-Paper: A Provably Correct Scalable Concurrent Skip List
17 0.52626735 1374 high scalability-2012-12-18-Georeplication: When Bad Things Happen to Good Systems
18 0.52377546 979 high scalability-2011-01-27-Comet - An Example of the New Key-Code Databases
19 0.52153462 676 high scalability-2009-08-08-Yahoo!'s PNUTS Database: Too Hot, Too Cold or Just Right?
20 0.51942104 889 high scalability-2010-08-30-Pomegranate - Storing Billions and Billions of Tiny Little Files
topicId topicWeight
[(2, 0.118), (6, 0.368), (10, 0.044), (79, 0.089), (94, 0.125)]
simIndex simValue blogId blogTitle
same-blog 1 0.88344926 710 high scalability-2009-09-20-PaxosLease: Diskless Paxos for Leases
Introduction: PaxosLease is a distributed algorithm for lease negotiation. It is based on Paxos, but does not require disk writes or clock synchrony. PaxosLease is used for master lease negotiation in the open-source Keyspace replicated key-value store.
2 0.86405283 93 high scalability-2007-09-16-What software runs on this site?
Introduction: It's pretty slick! olla
3 0.68568361 104 high scalability-2007-10-01-SmugMug Found their Perfect Storage Array
Introduction: SmugMug's CEO & Chief Geek Don MacAskill smugly (hard to resist) gushes over finally finding, after a long and arduous quest, their "best bang-for-the-buck storage array." It's the Dell MD3000. His in-depth explanation of why he prefers the MD3000 should help anyone with their own painful storage deliberations. His key points are: The price is right; DAS via SAS, 15 spindles at 15K rpm each, 512MB of mirrored battery-backed write cache; You can disable read caching; You can disable read-ahead prefetching; The stripe sizes are configurable up to 512KB; The controller ignores host-based flush commands by default; They support an ‘Enhanced JBOD’ mode. His reasoning for the desirability of each option is astute and he even gives you the configuration options for carrying out the configuration. This is not your average CEO. Don also speculates that a three tier system using flash (system RAM + flash storage + RAID disks) is a possible future direction. Unfortunately, flash
4 0.67647767 832 high scalability-2010-05-31-Scalable federated security with Kerberos
Introduction: In my last post, I outlined considerations that need to be taken into account when choosing between a centralized and federated security model. So, how do we implement the chosen model? Based on a real-world case study, I will outline a Kerberos architecture that enables cutting-edge collaborative research through federated sharing of resources. Read more on BigDataMatters.com
5 0.55269009 529 high scalability-2009-03-10-Paper: Consensus Protocols: Paxos
Introduction: Update: Barbara Liskov’s Turing Award, and Byzantine Fault Tolerance. Henry Robinson has created an excellent series of articles on consensus protocols. We already covered his 2 Phase Commit article and he also has a 3 Phase Commit article showing how to handle 2PC under single node failures. But that is not enough! 3PC works well under node failures, but fails for network failures. So another consensus mechanism is needed that handles both network and node failures. And that's Paxos. Paxos correctly handles both types of failures, but it does this by becoming inaccessible if too many components fail. This is the "liveness" property of protocols. Paxos waits until the faults are fixed. Read queries can be handled, but updates will be blocked until the protocol thinks it can make forward progress. The liveness of Paxos is primarily dependent on network stability. In a distributed heterogeneous environment you are at risk of losing the ability to make updates. Users hate t
6 0.51911372 213 high scalability-2008-01-15-Does Sun Buying MySQL Change Your Scaling Strategy?
7 0.50119567 794 high scalability-2010-03-11-What would you like to ask Justin.tv?
8 0.46344331 1305 high scalability-2012-08-16-Paper: A Provably Correct Scalable Concurrent Skip List
9 0.43287355 1601 high scalability-2014-02-25-Peter Norvig's 9 Master Steps to Improving a Program
10 0.42116556 243 high scalability-2008-02-07-clusteradmin.blogspot.com - blog about building and administering clusters
11 0.42062762 1023 high scalability-2011-04-14-Strategy: Cache Application Start State to Reduce Spin-up Times
12 0.41793579 605 high scalability-2009-05-22-Distributed content system with bandwidth balancing
13 0.41686124 1423 high scalability-2013-03-13-Iron.io Moved From Ruby to Go: 28 Servers Cut and Colossal Clusterf**ks Prevented
14 0.41601121 115 high scalability-2007-10-07-Using ThreadLocal to pass context information around in web applications
15 0.41299272 1222 high scalability-2012-04-05-Big Data Counting: How to count a billion distinct objects using only 1.5KB of Memory
16 0.40491679 559 high scalability-2009-04-07-Six Lessons Learned Deploying a Large-scale Infrastructure in Amazon EC2
17 0.39695698 1084 high scalability-2011-07-22-Stuff The Internet Says On Scalability For July 22, 2011
18 0.39509767 1223 high scalability-2012-04-06-Stuff The Internet Says On Scalability For April 6, 2012
19 0.39382735 970 high scalability-2011-01-06-BankSimple Mini-Architecture - Using a Next Generation Toolchain
20 0.38626811 575 high scalability-2009-04-21-Thread Pool Engine in MS CLR 4, and Work-Stealing scheduling algorithm