high_scalability high_scalability-2012 high_scalability-2012-1243 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: If you are a normal human being and find the Paxos protocol confusing, then this paper, Paxos Made Moderately Complex, is a great find. Robbert van Renesse from Cornell University has written a clear, well-written paper with excellent explanations. The Abstract: For anybody who has ever tried to implement it, Paxos is by no means a simple protocol, even though it is based on relatively simple invariants. This paper provides imperative pseudo-code for the full Paxos (or Multi-Paxos) protocol without shying away from discussing various implementation details. The initial description avoids optimizations that complicate comprehension. Next we discuss liveness, and list various optimizations that make the protocol practical. Related Articles: Paxos on HighScalability.com
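The abstract's claim that Paxos rests on "relatively simple invariants" can be made concrete with a toy single-decree acceptor. This is my own minimal sketch, not the paper's pseudo-code; the `Acceptor` class and message tuples are illustrative assumptions:

```python
# Minimal single-decree Paxos acceptor (illustrative sketch, not the paper's
# pseudo-code). The core invariant: an acceptor never accepts a proposal
# whose ballot is below a promise it has already made.

class Acceptor:
    def __init__(self):
        self.promised = -1      # highest ballot number promised so far
        self.accepted = None    # (ballot, value) last accepted, or None

    def prepare(self, ballot):
        """Phase 1: promise to ignore ballots below `ballot`."""
        if ballot > self.promised:
            self.promised = ballot
            return ("promise", ballot, self.accepted)
        return ("nack", self.promised, None)

    def accept(self, ballot, value):
        """Phase 2: accept unless a higher ballot was promised meanwhile."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return ("accepted", ballot)
        return ("nack", self.promised)

a = Acceptor()
assert a.prepare(1)[0] == "promise"
assert a.accept(1, "x")[0] == "accepted"
assert a.prepare(0)[0] == "nack"    # a stale proposer is rejected
```

Multi-Paxos, as the paper explains, runs this machinery per slot in a sequence and adds leaders, replicas, and the optimizations the abstract alludes to.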
sentIndex sentText sentNum sentScore
1 If you are a normal human being and find the Paxos protocol confusing, then this paper, Paxos Made Moderately Complex , is a great find. [sent-1, score-0.579]
2 Robbert van Renesse from Cornell University has written a clear and well written paper with excellent explanations. [sent-2, score-0.81]
3 The Abstract: For anybody who has ever tried to implement it, Paxos is by no means a simple protocol, even though it is based on relatively simple invariants. [sent-3, score-0.82]
4 This paper provides imperative pseudo-code for the full Paxos (or Multi-Paxos) protocol without shying away from discussing various implementation details. [sent-4, score-0.847]
5 Next we discuss liveness, and list various optimizations that make the protocol practical. [sent-6, score-0.688]
wordName wordTfidf (topN-words)
[('paxos', 0.433), ('protocol', 0.292), ('optimizations', 0.259), ('renesse', 0.233), ('robbert', 0.233), ('paper', 0.217), ('liveness', 0.209), ('complicate', 0.209), ('moderately', 0.202), ('cornell', 0.181), ('imperative', 0.171), ('anybody', 0.166), ('discussing', 0.161), ('van', 0.159), ('various', 0.151), ('avoids', 0.14), ('university', 0.133), ('written', 0.125), ('abstract', 0.114), ('discuss', 0.108), ('initial', 0.107), ('tried', 0.107), ('normal', 0.103), ('simple', 0.093), ('human', 0.092), ('relatively', 0.085), ('description', 0.083), ('clear', 0.076), ('implement', 0.076), ('implementation', 0.075), ('ever', 0.07), ('though', 0.069), ('list', 0.068), ('excellent', 0.068), ('away', 0.066), ('provides', 0.058), ('complex', 0.056), ('made', 0.055), ('full', 0.054), ('means', 0.053), ('next', 0.053), ('find', 0.05), ('great', 0.042), ('well', 0.04), ('based', 0.039), ('without', 0.038), ('even', 0.038), ('make', 0.029)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 1243 high scalability-2012-05-10-Paper: Paxos Made Moderately Complex
2 0.22343262 357 high scalability-2008-07-26-Google's Paxos Made Live – An Engineering Perspective
Introduction: This is an unusually well-written and useful paper. It talks in detail about experiences implementing a complex project, something we don't see very often. They shockingly even admit that creating a working implementation of Paxos was more difficult than just translating the pseudo code. Imagine that, programmers aren't merely typists! I particularly like the explanation of the Paxos algorithm and why anyone would care about it, working with disk corruption, using leases to support simultaneous reads, using epoch numbers to indicate a new master election, using snapshots to prevent unbounded logs, using MultiOp to implement database transactions, how they tested the system, and their openness with the various problems they had. A lot to learn here. From the paper: We describe our experience building a fault-tolerant database using the Paxos consensus algorithm. Despite the existing literature in the field, building such a database proved to be non-trivial. We describe selected alg
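The epoch-number idea mentioned above can be sketched roughly like this. A hypothetical illustration, not Google's implementation; `Replica`, `new_master_elected`, and the return strings are made-up names:

```python
# Hedged sketch of epoch numbers for master election: every request is
# stamped with the epoch in which the sending master was elected, so
# replicas can reject requests from a master that has since been deposed.

class Replica:
    def __init__(self):
        self.current_epoch = 0

    def new_master_elected(self):
        """A new master election bumps the epoch."""
        self.current_epoch += 1
        return self.current_epoch

    def handle(self, epoch, op):
        """Apply an operation only if it comes from the current epoch."""
        if epoch < self.current_epoch:
            return "rejected: stale master"
        return f"applied {op} in epoch {epoch}"

r = Replica()
old = r.new_master_elected()    # first master, epoch 1
new = r.new_master_elected()    # a new master takes over, epoch 2
assert r.handle(old, "write") == "rejected: stale master"
assert r.handle(new, "write").startswith("applied")
```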
3 0.22179301 1374 high scalability-2012-12-18-Georeplication: When Bad Things Happen to Good Systems
Introduction: Georeplication is one of the standard techniques for dealing with bad things--failure and latency--when they happen to good systems. The problem is always: how do you do that? Murat Demirbas, Associate Professor at SUNY Buffalo, has a couple of really good posts that can help: MDCC: Multi-Data Center Consistency and Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary. In MDCC: Multi-Data Center Consistency Murat discusses a paper that says synchronous wide-area replication can be feasible. There's a quick and clear explanation of Paxos and various optimizations that is worth the price of admission. We find that strong consistency doesn't have to be lost across a WAN: The good thing about using Paxos over the WAN is you /almost/ get the full CAP (all three properties: consistency, availability, and partition-freedom). As we discussed earlier (Paxos taught), Paxos is CP, that is, in the presence of a partition, Paxos keeps consistency over availability. But, P
4 0.19635284 529 high scalability-2009-03-10-Paper: Consensus Protocols: Paxos
Introduction: Update: Barbara Liskov’s Turing Award, and Byzantine Fault Tolerance . Henry Robinson has created an excellent series of articles on consensus protocols. We already covered his 2 Phase Commit article and he also has a 3 Phase Commit article showing how to handle 2PC under single node failures. But that is not enough! 3PC works well under node failures, but fails for network failures. So another consensus mechanism is needed that handles both network and node failures. And that's Paxos . Paxos correctly handles both types of failures, but it does this by becoming inaccessible if too many components fail. This is the "liveness" property of protocols. Paxos waits until the faults are fixed. Read queries can be handled, but updates will be blocked until the protocol thinks it can make forward progress. The liveness of Paxos is primarily dependent on network stability. In a distributed heterogeneous environment you are at risk of losing the ability to make updates. Users hate t
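The liveness point above--Paxos blocks updates rather than sacrifice consistency--comes down to majority quorums. An illustrative one-liner, not code from the article:

```python
# Why Paxos "becomes inaccessible if too many components fail": progress
# requires a majority quorum. With only 2 of 5 acceptors reachable, updates
# block; reads of already-chosen values can still be served.

def has_quorum(reachable, cluster_size):
    """True if the reachable nodes form a strict majority."""
    return reachable > cluster_size // 2

assert has_quorum(3, 5)        # majority up: updates can proceed
assert not has_quorum(2, 5)    # partitioned minority: updates block
```

This is exactly the CP behavior described here: when the quorum check fails, Paxos waits for the fault to heal instead of diverging.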
Introduction: When building a system on top of a set of wildly uncooperative and unruly computers you have knowledge problems: knowing when other nodes are dead; knowing when nodes become alive; getting information about other nodes so you can make local decisions, like knowing which node should handle a request based on a scheme for assigning nodes to a certain range of users; learning about new configuration data; agreeing on data values; and so on. How do you solve these problems? A common centralized approach is to use a database and have all nodes query it for information. Obvious availability and performance issues for large distributed clusters. Another approach is to use Paxos, a protocol for solving consensus in a network to maintain strict consistency requirements for small groups of unreliable processes. Not practical when a larger number of nodes is involved. So what's the super cool decentralized way to bring order to large clusters? Gossip protocols, which maintain relaxed consi
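A push-style gossip round can be sketched in a few lines. A hypothetical toy, not any production protocol; the per-node heartbeat counters and the `merge` rule are my assumptions:

```python
# Toy push gossip: each node periodically sends its view of per-node
# heartbeat counters to a random peer. Views converge by taking the
# element-wise maximum -- the relaxed consistency the post describes.

import random

def merge(view_a, view_b):
    """Combine two views, keeping the freshest heartbeat for every node."""
    out = dict(view_a)
    for node, beat in view_b.items():
        out[node] = max(out.get(node, 0), beat)
    return out

def gossip_round(views):
    """Every node pushes its view to one randomly chosen peer."""
    nodes = list(views)
    for sender in nodes:
        peer = random.choice(nodes)
        views[peer] = merge(views[peer], views[sender])

views = {"a": {"a": 5}, "b": {"b": 3}, "c": {"c": 7}}
for _ in range(20):
    gossip_round(views)
# with high probability, every node's view now contains every heartbeat
```

Unlike Paxos, nothing here needs a quorum or a leader, which is why gossip scales to clusters where running consensus among all nodes is impractical.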
6 0.14172739 234 high scalability-2008-01-30-The AOL XMPP scalability challenge
7 0.13309243 710 high scalability-2009-09-20-PaxosLease: Diskless Paxos for Leases
8 0.12466724 1459 high scalability-2013-05-16-Paper: Warp: Multi-Key Transactions for Key-Value Stores
9 0.10795058 780 high scalability-2010-02-19-Twitter’s Plan to Analyze 100 Billion Tweets
10 0.10098916 1022 high scalability-2011-04-13-Paper: NoSQL Databases - NoSQL Introduction and Overview
11 0.10054207 1451 high scalability-2013-05-03-Stuff The Internet Says On Scalability For May 3, 2013
12 0.10032265 1345 high scalability-2012-10-22-Spanner - It's About Programmers Building Apps Using SQL Semantics at NoSQL Scale
13 0.094604395 1002 high scalability-2011-03-09-Productivity vs. Control tradeoffs in PaaS
14 0.076556772 1529 high scalability-2013-10-08-F1 and Spanner Holistically Compared
15 0.074505419 1596 high scalability-2014-02-14-Stuff The Internet Says On Scalability For February 14th, 2014
16 0.073769026 507 high scalability-2009-02-03-Paper: Optimistic Replication
17 0.072759867 670 high scalability-2009-08-05-Anti-RDBMS: A list of distributed key-value stores
18 0.067497902 960 high scalability-2010-12-20-Netflix: Use Less Chatty Protocols in the Cloud - Plus 26 Fixes
19 0.065582082 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters
20 0.062521309 914 high scalability-2010-10-04-Paper: An Analysis of Linux Scalability to Many Cores
topicId topicWeight
[(0, 0.068), (1, 0.044), (2, -0.0), (3, 0.054), (4, -0.004), (5, 0.051), (6, -0.001), (7, 0.004), (8, -0.043), (9, 0.009), (10, -0.007), (11, 0.014), (12, -0.053), (13, -0.033), (14, 0.025), (15, 0.01), (16, 0.058), (17, -0.017), (18, 0.009), (19, -0.06), (20, 0.071), (21, 0.024), (22, -0.039), (23, 0.052), (24, -0.054), (25, 0.009), (26, 0.025), (27, -0.013), (28, -0.006), (29, -0.018), (30, -0.021), (31, -0.003), (32, -0.108), (33, 0.015), (34, -0.024), (35, -0.042), (36, 0.029), (37, -0.01), (38, 0.014), (39, 0.03), (40, -0.034), (41, 0.054), (42, 0.003), (43, -0.01), (44, 0.014), (45, 0.038), (46, -0.03), (47, 0.026), (48, -0.02), (49, -0.075)]
simIndex simValue blogId blogTitle
same-blog 1 0.97999525 1243 high scalability-2012-05-10-Paper: Paxos Made Moderately Complex
2 0.79107952 1273 high scalability-2012-06-27-Paper: Logic and Lattices for Distributed Programming
Introduction: Neil Conway from Berkeley CS is giving an advanced level talk at a meetup today in San Francisco on a new paper: Logic and Lattices for Distributed Programming - extending set logic to support CRDT-style lattices. The description of the meetup is probably the clearest introduction to the paper: Developers are increasingly choosing datastores that sacrifice strong consistency guarantees in exchange for improved performance and availability. Unfortunately, writing reliable distributed programs without the benefit of strong consistency can be very challenging. In this talk, I'll discuss work from our group at UC Berkeley that aims to make it easier to write distributed programs without relying on strong consistency. Bloom is a declarative programming language for distributed computing, while CALM is an analysis technique that identifies programs that are guaranteed to be eventually consistent. I'll then discuss our recent work on extending CALM to support a broader range of
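The lattice idea behind CRDT-style convergence can be shown with a grow-only set whose merge is a lattice join. My illustration, not Bloom or CALM code; `GrowOnlySet` is an invented name:

```python
# Sketch of a CRDT-style lattice: state only grows, and merge is a join
# (least upper bound) that is commutative, associative, and idempotent,
# so replicas can merge updates in any order and still agree.

class GrowOnlySet:
    def __init__(self, items=()):
        self.items = frozenset(items)

    def add(self, x):
        return GrowOnlySet(self.items | {x})

    def merge(self, other):
        """Lattice join for sets is just union."""
        return GrowOnlySet(self.items | other.items)

r1 = GrowOnlySet().add("a").add("b")
r2 = GrowOnlySet().add("b").add("c")
# merge order doesn't matter: both replicas converge to {a, b, c}
assert r1.merge(r2).items == r2.merge(r1).items == {"a", "b", "c"}
```

Monotone growth under a join like this is what lets CALM-style analysis guarantee eventual consistency without coordination.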
3 0.75374591 1374 high scalability-2012-12-18-Georeplication: When Bad Things Happen to Good Systems
4 0.685188 1459 high scalability-2013-05-16-Paper: Warp: Multi-Key Transactions for Key-Value Stores
Introduction: Looks like an interesting take on "a completely asynchronous, low-latency transaction management protocol, in line with the fully distributed NoSQL architecture." Warp: Multi-Key Transactions for Key-Value Stores  overview: Implementing ACID transactions has been a longstanding challenge for NoSQL systems. Because these systems are based on a sharded architecture, transactions necessarily require coordination across multiple servers. Past work in this space has relied either on heavyweight protocols such as Paxos or clock synchronization for this coordination. This paper presents a novel protocol for coordinating distributed transactions with ACID semantics on top of a sharded data store. Called linear transactions, this protocol achieves scalability by distributing the coordination task to only those servers that hold relevant data for each transaction. It achieves high performance by serializing only those transactions whose concurrent execution could potentially yield a vio
5 0.68449122 507 high scalability-2009-02-03-Paper: Optimistic Replication
Introduction: To scale in the large you have to partition. Data has to be spread around, replicated, and kept consistent (keeping replicas sufficiently similar to one another despite operations being submitted independently at different sites). The result is a highly available, well-performing, and scalable system. Partitioning is required, but it's a pain to do efficiently and correctly. Until quantum teleportation becomes a reality, how data is kept consistent across a bewildering number of failure scenarios is a key design decision. This excellent paper by Yasushi Saito and Marc Shapiro takes us on a wild ride (OK, maybe not so wild) of different approaches to achieving consistency. What's cool about this paper is they go over some real systems that we are familiar with and cover how they work: DNS (single-master, state-transfer), Usenet (multi-master), PDAs (multi-master, state-transfer, manual or application-specific conflict resolution), Bayou (multi-master, operation-transfer, epidemic
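One concrete mechanism from the optimistic-replication literature is the version vector, used by multi-master systems to tell whether two replica states are ordered or genuinely concurrent. A hedged sketch; the `compare` helper and its labels are my own, not from the paper:

```python
# Version-vector comparison: each replica counts its own updates in a
# per-replica map. Comparing two vectors tells you whether one state
# subsumes the other or whether concurrent updates conflict.

def compare(va, vb):
    """Return 'descends', 'precedes', 'equal', or 'conflict'."""
    keys = set(va) | set(vb)
    a_ahead = any(va.get(k, 0) > vb.get(k, 0) for k in keys)
    b_ahead = any(vb.get(k, 0) > va.get(k, 0) for k in keys)
    if a_ahead and b_ahead:
        return "conflict"          # concurrent updates: needs resolution
    if a_ahead:
        return "descends"          # va strictly newer, vb can be discarded
    if b_ahead:
        return "precedes"          # vb strictly newer
    return "equal"

assert compare({"A": 2, "B": 1}, {"A": 1, "B": 1}) == "descends"
assert compare({"A": 2}, {"B": 1}) == "conflict"
```

What a system does on "conflict" is exactly the design axis the paper surveys: manual resolution (PDAs), application-specific merge (Bayou), or simply avoiding conflicts via a single master (DNS).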
6 0.64706498 963 high scalability-2010-12-23-Paper: CRDTs: Consistency without concurrency control
7 0.64411741 529 high scalability-2009-03-10-Paper: Consensus Protocols: Paxos
8 0.63947117 357 high scalability-2008-07-26-Google's Paxos Made Live – An Engineering Perspective
9 0.63376838 1450 high scalability-2013-05-01-Myth: Eric Brewer on Why Banks are BASE Not ACID - Availability Is Revenue
10 0.63157392 844 high scalability-2010-06-18-Paper: The Declarative Imperative: Experiences and Conjectures in Distributed Logic
11 0.62956548 890 high scalability-2010-09-01-Paper: The Case for Determinism in Database Systems
12 0.62645036 1153 high scalability-2011-12-08-Update on Scalable Causal Consistency For Wide-Area Storage With COPS
13 0.60363084 1146 high scalability-2011-11-23-Paper: Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS
14 0.59629029 510 high scalability-2009-02-09-Paper: Consensus Protocols: Two-Phase Commit
15 0.59197944 1305 high scalability-2012-08-16-Paper: A Provably Correct Scalable Concurrent Skip List
16 0.54854882 705 high scalability-2009-09-16-Paper: A practical scalable distributed B-tree
17 0.54756075 1611 high scalability-2014-03-12-Paper: Scalable Eventually Consistent Counters over Unreliable Networks
18 0.54422212 1629 high scalability-2014-04-10-Paper: Scalable Atomic Visibility with RAMP Transactions - Scale Linearly to 100 Servers
topicId topicWeight
[(1, 0.021), (2, 0.168), (8, 0.383), (10, 0.09), (61, 0.149), (79, 0.05)]
simIndex simValue blogId blogTitle
same-blog 1 0.83996874 1243 high scalability-2012-05-10-Paper: Paxos Made Moderately Complex
2 0.75197488 272 high scalability-2008-03-08-Product: FAI - Fully Automatic Installation
Introduction: From their website: FAI is an automated installation tool to install or deploy Debian GNU/Linux and other distributions on a bunch of different hosts or a Cluster. It's more flexible than other tools like kickstart for Red Hat, autoyast and alice for SuSE or Jumpstart for SUN Solaris. FAI can also be used for configuration management of a running system. You can take one or more virgin PCs, turn on the power and after a few minutes Linux is installed, configured and running on all your machines, without any interaction necessary. FAI is a scalable method for installing and updating all your computers unattended with little effort involved. It's a centralized management system for your Linux deployment. FAI's target group are system administrators who have to install Linux onto one or even hundreds of computers. It's not only a tool for doing a Cluster installation but a general purpose installation tool. It can be used for installing a Beowulf cluster, a rendering farm,
3 0.72265154 186 high scalability-2007-12-13-un-article: the setup behind microsoft.com
Introduction: On the blogs.technet.com article on microsoft.com's infrastructure: The article reads like a blatant ad for its own products, and is light on the technical side. The juicy bits are here, so you know what the fuss is about: Citrix NetScaler (= load balancer with various optimizations) W2K8 + IIS7 and antivirus software on the webservers 650GB/day IIS log files 8-9GBit/s (unknown if CDNs are included) Simple network filtering: stateless access lists blocking unwanted ports on the routers/switches (hence the debated "no firewalls" claim). Note that this information may not reflect present reality very well; the spokesman appears to be reciting others' words.
4 0.65462393 155 high scalability-2007-11-15-Video: Dryad: A general-purpose distributed execution platform
Introduction: Dryad is Microsoft's answer to Google's map-reduce . What's the question: How do you process really large amounts of data? My initial impression of Dryad is it's like a giant Unix command line filter on steroids. There are lots of inputs, outputs, tees, queues, and merge sorts all connected together by a master exec program. What else does Dryad have to offer the scalable infrastructure wars? Dryad models programs as the execution of a directed acyclic graph. Each vertex is a program and edges are typed communication channels (files, TCP pipes, and shared memory channels within a process). Map-reduce uses a different model. It's more like a large distributed sort where the programmer defines functions for mapping, partitioning, and reducing. Each approach seems to borrow from the spirit of its creating organization. The graph approach seems a bit too complicated and map-reduce seems a bit too simple. How ironic, in the Alanis Morissette sense. Dryad is a middleware layer that ex
5 0.64419007 1108 high scalability-2011-08-31-Pud is the Anti-Stack - Windows, CFML, Dropbox, Xeround, JungleDisk, ELB
Introduction: Pud of f*ckedcompany.com (FC) fame, a favorite site of the dot bomb era, and a site I absolutely loved until my company became featured, has given us a look at his backend: Why Must You Laugh At My Back End. For those who don't remember FC's history, TechCrunch published a fitting eulogy: [FC] first went live in 2000, chronicling failing and troubled companies in its unique and abrasive style after the dot com bust. Within a year it had a massive audience and was getting serious mainstream press attention. As the startup economy became better in 2004, much of the attention the site received went away. But a large and loyal audience remains at the site, coming back day after day for its unique slant on the news. At its peak, FC had 4 million unique monthly visitors. Delightfully, FC was not a real-names kind of site. Hard witty cynicism ruled and not a single cat picture was in sight. It was a blast of fun when all around was the enclosing dark. So when I saw Pud's post
6 0.64008224 1229 high scalability-2012-04-17-YouTube Strategy: Adding Jitter isn't a Bug
7 0.63164771 1183 high scalability-2012-01-30-37signals Still Happily Scaling on Moore RAM and SSDs
8 0.62537831 354 high scalability-2008-07-20-The clouds are coming
9 0.5563032 858 high scalability-2010-07-13-Sponsored Post: VoltDB and Digg are Hiring
10 0.55630177 833 high scalability-2010-06-01-Sponsored Post: Get Your High Scalability Fix at Digg
11 0.53447121 1031 high scalability-2011-04-28-PaaS on OpenStack - Run Applications on Any Cloud, Any Time Using Any Thing
12 0.53443176 607 high scalability-2009-05-26-Database Optimize patterns
13 0.51819104 1434 high scalability-2013-04-03-5 Steps to Benchmarking Managed NoSQL - DynamoDB vs Cassandra
14 0.51480967 269 high scalability-2008-03-08-Audiogalaxy.com Architecture
15 0.51057243 1142 high scalability-2011-11-14-Using Gossip Protocols for Failure Detection, Monitoring, Messaging and Other Good Things
16 0.50666088 1337 high scalability-2012-10-10-Antirez: You Need to Think in Terms of Organizing Your Data for Fetching
17 0.50528145 1207 high scalability-2012-03-12-Google: Taming the Long Latency Tail - When More Machines Equals Worse Results
18 0.50485218 1287 high scalability-2012-07-20-Stuff The Internet Says On Scalability For July 20, 2012
19 0.50008738 332 high scalability-2008-05-28-Job queue and search engine
20 0.49976179 689 high scalability-2009-08-28-Strategy: Solve Only 80 Percent of the Problem