Introduction: Marton Trencseni has collected a wonderful list of different papers on distributed systems. He's organized them into the following sections: The Google Papers, Distributed Filesystems, Non-relational Distributed Databases, The Lamport Papers, and Implementation Issues. Many old favorites on the list and some that are likely new to you. My new favorite is "Frangipani: A Scalable Distributed File System." How can you not love "Frangipani" as a word?
sentIndex sentText sentNum sentScore
1 Marton Trencseni has collected a wonderful list of different papers on distributed systems. [sent-1, score-1.077]
2 He's organized them into the following sections: The Google Papers, Distributed Filesystems, Non-relational Distributed Databases, The Lamport Papers, and Implementation Issues. [sent-2, score-0.257]
3 Many old favorites on the list and some that are likely new to you. [sent-3, score-0.602]
4 My new favorite is "Frangipani: A Scalable Distributed File System." [sent-4, score-0.191]
wordName wordTfidf (topN-words)
[('frangipani', 0.601), ('papers', 0.412), ('marton', 0.256), ('lamport', 0.235), ('filesystems', 0.222), ('favorites', 0.216), ('sections', 0.177), ('distributed', 0.166), ('organized', 0.163), ('list', 0.159), ('collected', 0.155), ('wonderful', 0.145), ('favorite', 0.137), ('word', 0.129), ('love', 0.096), ('following', 0.094), ('likely', 0.092), ('implementation', 0.088), ('old', 0.081), ('file', 0.067), ('databases', 0.064), ('google', 0.058), ('new', 0.054), ('scalable', 0.049), ('different', 0.04), ('many', 0.035)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 497 high scalability-2009-01-19-Papers: Readings in Distributed Systems
2 0.14296527 1049 high scalability-2011-05-31-Awesome List of Advanced Distributed Systems Papers
Introduction: As part of Dr. Indranil Gupta's CS 525 Spring 2011 Advanced Distributed Systems class, he has collected an incredible list of resources on distributed systems. His research group is also doing some interesting work. The various topics include: Before there Were Clouds, Cloud Computing, P2P Systems, Basic Distributed Computing Concepts, Sensor Networks, Overlays and DHTs, Cloud Programming, Cloud Scheduling, Key-Value Stores, Storage, Sensor Net Routing, Geo-Distribution, P2P Apps, In-network processing, Epidemics, Probabilistic Membership Protocols, Distributed Monitoring and Management, Publish-Subscribe/CDNs, Measurement Studies, Old Wine: Stale or Vintage?, In Byzantium, Cloud Pricing, Other Industrial Systems, Structure of Networks, Completing the Circle, Green Clouds, Distributed Debugging, Flash!, The Middle or the End?, Availability-Aware Systems, Design Methodologies, Handling Stress, Sources of unreliability in networks, Handling Stress, Selfish algorithms, Securi
3 0.080118753 880 high scalability-2010-08-13-Hot Scalability Links for Aug 13, 2010
Introduction: Ezra Zygmuntowicz in a heartwarming account of his 4 Years at Engine Yard, has concluded in his experience that: the true future of cloud computing for developers is to not think about servers at all. It is now time to focus on the Application and new levels of abstraction that allow folks to use the computing resources in easier and easier ways. Tweets of Gold: bryanlatten: Nothing like a million caching layers to screw up an already complicated deployment. Thankfully, there is beer. jkalucki: Twitter isn't down, you are just using the wrong access methods... andyedinborough: I don't mean to hate, but why would I give up performance and scalability for a dynamic language? Honestly, I don't get it. AsitSinha: It's amazing.... to see the absence of an understanding of how capability plays a role in scalability. scottgal: Devs are HORRIBLE DBAs. Used to do Scalability labs for MS UK and bad schemas were the single biggest issue (next to bad Indexes
4 0.074218743 570 high scalability-2009-04-15-Implementing large scale web analytics
Introduction: Does anyone know of any articles or papers that discuss the nuts and bolts of how web analytics is implemented at organizations with large volumes of web traffic and a critical business need to analyze that data - e.g. places like Amazon.com, eBay, and Google? Just as a fun project I'm planning to build my own web log analysis app that can effectively index and query large volumes of web log data (i.e. TB range). But first I'd like to learn more about how it's done in the organizations whose lifeblood depends on this stuff. Even just a high level architectural overview of their approaches would be nice to have.
5 0.065785989 369 high scalability-2008-08-18-Code deployment tools
Introduction: G'day, I'm building an application to manage WordPress PHP code on many servers. Our application will push down code updates to each server, as well as performing backups and testing. I'm considering different methods of pushing updated code onto the individual servers. I'm considering something like Capistrano (I've no experience in Ruby though). I've also considered using subversion and then remotely calling svn commands via SSH. Are there any other tools specifically for this purpose? The servers will have persistent data (the WordPress databases) so I don't want to re-image them every update. Plus, they will each have a different set of plugins / themes, so building many images would be too complex. If there are any papers on code deployment, or other recommended reading, please point the links my way. Likewise, if anyone has any suggestions, or would like more details, just let me know. Cheers - Callum.
6 0.064670719 658 high scalability-2009-07-17-Against all the odds
7 0.064010344 787 high scalability-2010-03-03-Hot Scalability Links for March 3, 2010
8 0.063726485 590 high scalability-2009-05-06-Art of Distributed
9 0.06118555 372 high scalability-2008-08-27-Updating distributed web applications
10 0.059873398 1468 high scalability-2013-05-31-Stuff The Internet Says On Scalability For May 31, 2013
11 0.058467295 92 high scalability-2007-09-15-The Role of Memory within Web 2.0 Architectures and Deployments
12 0.053879958 959 high scalability-2010-12-17-Stuff the Internet Says on Scalability For December 17th, 2010
13 0.052175235 889 high scalability-2010-08-30-Pomegranate - Storing Billions and Billions of Tiny Little Files
14 0.051678535 288 high scalability-2008-03-25-Paper: On Designing and Deploying Internet-Scale Services
15 0.049035296 1022 high scalability-2011-04-13-Paper: NoSQL Databases - NoSQL Introduction and Overview
16 0.047244787 676 high scalability-2009-08-08-Yahoo!'s PNUTS Database: Too Hot, Too Cold or Just Right?
17 0.046761155 1535 high scalability-2013-10-21-Google's Sanjay Ghemawat on What Made Google Google and Great Big Data Career Advice
18 0.046643361 1138 high scalability-2011-11-07-10 Core Architecture Pattern Variations for Achieving Scalability
19 0.046425093 564 high scalability-2009-04-10-counting # of views, calculating most-least viewed
20 0.043469302 1015 high scalability-2011-04-01-Stuff The Internet Says On Scalability For April 1, 2011
topicId topicWeight
[(0, 0.048), (1, 0.032), (2, 0.009), (3, 0.037), (4, 0.01), (5, 0.03), (6, -0.012), (7, 0.006), (8, 0.005), (9, 0.038), (10, -0.005), (11, -0.01), (12, -0.023), (13, -0.021), (14, 0.016), (15, -0.004), (16, -0.012), (17, -0.01), (18, 0.006), (19, -0.017), (20, 0.008), (21, 0.002), (22, -0.014), (23, 0.012), (24, -0.021), (25, -0.027), (26, 0.042), (27, 0.053), (28, -0.039), (29, -0.007), (30, 0.004), (31, -0.003), (32, -0.003), (33, 0.009), (34, -0.029), (35, -0.03), (36, 0.0), (37, -0.038), (38, 0.022), (39, 0.016), (40, -0.016), (41, 0.008), (42, 0.035), (43, 0.012), (44, -0.017), (45, 0.016), (46, -0.016), (47, 0.035), (48, 0.022), (49, 0.025)]
simIndex simValue blogId blogTitle
same-blog 1 0.96362841 497 high scalability-2009-01-19-Papers: Readings in Distributed Systems
2 0.72962171 223 high scalability-2008-01-25-Google: Introduction to Distributed System Design
Introduction: Update: Google added videos on Cluster Computing and MapReduce. There are five lectures: Introduction, MapReduce, Distributed File Systems, Clustering Algorithms, and Graph Algorithms. Advanced website design depends on deep distributed system design knowledge. Where do you get this knowledge? Try Google. They have a whole Code for Educators program with tutorials and lectures on AJAX programming, distributed systems, and web security. Looks pretty nice.
3 0.64949727 1049 high scalability-2011-05-31-Awesome List of Advanced Distributed Systems Papers
Introduction: As part of Dr. Indranil Gupta's CS 525 Spring 2011 Advanced Distributed Systems class, he has collected an incredible list of resources on distributed systems. His research group is also doing some interesting work. The various topics include: Before there Were Clouds, Cloud Computing, P2P Systems, Basic Distributed Computing Concepts, Sensor Networks, Overlays and DHTs, Cloud Programming, Cloud Scheduling, Key-Value Stores, Storage, Sensor Net Routing, Geo-Distribution, P2P Apps, In-network processing, Epidemics, Probabilistic Membership Protocols, Distributed Monitoring and Management, Publish-Subscribe/CDNs, Measurement Studies, Old Wine: Stale or Vintage?, In Byzantium, Cloud Pricing, Other Industrial Systems, Structure of Networks, Completing the Circle, Green Clouds, Distributed Debugging, Flash!, The Middle or the End?, Availability-Aware Systems, Design Methodologies, Handling Stress, Sources of unreliability in networks, Handling Stress, Selfish algorithms, Securi
4 0.64119583 1611 high scalability-2014-03-12-Paper: Scalable Eventually Consistent Counters over Unreliable Networks
Introduction: Counting at scale in a distributed environment is surprisingly hard. And it's a subject we've covered before in various ways: Big Data Counting: How to count a billion distinct objects using only 1.5KB of Memory, How to update video views count effectively?, Numbers Everyone Should Know (sharded counters). Kellabyte (which is an excellent blog) in Scalable Eventually Consistent Counters talks about how the Cassandra counter implementation scores well on the scalability and high availability front, but in so doing has "over and under counting problem in partitioned environments." Which is often fine. But if you want more accuracy there's a PN-counter, which is a CRDT (convergent replicated data type) where "all the changes made to a counter on each node rather than storing and modifying a single value so that you can merge all the values into the proper final value. Of course the trade-off here is additional storage and processing but there are ways to optimize this."
5 0.62505311 705 high scalability-2009-09-16-Paper: A practical scalable distributed B-tree
Introduction: We've seen a lot of NoSQL action lately built around distributed hash tables. Btrees are getting jealous. Btrees, once the king of the database world, want their throne back. Paul Buchheit surfaced a paper: A practical scalable distributed B-tree by Marcos K. Aguilera and Wojciech Golab, that might help spark a revolution. From the Abstract: We propose a new algorithm for a practical, fault tolerant, and scalable B-tree distributed over a set of servers. Our algorithm supports practical features not present in prior work: transactions that allow atomic execution of multiple operations over multiple B-trees, online migration of B-tree nodes between servers, and dynamic addition and removal of servers. Moreover, our algorithm is conceptually simple: we use transactions to manipulate B-tree nodes so that clients need not use complicated concurrency and locking protocols used in prior work. To execute these transactions quickly, we rely on three techniques: (1) We use optimistic
6 0.62200344 635 high scalability-2009-06-22-Improving performance and scalability with DDD
7 0.60183716 400 high scalability-2008-10-01-The Pattern Bible for Distributed Computing
8 0.60100818 592 high scalability-2009-05-06-DyradLINQ
10 0.59916687 510 high scalability-2009-02-09-Paper: Consensus Protocols: Two-Phase Commit
11 0.59604025 590 high scalability-2009-05-06-Art of Distributed
12 0.59120637 649 high scalability-2009-07-02-Product: Facebook's Cassandra - A Massive Distributed Store
13 0.58305609 734 high scalability-2009-10-30-Hot Scalabilty Links for October 30 2009
14 0.58180416 1498 high scalability-2013-08-07-RAFT - In Search of an Understandable Consensus Algorithm
15 0.57756674 844 high scalability-2010-06-18-Paper: The Declarative Imperative: Experiences and Conjectures in Distributed Logic
16 0.56798857 580 high scalability-2009-04-24-INFOSCALE 2009 in June in Hong Kong
17 0.55812758 50 high scalability-2007-07-31-BerkeleyDB & other distributed high performance key-value databases
18 0.55559909 710 high scalability-2009-09-20-PaxosLease: Diskless Paxos for Leases
19 0.54950762 433 high scalability-2008-10-29-CTL - Distributed Control Dispatching Framework
20 0.54301322 1022 high scalability-2011-04-13-Paper: NoSQL Databases - NoSQL Introduction and Overview
topicId topicWeight
[(10, 0.067), (38, 0.373), (40, 0.03), (79, 0.068), (85, 0.271)]
simIndex simValue blogId blogTitle
same-blog 1 0.67506301 497 high scalability-2009-01-19-Papers: Readings in Distributed Systems
2 0.49762169 59 high scalability-2007-08-04-Try Squid as a Reverse Proxy
Introduction: This scalability strategy is brought to you by Erik Osterman: My recommendation for anyone dealing with explosive growth on a limited budget with lots of cacheable content (e.g. content capable of returning valid expiration headers) is to employ a reverse proxy as mentioned in this article. In the last week, we had a site get AP'd, triggering 100K unique visitors to a single IIS server in under 5 hours. It took out the IIS server. Placing a single squid in front of the server handled the entire onslaught with a max server load of 0.10 on a modest Intel IV 3GHz. It's trivial to implement for anyone interested...
3 0.47777027 447 high scalability-2008-11-19-High Definition Video Delivery on the Web?
Introduction: How would you architect and implement an SD and HD internet video delivery system such as the BBC iPlayer or Recast Digital's RDV1? What do you need to consider on top of the Lessons Learned section in the YouTube Architecture post? How is it possible to compete with the big players like Google? Can you just use a CDN and scale efficiently? Would Amazon's cloud services be a viable platform for high-definition video streaming?
4 0.47699559 1049 high scalability-2011-05-31-Awesome List of Advanced Distributed Systems Papers
Introduction: As part of Dr. Indranil Gupta's CS 525 Spring 2011 Advanced Distributed Systems class, he has collected an incredible list of resources on distributed systems. His research group is also doing some interesting work. The various topics include: Before there Were Clouds, Cloud Computing, P2P Systems, Basic Distributed Computing Concepts, Sensor Networks, Overlays and DHTs, Cloud Programming, Cloud Scheduling, Key-Value Stores, Storage, Sensor Net Routing, Geo-Distribution, P2P Apps, In-network processing, Epidemics, Probabilistic Membership Protocols, Distributed Monitoring and Management, Publish-Subscribe/CDNs, Measurement Studies, Old Wine: Stale or Vintage?, In Byzantium, Cloud Pricing, Other Industrial Systems, Structure of Networks, Completing the Circle, Green Clouds, Distributed Debugging, Flash!, The Middle or the End?, Availability-Aware Systems, Design Methodologies, Handling Stress, Sources of unreliability in networks, Handling Stress, Selfish algorithms, Securi
5 0.4462935 143 high scalability-2007-11-06-Product: ChironFS
Introduction: If you are trying to create highly available file systems, especially across data centers, then ChironFS is one potential solution. It's relatively new, so there aren't lots of experience reports, but it looks worth considering. What is ChironFS and how does it work? Adapted from the ChironFS website: The Chiron Filesystem is a Fuse based filesystem that frees you from single points of failure. Its main purpose is to guarantee filesystem availability using replication. But it isn't a RAID implementation. RAID replicates DEVICES not FILESYSTEMS. Why not just use RAID over some network block device? Because it is a block device and if one server mounts that device in RW mode, no other server will be able to mount it in RW mode. Any real network may have many servers and offer a variety of services. Keeping everything running can become a real nightmare!
6 0.44406998 191 high scalability-2007-12-23-Synchronizing Memcached application
7 0.42357484 820 high scalability-2010-05-03-100 Node Hazelcast cluster on Amazon EC2
8 0.42155597 102 high scalability-2007-09-27-Product: Sequoia Database Clustering Technology
9 0.3996532 1164 high scalability-2011-12-27-PlentyOfFish Update - 6 Billion Pageviews and 32 Billion Images a Month
10 0.39923787 1039 high scalability-2011-05-12-Paper: Mind the Gap: Reconnecting Architecture and OS Research
11 0.36819801 1500 high scalability-2013-08-12-100 Curse Free Lessons from Gordon Ramsay on Building Great Software
12 0.36557809 1032 high scalability-2011-05-02-Stack Overflow Makes Slow Pages 100x Faster by Simple SQL Tuning
13 0.36233485 1577 high scalability-2014-01-13-NYTimes Architecture: No Head, No Master, No Single Point of Failure
14 0.35060242 1239 high scalability-2012-05-04-Stuff The Internet Says On Scalability For May 4, 2012
15 0.34202123 646 high scalability-2009-07-01-Podcast about Facebook's Cassandra Project and the New Wave of Distributed Databases
16 0.33000147 492 high scalability-2009-01-16-Database Sharding for startups
17 0.32862329 89 high scalability-2007-09-10-Is there a difference between partitioning and federation and sharding?
18 0.32119912 1024 high scalability-2011-04-15-Stuff The Internet Says On Scalability For April 15, 2011
19 0.32014292 1452 high scalability-2013-05-06-7 Not So Sexy Tips for Saving Money On Amazon
20 0.31764531 1592 high scalability-2014-02-07-Stuff The Internet Says On Scalability For February 7th, 2014