high_scalability high_scalability-2009 high_scalability-2009-497 knowledge-graph by maker-knowledge-mining

497 high scalability-2009-01-19-Papers: Readings in Distributed Systems


meta infos for this blog

Source: html

Introduction: Marton Trencseni has collected a wonderful list of different papers on distributed systems. He's organized them into the following sections: The Google Papers, Distributed Filesystems, Non-relational Distributed Databases, The Lamport Papers, and Implementation Issues. Many old favorites on the list and some that are likely new to you. My new favorite is "Frangipani: A Scalable Distributed File System." How can you not love "Frangipani" as a word?


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Marton Trencseni has collected a wonderful list of different papers on distributed systems. [sent-1, score-1.077]

2 He's organized them into the following sections: The Google Papers, Distributed Filesystems, Non-relational Distributed Databases, The Lamport Papers, and Implementation Issues. [sent-2, score-0.257]

3 Many old favorites on the list and some that are likely new to you. [sent-3, score-0.602]

4 My new favorite is "Frangipani: A Scalable Distributed File System." [sent-4, score-0.191]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('frangipani', 0.601), ('papers', 0.412), ('marton', 0.256), ('lamport', 0.235), ('filesystems', 0.222), ('favorites', 0.216), ('sections', 0.177), ('distributed', 0.166), ('organized', 0.163), ('list', 0.159), ('collected', 0.155), ('wonderful', 0.145), ('favorite', 0.137), ('word', 0.129), ('love', 0.096), ('following', 0.094), ('likely', 0.092), ('implementation', 0.088), ('old', 0.081), ('file', 0.067), ('databases', 0.064), ('google', 0.058), ('new', 0.054), ('scalable', 0.049), ('different', 0.04), ('many', 0.035)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 497 high scalability-2009-01-19-Papers: Readings in Distributed Systems

2 0.14296527 1049 high scalability-2011-05-31-Awesome List of Advanced Distributed Systems Papers

Introduction: As part of Dr. Indranil Gupta's CS 525 Spring 2011 Advanced Distributed Systems class, he has collected an incredible list of resources on distributed systems. His research group is also doing some interesting work. The various topics include: Before there Were Clouds, Cloud Computing, P2P Systems, Basic Distributed Computing Concepts, Sensor Networks, Overlays and DHTs, Cloud Programming, Cloud Scheduling, Key-Value Stores, Storage, Sensor Net Routing, Geo-Distribution, P2P Apps, In-network processing, Epidemics, Probabilistic Membership Protocols, Distributed Monitoring and Management, Publish-Subscribe/CDNs, Measurement Studies, Old Wine: Stale or Vintage?, In Byzantium, Cloud Pricing, Other Industrial Systems, Structure of Networks, Completing the Circle, Green Clouds, Distributed Debugging, Flash!, The Middle or the End?, Availability-Aware Systems, Design Methodologies, Handling Stress, Sources of unreliability in networks, Handling Stress, Selfish algorithms, Securi

3 0.080118753 880 high scalability-2010-08-13-Hot Scalability Links for Aug 13, 2010

Introduction: Ezra Zygmuntowicz, in a heartwarming account of his 4 Years at Engine Yard, concludes from his experience that the true future of cloud computing for developers is to not think about servers at all. It is now time to focus on the Application and new levels of abstraction that allow folks to use the computing resources in easier and easier ways. Tweets of Gold: bryanlatten: Nothing like a million caching layers to screw up an already complicated deployment. Thankfully, there is beer. jkalucki: Twitter isn't down, you are just using the wrong access methods... andyedinborough: I don't mean to hate, but why would I give up performance and scalability for a dynamic language? Honestly, I don't get it. AsitSinha: It's amazing.... to see the absence of an understanding of how capability plays a role in scalability. scottgal: Devs are HORRIBLE DBAs. Used to do Scalability labs for MS UK and bad schemas were the single biggest issue (next to bad Indexes

4 0.074218743 570 high scalability-2009-04-15-Implementing large scale web analytics

Introduction: Does anyone know of any articles or papers that discuss the nuts and bolts of how web analytics is implemented at organizations with large volumes of web traffic and a critical business need to analyze that data, e.g. places like Amazon.com, eBay, and Google? Just as a fun project I'm planning to build my own web log analysis app that can effectively index and query large volumes of web log data (i.e. TB range). But first I'd like to learn more about how it's done in the organizations whose lifeblood depends on this stuff. Even just a high level architectural overview of their approaches would be nice to have.

5 0.065785989 369 high scalability-2008-08-18-Code deployment tools

Introduction: G'day, I'm building an application to manage WordPress PHP code on many servers. Our application will push down code updates to each server, as well as performing backups and testing. I'm considering different methods of pushing updated code onto the individual servers. I'm considering something like Capistrano (I've no experience in Ruby though). I've also considered using subversion and then remotely calling svn commands via SSH. Are there any other tools specifically for this purpose? The servers will have persistent data (the WordPress databases) so I don't want to re-image them every update. Plus, they will each have a different set of plugins / themes, so building many images would be too complex. If there are any papers on code deployment, or other recommended reading, please point the links my way. Likewise, if anyone has any suggestions, or would like more details, just let me know. Cheers - Callum.

6 0.064670719 658 high scalability-2009-07-17-Against all the odds

7 0.064010344 787 high scalability-2010-03-03-Hot Scalability Links for March 3, 2010

8 0.063726485 590 high scalability-2009-05-06-Art of Distributed

9 0.06118555 372 high scalability-2008-08-27-Updating distributed web applications

10 0.059873398 1468 high scalability-2013-05-31-Stuff The Internet Says On Scalability For May 31, 2013

11 0.058467295 92 high scalability-2007-09-15-The Role of Memory within Web 2.0 Architectures and Deployments

12 0.053879958 959 high scalability-2010-12-17-Stuff the Internet Says on Scalability For December 17th, 2010

13 0.052175235 889 high scalability-2010-08-30-Pomegranate - Storing Billions and Billions of Tiny Little Files

14 0.051678535 288 high scalability-2008-03-25-Paper: On Designing and Deploying Internet-Scale Services

15 0.049035296 1022 high scalability-2011-04-13-Paper: NoSQL Databases - NoSQL Introduction and Overview

16 0.047244787 676 high scalability-2009-08-08-Yahoo!'s PNUTS Database: Too Hot, Too Cold or Just Right?

17 0.046761155 1535 high scalability-2013-10-21-Google's Sanjay Ghemawat on What Made Google Google and Great Big Data Career Advice

18 0.046643361 1138 high scalability-2011-11-07-10 Core Architecture Pattern Variations for Achieving Scalability

19 0.046425093 564 high scalability-2009-04-10-counting # of views, calculating most-least viewed

20 0.043469302 1015 high scalability-2011-04-01-Stuff The Internet Says On Scalability For April 1, 2011
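
The page does not say how the simValue column above is computed. One common reading, and only an assumption here, is cosine similarity between TF-IDF vectors of the posts, which would be consistent with the same-blog row scoring 1.0. The sketch below illustrates that idea with the standard library only; the function names `tfidf_vectors` and `cosine`, the whitespace tokenizer, and the smoothed idf formula are all illustrative choices, not the generator's actual code.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {word: weight} dict per document.

    Hypothetical weighting: relative term frequency times a smoothed idf.
    The real page may use a different tokenizer and formula.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word occurs.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            word: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[word])) + 1.0)
            for word, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {word: weight} vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(docs, query_index):
    """Return (similarity, index) pairs sorted from most to least similar."""
    vectors = tfidf_vectors(docs)
    query = vectors[query_index]
    scores = [(cosine(query, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, key=lambda pair: pair[0], reverse=True)
```

Under this reading, a post compared with itself always ranks first with a score of 1.0, and posts that share no vocabulary score 0.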


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.048), (1, 0.032), (2, 0.009), (3, 0.037), (4, 0.01), (5, 0.03), (6, -0.012), (7, 0.006), (8, 0.005), (9, 0.038), (10, -0.005), (11, -0.01), (12, -0.023), (13, -0.021), (14, 0.016), (15, -0.004), (16, -0.012), (17, -0.01), (18, 0.006), (19, -0.017), (20, 0.008), (21, 0.002), (22, -0.014), (23, 0.012), (24, -0.021), (25, -0.027), (26, 0.042), (27, 0.053), (28, -0.039), (29, -0.007), (30, 0.004), (31, -0.003), (32, -0.003), (33, 0.009), (34, -0.029), (35, -0.03), (36, 0.0), (37, -0.038), (38, 0.022), (39, 0.016), (40, -0.016), (41, 0.008), (42, 0.035), (43, 0.012), (44, -0.017), (45, 0.016), (46, -0.016), (47, 0.035), (48, 0.022), (49, 0.025)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96362841 497 high scalability-2009-01-19-Papers: Readings in Distributed Systems

2 0.72962171 223 high scalability-2008-01-25-Google: Introduction to Distributed System Design

Introduction: Update: Google added videos on Cluster Computing and MapReduce. There are five lectures: Introduction, MapReduce, Distributed File Systems, Clustering Algorithms, and Graph Algorithms. Advanced website design depends on deep distributed system design knowledge. Where do you get this knowledge? Try Google. They have a whole Code for Educators program with tutorials and lectures on AJAX programming, distributed systems, and web security. Looks pretty nice.

3 0.64949727 1049 high scalability-2011-05-31-Awesome List of Advanced Distributed Systems Papers

Introduction: As part of Dr. Indranil Gupta's CS 525 Spring 2011 Advanced Distributed Systems class, he has collected an incredible list of resources on distributed systems. His research group is also doing some interesting work. The various topics include: Before there Were Clouds, Cloud Computing, P2P Systems, Basic Distributed Computing Concepts, Sensor Networks, Overlays and DHTs, Cloud Programming, Cloud Scheduling, Key-Value Stores, Storage, Sensor Net Routing, Geo-Distribution, P2P Apps, In-network processing, Epidemics, Probabilistic Membership Protocols, Distributed Monitoring and Management, Publish-Subscribe/CDNs, Measurement Studies, Old Wine: Stale or Vintage?, In Byzantium, Cloud Pricing, Other Industrial Systems, Structure of Networks, Completing the Circle, Green Clouds, Distributed Debugging, Flash!, The Middle or the End?, Availability-Aware Systems, Design Methodologies, Handling Stress, Sources of unreliability in networks, Handling Stress, Selfish algorithms, Securi

4 0.64119583 1611 high scalability-2014-03-12-Paper: Scalable Eventually Consistent Counters over Unreliable Networks

Introduction: Counting at scale in a distributed environment is surprisingly hard. And it's a subject we've covered before in various ways: Big Data Counting: How to count a billion distinct objects using only 1.5KB of Memory, How to update video views count effectively?, Numbers Everyone Should Know (sharded counters). Kellabyte (which is an excellent blog) in Scalable Eventually Consistent Counters talks about how the Cassandra counter implementation scores well on the scalability and high availability front, but in so doing has "over and under counting problem in partitioned environments." Which is often fine. But if you want more accuracy there's a PN-counter, which is a CRDT (convergent replicated data type) where "all the changes made to a counter on each node rather than storing and modifying a single value so that you can merge all the values into the proper final value. Of course the trade-off here is additional storage and processing but there are ways to optimize this."

5 0.62505311 705 high scalability-2009-09-16-Paper: A practical scalable distributed B-tree

Introduction: We've seen a lot of NoSQL action lately built around distributed hash tables. Btrees are getting jealous. Btrees, once the king of the database world, want their throne back. Paul Buchheit surfaced a paper: A practical scalable distributed B-tree by Marcos K. Aguilera and Wojciech Golab, that might help spark a revolution. From the Abstract: We propose a new algorithm for a practical, fault tolerant, and scalable B-tree distributed over a set of servers. Our algorithm supports practical features not present in prior work: transactions that allow atomic execution of multiple operations over multiple B-trees, online migration of B-tree nodes between servers, and dynamic addition and removal of servers. Moreover, our algorithm is conceptually simple: we use transactions to manipulate B-tree nodes so that clients need not use complicated concurrency and locking protocols used in prior work. To execute these transactions quickly, we rely on three techniques: (1) We use optimistic

6 0.62200344 635 high scalability-2009-06-22-Improving performance and scalability with DDD

7 0.60183716 400 high scalability-2008-10-01-The Pattern Bible for Distributed Computing

8 0.60100818 592 high scalability-2009-05-06-DyradLINQ

9 0.59956157 280 high scalability-2008-03-17-Paper: Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web

10 0.59916687 510 high scalability-2009-02-09-Paper: Consensus Protocols: Two-Phase Commit

11 0.59604025 590 high scalability-2009-05-06-Art of Distributed

12 0.59120637 649 high scalability-2009-07-02-Product: Facebook's Cassandra - A Massive Distributed Store

13 0.58305609 734 high scalability-2009-10-30-Hot Scalabilty Links for October 30 2009

14 0.58180416 1498 high scalability-2013-08-07-RAFT - In Search of an Understandable Consensus Algorithm

15 0.57756674 844 high scalability-2010-06-18-Paper: The Declarative Imperative: Experiences and Conjectures in Distributed Logic

16 0.56798857 580 high scalability-2009-04-24-INFOSCALE 2009 in June in Hong Kong

17 0.55812758 50 high scalability-2007-07-31-BerkeleyDB & other distributed high performance key-value databases

18 0.55559909 710 high scalability-2009-09-20-PaxosLease: Diskless Paxos for Leases

19 0.54950762 433 high scalability-2008-10-29-CTL - Distributed Control Dispatching Framework

20 0.54301322 1022 high scalability-2011-04-13-Paper: NoSQL Databases - NoSQL Introduction and Overview


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(10, 0.067), (38, 0.373), (40, 0.03), (79, 0.068), (85, 0.271)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.67506301 497 high scalability-2009-01-19-Papers: Readings in Distributed Systems

2 0.49762169 59 high scalability-2007-08-04-Try Squid as a Reverse Proxy

Introduction: This scalability strategy is brought to you by Erik Osterman: My recommendation for anyone dealing with explosive growth on a limited budget with lots of cacheable content (e.g. content capable of returning valid expiration headers) is to employ a reverse proxy as mentioned in this article. In the last week, we had a site get AP'd, triggering 100K unique visitors to a single IIS server in under 5 hours. It took out the IIS server. Placing a single squid in front of the server handled the entire onslaught with a max server load of 0.10 on a modest Intel IV 3GHz. It's trivial to implement for anyone interested...

3 0.47777027 447 high scalability-2008-11-19-High Definition Video Delivery on the Web?

Introduction: How would you architect and implement an SD and HD internet video delivery system such as the BBC iPlayer or Recast Digital's RDV1? What do you need to consider on top of the Lessons Learned section in the YouTube Architecture post? How is it possible to compete with the big players like Google? Can you just use a CDN and scale efficiently? Would Amazon's cloud services be a viable platform for high-definition video streaming?

4 0.47699559 1049 high scalability-2011-05-31-Awesome List of Advanced Distributed Systems Papers

Introduction: As part of Dr. Indranil Gupta's CS 525 Spring 2011 Advanced Distributed Systems class, he has collected an incredible list of resources on distributed systems. His research group is also doing some interesting work. The various topics include: Before there Were Clouds, Cloud Computing, P2P Systems, Basic Distributed Computing Concepts, Sensor Networks, Overlays and DHTs, Cloud Programming, Cloud Scheduling, Key-Value Stores, Storage, Sensor Net Routing, Geo-Distribution, P2P Apps, In-network processing, Epidemics, Probabilistic Membership Protocols, Distributed Monitoring and Management, Publish-Subscribe/CDNs, Measurement Studies, Old Wine: Stale or Vintage?, In Byzantium, Cloud Pricing, Other Industrial Systems, Structure of Networks, Completing the Circle, Green Clouds, Distributed Debugging, Flash!, The Middle or the End?, Availability-Aware Systems, Design Methodologies, Handling Stress, Sources of unreliability in networks, Handling Stress, Selfish algorithms, Securi

5 0.4462935 143 high scalability-2007-11-06-Product: ChironFS

Introduction: If you are trying to create highly available file systems, especially across data centers, then ChironFS is one potential solution. It's relatively new, so there aren't lots of experience reports, but it looks worth considering. What is ChironFS and how does it work? Adapted from the ChironFS website: The Chiron Filesystem is a Fuse based filesystem that frees you from single points of failure. Its main purpose is to guarantee filesystem availability using replication. But it isn't a RAID implementation. RAID replicates DEVICES not FILESYSTEMS. Why not just use RAID over some network block device? Because it is a block device and if one server mounts that device in RW mode, no other server will be able to mount it in RW mode. Any real network may have many servers and offer a variety of services. Keeping everything running can become a real nightmare!

6 0.44406998 191 high scalability-2007-12-23-Synchronizing Memcached application

7 0.42357484 820 high scalability-2010-05-03-100 Node Hazelcast cluster on Amazon EC2

8 0.42155597 102 high scalability-2007-09-27-Product: Sequoia Database Clustering Technology

9 0.3996532 1164 high scalability-2011-12-27-PlentyOfFish Update - 6 Billion Pageviews and 32 Billion Images a Month

10 0.39923787 1039 high scalability-2011-05-12-Paper: Mind the Gap: Reconnecting Architecture and OS Research

11 0.36819801 1500 high scalability-2013-08-12-100 Curse Free Lessons from Gordon Ramsay on Building Great Software

12 0.36557809 1032 high scalability-2011-05-02-Stack Overflow Makes Slow Pages 100x Faster by Simple SQL Tuning

13 0.36233485 1577 high scalability-2014-01-13-NYTimes Architecture: No Head, No Master, No Single Point of Failure

14 0.35060242 1239 high scalability-2012-05-04-Stuff The Internet Says On Scalability For May 4, 2012

15 0.34202123 646 high scalability-2009-07-01-Podcast about Facebook's Cassandra Project and the New Wave of Distributed Databases

16 0.33000147 492 high scalability-2009-01-16-Database Sharding for startups

17 0.32862329 89 high scalability-2007-09-10-Is there a difference between partitioning and federation and sharding?

18 0.32119912 1024 high scalability-2011-04-15-Stuff The Internet Says On Scalability For April 15, 2011

19 0.32014292 1452 high scalability-2013-05-06-7 Not So Sexy Tips for Saving Money On Amazon

20 0.31764531 1592 high scalability-2014-02-07-Stuff The Internet Says On Scalability For February 7th, 2014