high_scalability high_scalability-2012 high_scalability-2012-1338 knowledge-graph by maker-knowledge-mining

1338 high scalability-2012-10-11-RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store


meta info for this blog

Source: html

Introduction: RAMCube is a datacenter-oriented design for a RAM-based key-value store that supports thousands or tens of thousands of servers to offer up to hundreds of terabytes of RAM storage. Here's the PDF paper describing the system and here's a video of the presentation given at HotCloud. The big idea is: RAMCube exploits the proximity of a BCube network to construct a symmetric MultiRing structure, restricting all failure detection and recovery traffic within a one-hop neighborhood, which addresses problems including false failure detection and recovery traffic congestion. In addition, RAMCube leverages BCube's multiple paths between any pair of servers to handle switch failures. A few notes: 75% of Facebook data is stored in memcache. RAM is 1,000 times faster than disk. RAM is used in caches, but this increases application complexity as applications are responsible for cache consistency. Under a high workload a 1% cache miss rate can lead to a 10x performance penalty. So st
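A quick back-of-the-envelope check of those last two numbers (my own arithmetic, assuming a cache miss costs a single disk access at 1,000x RAM latency): with a 99% hit rate the effective access time is

% hedged sanity check, not from the post; assumes t_disk = 1000 * t_RAM
\[
  t_{\text{eff}} = 0.99\, t_{\text{RAM}} + 0.01 \cdot (1000\, t_{\text{RAM}}) \approx 11\, t_{\text{RAM}},
\]

roughly an order of magnitude slower than serving everything from RAM, which is where the quoted 10x penalty comes from and why the post argues for storing data directly in RAM rather than behind a cache.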


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 RAMCube is a datacenter-oriented design for a RAM-based key-value store that supports thousands or tens of thousands of servers to offer up to hundreds of terabytes of RAM storage. [sent-1, score-0.596]

2 The big idea is: RAMCube exploits the proximity of a BCube network to construct a symmetric MultiRing structure, restricting all failure detection and recovery traffic within a one-hop neighborhood, which addresses problems including false failure detection and recovery traffic congestion. [sent-3, score-2.109]

3 In addition, RAMCube leverages BCube's multiple paths between any pair of servers to handle switch failures. [sent-4, score-0.277]

4 RAM is 1,000 times faster than disk. RAM is used in caches, but this increases application complexity as applications are responsible for cache consistency. [sent-6, score-0.11]

5 Under a high workload, a 1% cache miss rate can lead to a 10x performance penalty. [sent-7, score-0.133]

6 So store data directly in RAM instead of using a cache. [sent-8, score-0.061]

7 Storing all data in RAM requires replication to multiple nodes for safety, which brings in datacenter networking issues. [sent-9, score-0.251]

8 The unhappy reality of datacenter networks must be considered to maximize the benefits of RAM. [sent-10, score-0.372]

9 It's difficult to quickly tell temporary network problems from real node failures. [sent-12, score-0.14]

10 Given the high levels of traffic, fast failure recovery is difficult. [sent-13, score-0.42]

11 Utilizes the global topology information of BCube and leverages network proximity to restrict all failure detection and recovery traffic within a one-hop neighborhood. [sent-14, score-1.083]
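The one-hop restriction in point 11 can be made concrete with a minimal sketch. This is entirely hypothetical, not the paper's actual protocol: each server replicates its data to, and exchanges heartbeats with, only its direct BCube neighbors, so failure detection and recovery traffic never leave the one-hop neighborhood. All names and timeout values below are illustrative.

import time

FAILURE_TIMEOUT = 1.0  # seconds of silence before a neighbor is declared failed (illustrative)

class Node:
    """Toy RAM key-value node; its replicas live on its one-hop BCube neighbors."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.neighbors = []    # direct (one-hop) neighbors only
        self.store = {}        # primary data, kept in RAM
        self.replicas = {}     # neighbor -> copy of that neighbor's data
        self.last_seen = {}    # neighbor -> timestamp of its last heartbeat

    def add_neighbor(self, other):
        """Wire a symmetric one-hop link; both sides hold replicas for each other."""
        for a, b in ((self, other), (other, self)):
            a.neighbors.append(b)
            a.replicas[b] = {}
            a.last_seen[b] = time.time()

    def put(self, key, value):
        """Write locally, then replicate only to one-hop neighbors."""
        self.store[key] = value
        for n in self.neighbors:
            n.replicas[self][key] = value

    def heartbeat(self):
        """Ping direct neighbors only; no cluster-wide probing or gossip."""
        for n in self.neighbors:
            n.last_seen[self] = time.time()

    def check_neighbors(self):
        """Detect a silent neighbor and recover from the replica already held locally."""
        now = time.time()
        for n in self.neighbors:
            if now - self.last_seen[n] > FAILURE_TIMEOUT:
                # Recovery data is already here, so no bulk transfer
                # crosses the congested network core.
                self.store.update(self.replicas[n])

In use, nodes would be wired with add_neighbor according to the BCube topology and heartbeat()/check_neighbors() would run on timers; the point of the sketch is simply that every message stays between direct neighbors, and recovery promotes a replica that is already local.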


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('traf', 0.42), ('bcube', 0.381), ('ramcube', 0.381), ('recovery', 0.223), ('proximity', 0.207), ('failure', 0.197), ('detection', 0.194), ('datacenter', 0.192), ('avideo', 0.119), ('symmetric', 0.119), ('restricting', 0.119), ('articlesbuilding', 0.106), ('ram', 0.104), ('neighborhood', 0.103), ('restrict', 0.103), ('exploits', 0.09), ('runner', 0.087), ('ambient', 0.083), ('construct', 0.082), ('autonomic', 0.082), ('blade', 0.08), ('temporary', 0.079), ('thousands', 0.079), ('false', 0.078), ('topology', 0.078), ('maximize', 0.076), ('pairs', 0.076), ('leverages', 0.076), ('miss', 0.075), ('safety', 0.075), ('paths', 0.073), ('describing', 0.073), ('meets', 0.072), ('notes', 0.071), ('tens', 0.066), ('within', 0.064), ('addresses', 0.061), ('network', 0.061), ('store', 0.061), ('terabytes', 0.06), ('leverage', 0.059), ('brings', 0.059), ('oriented', 0.059), ('super', 0.059), ('cache', 0.058), ('caches', 0.057), ('considered', 0.053), ('responsible', 0.052), ('switch', 0.052), ('benefits', 0.051)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1338 high scalability-2012-10-11-RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store

Introduction: RAMCube is a datacenter-oriented design for a RAM-based key-value store that supports thousands or tens of thousands of servers to offer up to hundreds of terabytes of RAM storage. Here's the PDF paper describing the system and here's a video of the presentation given at HotCloud. The big idea is: RAMCube exploits the proximity of a BCube network to construct a symmetric MultiRing structure, restricting all failure detection and recovery traffic within a one-hop neighborhood, which addresses problems including false failure detection and recovery traffic congestion. In addition, RAMCube leverages BCube's multiple paths between any pair of servers to handle switch failures. A few notes: 75% of Facebook data is stored in memcache. RAM is 1,000 times faster than disk. RAM is used in caches, but this increases application complexity as applications are responsible for cache consistency. Under a high workload a 1% cache miss rate can lead to a 10x performance penalty. So st

2 0.26109484 1327 high scalability-2012-09-21-Stuff The Internet Says On Scalability For September 21, 2012

Introduction: It's HighScalability Time: @5h15h : Walmart took 40 years to get their data warehouse to 400 terabytes. Facebook probably generates that every 4 days. Should your database failover automatically or wait for the guiding hands of a helpful human? Jeremy Zawodny in Handling Database Failover at Craigslist says Craigslist and Yahoo! handle failovers manually. Knowing when a failure has happened is so error prone it's better to put a human breaker in the loop. Others think this could be an SLA buster as write requests can't be processed while the decision is being made. The main issue is that knowing anything is true in a distributed system is hard. Review of a paper about scalable things, MPI, and granularity. If you like to read informed critiques that begin with phrases like "this is simply not true" or "utter garbage" then you might find this post by Sébastien Boisvert to be entertaining. The Big Switch: How We Rebuilt Wanelo from Scratch and Lived to Tell About It. Complete

3 0.10684206 1316 high scalability-2012-09-04-Changing Architectures: New Datacenter Networks Will Set Your Code and Data Free

Introduction: One consequence of IT standardization and commodification has been Google's "datacenter is the computer" view of the world. In that view all compute resources (memory, CPU, storage) are fungible. They are interchangeable and location independent; individual computers lose identity and become just a part of a service. Thwarting that nirvana has been the abysmal performance of commodity datacenter networks, which has caused a preference for architectures that favor the collocation of state and behaviour on the same box. MapReduce famously ships code over to storage nodes for just this reason. Change the network and you change the fundamental assumption driving collocation-based software architectures. You are then free to store data anywhere and move compute anywhere you wish. The datacenter becomes the computer. On the host side, with an x8 slot running at PCI-Express 3.0 speeds able to push 8GB/sec (that's bytes) of bandwidth in both directions, we have

4 0.10519443 786 high scalability-2010-03-02-Using the Ambient Cloud as an Application Runtime

Introduction: This is an excerpt from my article Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud. The future looks many, big, complex, and adaptive: Many clouds. Many servers. Many operating systems. Many languages. Many storage services. Many database services. Many software services. Many adjunct human networks (like Mechanical Turk). Many fast interconnects. Many CDNs. Many cache memory pools. Many application profiles (simple request-response, live streaming, computationally complex, sensor driven, memory intensive, storage intensive, monolithic, decomposable, etc). Many legal jurisdictions. Don't want to perform a function on Patriot Act "protected" systems? Then move the function elsewhere. Many SLAs. Many data-driven pricing policies that, like airplane pricing algorithms, will price "seats" to maximize profit using multi-variate, time-sensitive pricing models. Many competitive products. The need t

5 0.10270695 750 high scalability-2009-12-16-Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud

Introduction: "But it is not complicated. [There's] just a lot of it." \--Richard Feynmanon how the immense variety of the world arises from simple rules.Contents:Have We Reached the End of Scaling?Applications Become Black Boxes Using Markets to Scale and Control CostsLet's Welcome our Neo-Feudal OverlordsThe Economic Argument for the Ambient CloudWhat Will Kill the Cloud?The Amazing Collective Compute Power of the Ambient CloudUsing the Ambient Cloud as an Application RuntimeApplications as Virtual StatesConclusionWe have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications.Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but consider we have no planet wide applications yet. None.Tomorrow the numbers foreshadow a newCambrian explosionof connectivity that will look as

6 0.10263033 1355 high scalability-2012-11-05-Gone Fishin': Building Super Scalable Systems: Blade Runner Meets Autonomic Computing In The Ambient Cloud

7 0.097714275 1392 high scalability-2013-01-23-Building Redundant Datacenter Networks is Not For Sissies - Use an Outside WAN Backbone

8 0.092103519 1142 high scalability-2011-11-14-Using Gossip Protocols for Failure Detection, Monitoring, Messaging and Other Good Things

9 0.08470767 920 high scalability-2010-10-15-Troubles with Sharding - What can we learn from the Foursquare Incident?

10 0.084101841 768 high scalability-2010-02-01-What Will Kill the Cloud?

11 0.080135331 778 high scalability-2010-02-15-The Amazing Collective Compute Power of the Ambient Cloud

12 0.080002204 538 high scalability-2009-03-16-Are Cloud Based Memory Architectures the Next Big Thing?

13 0.079715379 761 high scalability-2010-01-17-Applications Become Black Boxes Using Markets to Scale and Control Costs

14 0.078761801 816 high scalability-2010-04-28-Elasticity for the Enterprise -- Ensuring Continuous High Availability in a Disaster Failure Scenario

15 0.07779038 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT

16 0.076052696 1604 high scalability-2014-03-03-The “Four Hamiltons” Framework for Mitigating Faults in the Cloud: Avoid it, Mask it, Bound it, Fix it Fast

17 0.075446196 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters

18 0.074836925 753 high scalability-2009-12-21-Hot Holiday Scalability Links for 2009

19 0.073516697 1326 high scalability-2012-09-20-How Vimeo Saves 50% on EC2 by Playing a Smarter Game

20 0.071633525 284 high scalability-2008-03-19-RAD Lab is Creating a Datacenter Operating System


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.113), (1, 0.047), (2, 0.018), (3, 0.026), (4, -0.031), (5, 0.019), (6, 0.047), (7, -0.023), (8, -0.037), (9, 0.016), (10, -0.004), (11, 0.002), (12, -0.008), (13, 0.041), (14, 0.008), (15, 0.058), (16, -0.003), (17, -0.005), (18, -0.011), (19, 0.017), (20, -0.007), (21, 0.095), (22, -0.007), (23, -0.004), (24, -0.031), (25, 0.008), (26, -0.01), (27, 0.041), (28, -0.006), (29, -0.041), (30, -0.038), (31, -0.02), (32, 0.012), (33, -0.03), (34, 0.023), (35, 0.026), (36, -0.039), (37, -0.01), (38, -0.048), (39, 0.042), (40, 0.037), (41, -0.019), (42, -0.047), (43, 0.039), (44, 0.028), (45, -0.009), (46, -0.044), (47, 0.011), (48, -0.033), (49, -0.045)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.93849862 1338 high scalability-2012-10-11-RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store

Introduction: RAMCube is a datacenter-oriented design for a RAM-based key-value store that supports thousands or tens of thousands of servers to offer up to hundreds of terabytes of RAM storage. Here's the PDF paper describing the system and here's a video of the presentation given at HotCloud. The big idea is: RAMCube exploits the proximity of a BCube network to construct a symmetric MultiRing structure, restricting all failure detection and recovery traffic within a one-hop neighborhood, which addresses problems including false failure detection and recovery traffic congestion. In addition, RAMCube leverages BCube's multiple paths between any pair of servers to handle switch failures. A few notes: 75% of Facebook data is stored in memcache. RAM is 1,000 times faster than disk. RAM is used in caches, but this increases application complexity as applications are responsible for cache consistency. Under a high workload a 1% cache miss rate can lead to a 10x performance penalty. So st

2 0.71183789 1316 high scalability-2012-09-04-Changing Architectures: New Datacenter Networks Will Set Your Code and Data Free

Introduction: One consequence of IT standardization and commodification has been Google's "datacenter is the computer" view of the world. In that view all compute resources (memory, CPU, storage) are fungible. They are interchangeable and location independent; individual computers lose identity and become just a part of a service. Thwarting that nirvana has been the abysmal performance of commodity datacenter networks, which has caused a preference for architectures that favor the collocation of state and behaviour on the same box. MapReduce famously ships code over to storage nodes for just this reason. Change the network and you change the fundamental assumption driving collocation-based software architectures. You are then free to store data anywhere and move compute anywhere you wish. The datacenter becomes the computer. On the host side, with an x8 slot running at PCI-Express 3.0 speeds able to push 8GB/sec (that's bytes) of bandwidth in both directions, we have

3 0.66193092 726 high scalability-2009-10-22-Paper: The Case for RAMClouds: Scalable High-Performance Storage Entirely in DRAM

Introduction: Stanford Info Lab is taking pains to document a direction we've been moving in for a while now: using RAM not just as a cache, but as the primary storage medium. Many quality products have built on this model. Even if the vision isn't radical, the paper does produce a lot of data backing up the transition, which is in itself helpful. From the Abstract: Disk-oriented approaches to online storage are becoming increasingly problematic: they do not scale gracefully to meet the needs of large-scale Web applications, and improvements in disk capacity have far outstripped improvements in access latency and bandwidth. This paper argues for a new approach to datacenter storage called RAMCloud, where information is kept entirely in DRAM and large-scale systems are created by aggregating the main memories of thousands of commodity servers. We believe that RAMClouds can provide durable and available storage with 100-1000x the throughput of disk-based systems and 100-1000x lower access lat

4 0.64244103 1392 high scalability-2013-01-23-Building Redundant Datacenter Networks is Not For Sissies - Use an Outside WAN Backbone

Introduction: Ivan Pepelnjak, in his short and information-packed REDUNDANT DATA CENTER INTERNET CONNECTIVITY video, shows why networking as played at the highest levels is something you want to leave to professionals, like a large-animal country veterinarian delivering a stuck foal at 2AM on a dark and stormy night. There are always a lot of questions about the black art of building redundant datacenter networks and there's a shortage of accessible explanations. What I liked about Ivan's video is how effortlessly he explains the issues and tradeoffs you can expect in designing your own solution, as well as giving creative solutions to those problems. A lot of years of experience are boiled down to a 17 minute video. Ivan begins by showing what a canonical fully redundant datacenter would look like: It's like an ark where everything goes two by two. You have two datacenters, each datacenter has redundant core switches, redundant servers, redundant disk arrays, redundant links between d

5 0.63759005 25 high scalability-2007-07-25-Paper: Designing Disaster Tolerant High Availability Clusters

Introduction: A very detailed (339 pages) paper on how to use HP products to create a highly available cluster. It's somewhat dated and obviously concentrates on HP products, but it is still good information. Table of contents: 1. Disaster Tolerance and Recovery in a Serviceguard Cluster 2. Building an Extended Distance Cluster Using ServiceGuard 3. Designing a Metropolitan Cluster 4. Designing a Continental Cluster 5. Building Disaster-Tolerant Serviceguard Solutions Using Metrocluster with Continuous Access XP 6. Building Disaster Tolerant Serviceguard Solutions Using Metrocluster with EMC SRDF 7. Cascading Failover in a Continental Cluster. Evaluating the Need for Disaster Tolerance / What is a Disaster Tolerant Architecture? / Types of Disaster Tolerant Clusters / Extended Distance Clusters / Metropolitan Cluster / Continental Cluster / Continental Cluster With Cascading Failover / Disaster Tolerant Architecture Guidelines / Protecting Nodes through Geographic Dispersion / Protecting Data th

6 0.63380033 371 high scalability-2008-08-24-A Scalable, Commodity Data Center Network Architecture

7 0.62030637 387 high scalability-2008-09-22-Paper: On Delivering Embarrassingly Distributed Cloud Services

8 0.61881095 786 high scalability-2010-03-02-Using the Ambient Cloud as an Application Runtime

9 0.61482894 284 high scalability-2008-03-19-RAD Lab is Creating a Datacenter Operating System

10 0.61292803 1256 high scalability-2012-06-04-OpenFlow-SDN is Not a Silver Bullet for Network Scalability

11 0.60765558 101 high scalability-2007-09-27-Product: Ganglia Monitoring System

12 0.59765482 1483 high scalability-2013-06-27-Paper: XORing Elephants: Novel Erasure Codes for Big Data

13 0.59687591 879 high scalability-2010-08-12-Think of Latency as a Pseudo-permanent Network Partition

14 0.58958912 430 high scalability-2008-10-26-Should you use a SAN to scale your architecture?

15 0.58539218 1327 high scalability-2012-09-21-Stuff The Internet Says On Scalability For September 21, 2012

16 0.57936251 1142 high scalability-2011-11-14-Using Gossip Protocols for Failure Detection, Monitoring, Messaging and Other Good Things

17 0.57573998 882 high scalability-2010-08-18-Misco: A MapReduce Framework for Mobile Systems - Start of the Ambient Cloud?

18 0.57503623 750 high scalability-2009-12-16-Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud

19 0.57466453 1355 high scalability-2012-11-05-Gone Fishin': Building Super Scalable Systems: Blade Runner Meets Autonomic Computing In The Ambient Cloud

20 0.57102501 23 high scalability-2007-07-24-Major Websites Down: Or Why You Want to Run in Two or More Data Centers.


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.069), (2, 0.201), (10, 0.053), (47, 0.04), (61, 0.09), (79, 0.171), (91, 0.233), (94, 0.032)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.92384464 642 high scalability-2009-06-29-HighScalability Rated #3 Blog for Developers

Introduction: Hey, we're moving up in the world, jumping from 19th place to 3rd place. In case you aren't sure what I'm talking about, Jurgen Appelo goes through this massive effort of ranking blogs according to Google PageRank, Technorati Authority, Alexa Rank, Google links, and Twitter Grader Rank. Through some obviously mistaken calculations HighScalability comes out #3. Given all the superb competition I'm not exactly sure how that can be. Well, thanks to all the excellent people who contribute and all the even more excellent people who read. Now at least I have something worthy to put on my tombstone :-)

2 0.92048001 712 high scalability-2009-10-01-Moving Beyond End-to-End Path Information to Optimize CDN Performance

Introduction: You go through the expense of installing CDNs all over the globe to make sure users always have a node close by, and you notice something curious and furious: clients still experience poor latencies. What's up with that? What do you do to find the problem? If you are Google you build a tool (WhyHigh) to figure out what's up. This paper is about the tool and the unexpected problem of high latencies on CDNs. The main problems they found: inefficient routing to nearby nodes and packet queuing. But more useful is the architecture of WhyHigh and how it goes about identifying bottlenecks. And even more useful is the general belief in creating sophisticated tools to understand and improve your service. That's what professionals do. From the abstract: Replicating content across a geographically distributed set of servers and redirecting clients to the closest server in terms of latency has emerged as a common paradigm for improving client performance. In this paper, we analyze latenc

same-blog 3 0.90713388 1338 high scalability-2012-10-11-RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store

Introduction: RAMCube is a datacenter-oriented design for a RAM-based key-value store that supports thousands or tens of thousands of servers to offer up to hundreds of terabytes of RAM storage. Here's the PDF paper describing the system and here's a video of the presentation given at HotCloud. The big idea is: RAMCube exploits the proximity of a BCube network to construct a symmetric MultiRing structure, restricting all failure detection and recovery traffic within a one-hop neighborhood, which addresses problems including false failure detection and recovery traffic congestion. In addition, RAMCube leverages BCube's multiple paths between any pair of servers to handle switch failures. A few notes: 75% of Facebook data is stored in memcache. RAM is 1,000 times faster than disk. RAM is used in caches, but this increases application complexity as applications are responsible for cache consistency. Under a high workload a 1% cache miss rate can lead to a 10x performance penalty. So st

4 0.89243144 826 high scalability-2010-05-12-The Rise of the Virtual Cellular Machines

Introduction: My apologies if you were looking for a post about cell phones. This post is about high density nanodevices. It's a follow-up to How will memristors change everything? for those wishing to pursue these revolutionary ideas in more depth. This is one of those areas where if you are in the space then there's a lot of available information and if you are on the outside then it doesn't even seem to exist. Fortunately, Ben Chandler from The SyNAPSE Project was kind enough to point me to a great set of presentations given at the 12th IEEE CNNA - International Workshop on Cellular Nanoscale Networks and their Applications - Towards Megaprocessor Computing. WARNING: these papers contain extreme technical content. If you are like me and you aren't an electrical engineer, much of it may make a sort of surface sense, but the deep and twisty details will fly over your head. For the more software minded there are a couple more accessible presentations: Intelligent Machines built with Memristiv

5 0.87975198 722 high scalability-2009-10-15-Hot Scalability Links for Oct 15 2009

Introduction: Update: Social networks in the database: using a graph database. Anders Nawroth puts graphs through their paces by representing, traversing, and performing other common social network operations using a graph database. Update: Deployment with Capistrano by Charles Max Wood. A simple step-by-step for using Capistrano for deployment. Log-structured file systems: There's one in every SSD by Valerie Aurora. SSDs have totally changed the performance characteristics of storage! Disks are dead! Long live flash! An Engineer's Guide to Bandwidth by DGentry. It's a rough world out there, and we need to do a better job of thinking about and testing under realistic network conditions. Analyzing air traffic performance with InfoBright and MonetDB by Vadim of the MySQL Performance Blog. Scalable Delivery of Stream Query Result by Zhou, Y; Salehi, A; Aberer, K. In this paper, we leverage Distributed Publish/Subscribe System (DPSS), a scalable data dissemination infrastruct

6 0.82976073 742 high scalability-2009-11-17-10 eBay Secrets for Planet Wide Scaling

7 0.82783026 921 high scalability-2010-10-18-NoCAP

8 0.79829901 453 high scalability-2008-12-01-Breakthrough Web-Tier Solutions with Record-Breaking Performance

9 0.79427242 1285 high scalability-2012-07-18-Disks Ain't Dead Yet: GraphChi - a disk-based large-scale graph computation

10 0.79026067 1209 high scalability-2012-03-14-The Azure Outage: Time Is a SPOF, Leap Day Doubly So

11 0.7847541 651 high scalability-2009-07-02-Product: Project Voldemort - A Distributed Database

12 0.7693795 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters

13 0.76890516 780 high scalability-2010-02-19-Twitter’s Plan to Analyze 100 Billion Tweets

14 0.76565963 526 high scalability-2009-03-05-Strategy: In Cloud Computing Systematically Drive Load to the CPU

15 0.76500469 601 high scalability-2009-05-17-Product: Hadoop

16 0.76254725 1048 high scalability-2011-05-27-Stuff The Internet Says On Scalability For May 27, 2011

17 0.76168901 289 high scalability-2008-03-27-Amazon Announces Static IP Addresses and Multiple Datacenter Operation

18 0.76139903 1098 high scalability-2011-08-15-Should any cloud be considered one availability zone? The Amazon experience says yes.

19 0.76125675 1548 high scalability-2013-11-13-Google: Multiplex Multiple Works Loads on Computers to Increase Machine Utilization and Save Money

20 0.76086181 1589 high scalability-2014-02-03-How Google Backs Up the Internet Along With Exabytes of Other Data