high_scalability high_scalability-2008 high_scalability-2008-368 knowledge-graph by maker-knowledge-mining
Introduction: How do you design a reliable distributed file system when the expected availability of the individual nodes is only ~1/5? That is the case for P2P systems. Dominik Grolimund, the founder of the Swiss startup Caleido, will show you how! They have launched Wuala, the social online storage service which scales as new nodes join the P2P network. The goal of Wua.la is to provide distributed online storage that is large, scalable, reliable, and secure by harnessing the idle resources of participating computers. This challenge is an old dream of computer science. In fact, as Andrew Tanenbaum wrote in 1995: "The design of a world-wide, fully transparent distributed filesystem for simultaneous use by millions of mobile and frequently disconnected users is left as an exercise for the reader." After three years of research and development on a distributed storage system at ETH Zurich, the Swiss Federal Institute of Technology, Caleido is ready to unveil the resu
sentIndex sentText sentNum sentScore
1 How do you design a reliable distributed file system when the expected availability of the individual nodes is only ~1/5? [sent-1, score-0.282]
2 Dominik Grolimund, the founder of the Swiss startup Caleido, will show you how! [sent-3, score-0.109]
3 They have launched Wuala, the social online storage service which scales as new nodes join the P2P network. [sent-4, score-0.235]
4 The goal of Wua.la is to provide distributed online storage that is large, scalable, reliable, and secure by harnessing the idle resources of participating computers. [sent-6, score-0.538]
5 This challenge is an old dream of computer science. [sent-7, score-0.051]
6 Wuala is a new way of storing, sharing, and publishing files on the internet. [sent-9, score-0.06]
7 It enables its users to trade parts of their local storage for online storage and it allows us to provide a better service for free. [sent-10, score-0.401]
8 In this Google Tech Talk, Dominik will explain what Wuala is and how it works, and he will also show a demo. [sent-11, score-0.062]
9 The availability problem is solved by redundancy (just like in the Google File System). [sent-12, score-0.094]
10 However, simple replication techniques would result in too much overhead because of the low availability of the nodes. [sent-13, score-0.151]
11 Instead, Wuala employs erasure coding and splits the data into small pieces. [sent-14, score-0.205]
12 Optimal erasure codes produce n/r fragments (r < 1 being the code rate), where any n fragments are sufficient to recover the original message. [sent-15, score-0.58]
13 These pieces are then distributed in the P2P network, providing good availability at a reasonable overhead. [sent-16, score-0.23]
14 The P2P network consists of client, storage and routing nodes. [sent-17, score-0.151]
15 Dominik also explains how Wuala's architecture is designed to provide security and fairness. [sent-19, score-0.071]
16 Wuala employs the 128-bit AES algorithm for encryption and the 2048-bit RSA algorithm for authentication. [sent-20, score-0.469]
17 If you're interested in how Wuala manages encryption, have a look at their publication on Cryptree. [sent-21, score-0.09]
18 They have also implemented distributed reputation audit and maintenance functions. [sent-22, score-0.241]
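Note on sentences 9 to 13: the overhead argument can be checked with a little arithmetic. The sketch below is only an illustration, not Wuala's actual scheme. It assumes nodes are online independently with probability p = 1/5 (the figure from sentence 1), and the 99.9% availability target, the choice of n = 100 fragments, and all function names are hypothetical.

```python
from math import comb


def replication_availability(p: float, copies: int) -> float:
    """Probability that at least one of `copies` full replicas is online."""
    return 1.0 - (1.0 - p) ** copies


def erasure_availability(p: float, n: int, total: int) -> float:
    """Probability that at least n of `total` stored fragments are online.

    Computed as 1 minus the binomial probability of fewer than n online fragments.
    """
    too_few = sum(
        comb(total, k) * p**k * (1.0 - p) ** (total - k) for k in range(n)
    )
    return 1.0 - too_few


def min_replicas(p: float, target: float) -> int:
    """Smallest number of full copies that reaches the target availability."""
    copies = 1
    while replication_availability(p, copies) < target:
        copies += 1
    return copies


def min_fragments(p: float, n: int, target: float) -> int:
    """Smallest number of stored fragments such that any n suffice with the target probability."""
    total = n
    while erasure_availability(p, n, total) < target:
        total += 1
    return total


if __name__ == "__main__":
    p, target, n = 0.2, 0.999, 100  # assumed values, for illustration only
    copies = min_replicas(p, target)
    total = min_fragments(p, n, target)
    print(f"replication: {copies} full copies, storage overhead {copies}x")
    print(f"erasure coding: {total} fragments, storage overhead {total / n:.1f}x")
```

Under these assumptions plain replication needs 31 full copies to reach 99.9% availability, while the erasure-coded layout reaches the same target with a much smaller storage multiple. With n = 1 the erasure formula reduces to the replication formula, which is a useful sanity check.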
wordName wordTfidf (topN-words)
[('wuala', 0.74), ('caleido', 0.211), ('dominik', 0.211), ('swiss', 0.156), ('fragments', 0.146), ('encryption', 0.121), ('aes', 0.096), ('eth', 0.096), ('zurich', 0.096), ('storage', 0.094), ('availability', 0.094), ('federal', 0.09), ('publication', 0.09), ('online', 0.088), ('distributed', 0.088), ('rsa', 0.086), ('thisgoogle', 0.086), ('audit', 0.08), ('algorithm', 0.08), ('institute', 0.078), ('participating', 0.074), ('reputation', 0.073), ('disconnected', 0.073), ('splits', 0.073), ('erasure', 0.072), ('provide', 0.071), ('exercise', 0.067), ('codes', 0.065), ('harnessing', 0.065), ('bit', 0.064), ('show', 0.062), ('publishing', 0.06), ('employs', 0.06), ('idle', 0.058), ('result', 0.057), ('consists', 0.057), ('filesystem', 0.055), ('trade', 0.054), ('nodes', 0.053), ('sufficient', 0.053), ('reader', 0.051), ('recover', 0.051), ('simultaneous', 0.051), ('dream', 0.051), ('talk', 0.05), ('reasonable', 0.048), ('produce', 0.047), ('file', 0.047), ('left', 0.047), ('founder', 0.047)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 368 high scalability-2008-08-17-Wuala - P2P Online Storage Cloud
2 0.087383762 125 high scalability-2007-10-18-another approach to replication
Introduction: File replication based on erasure codes can reduce the total size of replicas by a factor of 2 or more.
3 0.0776738 1483 high scalability-2013-06-27-Paper: XORing Elephants: Novel Erasure Codes for Big Data
Introduction: Erasure codes are one of those seemingly magical mathematical creations that with the developments described in the paper XORing Elephants: Novel Erasure Codes for Big Data , are set to replace triple replication as the data storage protection mechanism of choice. The result says Robin Harris (StorageMojo) in an excellent article, Facebook’s advanced erasure codes : "WebCos will be able to store massive amounts of data more efficiently than ever before. Bad news: so will anyone else." Robin says with cheap disks triple replication made sense and was economical. With ever bigger BigData the overhead has become costly. But erasure codes have always suffered from unacceptably long time to repair times. This paper describes new Locally Repairable Codes (LRCs) that are efficiently repairable in disk I/O and bandwidth requirements: These systems are now designed to survive the loss of up to four storage elements – disks, servers, nodes or even entire data centers – without losing
4 0.062073078 750 high scalability-2009-12-16-Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud
Introduction: "But it is not complicated. [There's] just a lot of it." -- Richard Feynman, on how the immense variety of the world arises from simple rules. Contents: Have We Reached the End of Scaling?; Applications Become Black Boxes Using Markets to Scale and Control Costs; Let's Welcome our Neo-Feudal Overlords; The Economic Argument for the Ambient Cloud; What Will Kill the Cloud?; The Amazing Collective Compute Power of the Ambient Cloud; Using the Ambient Cloud as an Application Runtime; Applications as Virtual States; Conclusion. We have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications. Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but consider we have no planet wide applications yet. None. Tomorrow the numbers foreshadow a new Cambrian explosion of connectivity that will look as
Introduction: All in all this is still my favorite post and I still think it's an accurate vision of a future. Not everyone agrees, but I guess we'll see... "But it is not complicated. [There's] just a lot of it." -- Richard Feynman, on how the immense variety of the world arises from simple rules. Contents: Have We Reached the End of Scaling?; Applications Become Black Boxes Using Markets to Scale and Control Costs; Let's Welcome our Neo-Feudal Overlords; The Economic Argument for the Ambient Cloud; What Will Kill the Cloud?; The Amazing Collective Compute Power of the Ambient Cloud; Using the Ambient Cloud as an Application Runtime; Applications as Virtual States; Conclusion. We have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications. Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but c
6 0.055197015 889 high scalability-2010-08-30-Pomegranate - Storing Billions and Billions of Tiny Little Files
7 0.052430779 1468 high scalability-2013-05-31-Stuff The Internet Says On Scalability For May 31, 2013
8 0.052057222 420 high scalability-2008-10-15-Tokyo Tech Tsubame Grid Storage Implementation
9 0.051889185 20 high scalability-2007-07-16-Paper: The Clustered Storage Revolution
10 0.05145552 1424 high scalability-2013-03-15-Stuff The Internet Says On Scalability For March 15, 2013
11 0.051346064 96 high scalability-2007-09-18-Amazon Architecture
12 0.050443843 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT
13 0.049086004 1509 high scalability-2013-08-30-Stuff The Internet Says On Scalability For August 30, 2013
14 0.048722655 786 high scalability-2010-03-02-Using the Ambient Cloud as an Application Runtime
19 0.047282334 1501 high scalability-2013-08-13-In Memoriam: Lavabit Architecture - Creating a Scalable Email Service
20 0.047273334 881 high scalability-2010-08-16-Scaling an AWS infrastructure - Tools and Patterns
topicId topicWeight
[(0, 0.099), (1, 0.031), (2, 0.008), (3, 0.027), (4, -0.015), (5, 0.016), (6, 0.011), (7, -0.008), (8, -0.024), (9, 0.03), (10, 0.023), (11, -0.001), (12, 0.008), (13, -0.034), (14, 0.018), (15, 0.049), (16, -0.01), (17, 0.008), (18, -0.013), (19, 0.001), (20, 0.025), (21, 0.023), (22, -0.013), (23, 0.02), (24, -0.041), (25, -0.012), (26, 0.016), (27, -0.015), (28, -0.037), (29, -0.02), (30, 0.003), (31, 0.002), (32, -0.021), (33, -0.005), (34, -0.023), (35, -0.023), (36, 0.025), (37, 0.03), (38, 0.013), (39, -0.001), (40, 0.005), (41, -0.058), (42, 0.001), (43, 0.0), (44, -0.006), (45, 0.012), (46, -0.02), (47, 0.005), (48, -0.006), (49, -0.005)]
simIndex simValue blogId blogTitle
same-blog 1 0.96222955 368 high scalability-2008-08-17-Wuala - P2P Online Storage Cloud
2 0.78683501 979 high scalability-2011-01-27-Comet - An Example of the New Key-Code Databases
Introduction: Comet is an active distributed key-value store built at the University of Washington. The paper describing Comet is Comet: An active distributed key-value store; there are also slides and an MP3 of a presentation given at OSDI '10. Here's a succinct overview of Comet: Today's cloud storage services, such as Amazon S3 or peer-to-peer DHTs, are highly inflexible and impose a variety of constraints on their clients: specific replication and consistency schemes, fixed data timeouts, limited logging, etc. We witnessed such inflexibility first-hand as part of our Vanish work, where we used a DHT to store encryption keys temporarily. To address this issue, we built Comet, an extensible storage service that allows clients to inject snippets of code that control their data's behavior inside the storage service. I found this paper quite interesting because it takes the initial steps of collocating code with a key-value store, which turns it into what might be called a key-code
Introduction: DataDirect Networks (www.ddn.com) is searching for beta testers for our exciting new object-based clustered storage system. Does this sound like you? * Need to store millions to hundreds of billions of files * Want to use one big file system but can't because no single file system scales big enough * Running out of inodes * Have to constantly tweak file systems to perform better * Need to replicate content to more than one data center across geographies * Have thumbnail images or other small files that wreak havoc on your file and storage systems * Constantly tweaking and engineering around performance and scalability limits * No storage system delivers enough IOPS to serve your content * Spend time load balancing the storage environment * Want a single, simple way to manage all this data If this sounds like you, please contact me at jgoldstein@ddn.com. DataDirect Networks is a 10-year old, well-established storage systems company specializing in Extreme Sto
4 0.73336321 20 high scalability-2007-07-16-Paper: The Clustered Storage Revolution
Introduction: The Clustered Storage Revolution. If the clustered file system, clustered storage system, storage virtualization movement is new to you, then this is a good intro paper. It's both a vendor puff piece and informative, so it might be worth your time. A Quick Hit of What's Inside: Clustered storage architectures have the ability to pull together two or more storage devices to behave as a single entity. Clustered storage can be broken down into three types: 2-way simple failover clustering; namespace aggregation; clustered storage with a distributed file system (DFS).
5 0.731206 128 high scalability-2007-10-21-Paper: Standardizing Storage Clusters (with pNFS)
Introduction: pNFS (parallel NFS) is the next generation of NFS and its main claim to fame is that it's clustered, which "enables clients to directly access file data spread over multiple storage servers in parallel. As a result, each client can leverage the full aggregate bandwidth of a clustered storage service at the granularity of an individual file." About pNFS StorageMojo says: pNFS is going to commoditize parallel data access. In 5 years we won’t know how we got along without it. Something to watch.
6 0.73004329 1483 high scalability-2013-06-27-Paper: XORing Elephants: Novel Erasure Codes for Big Data
7 0.72542411 278 high scalability-2008-03-16-Product: GlusterFS
8 0.71705049 103 high scalability-2007-09-28-Kosmos File System (KFS) is a New High End Google File System Option
9 0.71354568 1549 high scalability-2013-11-15-Stuff The Internet Says On Scalability For November 15th, 2013
10 0.7112869 112 high scalability-2007-10-04-You Can Now Store All Your Stuff on Your Own Google Like File System
11 0.70707685 50 high scalability-2007-07-31-BerkeleyDB & other distributed high performance key-value databases
12 0.70318127 705 high scalability-2009-09-16-Paper: A practical scalable distributed B-tree
13 0.70153117 12 high scalability-2007-07-15-Isilon Clustred Storage System
14 0.70065421 101 high scalability-2007-09-27-Product: Ganglia Monitoring System
15 0.70059019 839 high scalability-2010-06-09-Paper: Propagation Networks: A Flexible and Expressive Substrate for Computation
16 0.69965285 1142 high scalability-2011-11-14-Using Gossip Protocols for Failure Detection, Monitoring, Messaging and Other Good Things
17 0.69680107 1035 high scalability-2011-05-05-Paper: A Study of Practical Deduplication
18 0.69389319 1479 high scalability-2013-06-21-Stuff The Internet Says On Scalability For June 21, 2013
19 0.69374567 1316 high scalability-2012-09-04-Changing Architectures: New Datacenter Networks Will Set Your Code and Data Free
20 0.69122487 889 high scalability-2010-08-30-Pomegranate - Storing Billions and Billions of Tiny Little Files
topicId topicWeight
[(1, 0.127), (2, 0.158), (5, 0.017), (10, 0.018), (27, 0.018), (31, 0.291), (61, 0.029), (79, 0.111), (85, 0.013), (94, 0.099), (98, 0.011)]
simIndex simValue blogId blogTitle
1 0.87819666 207 high scalability-2008-01-10-Sharding with Cookie-Based Session Storage
Introduction: In a recent project, I utilized RoR's cookie-based session storage to shard geographically distinct user groups. My technique for doing so was unique and, although it was a premature optimization, it is nonetheless an idea worth exploring.
2 0.85193753 62 high scalability-2007-08-08-Partial String Matching
Introduction: Is there any alternative to LIKE '%...%' OR LIKE '%...%' in MySQL if you have to offer partial string matching on a large dataset?
same-blog 3 0.83134001 368 high scalability-2008-08-17-Wuala - P2P Online Storage Cloud
4 0.7796343 615 high scalability-2009-06-01-HotPads on AWS
Introduction: HotPads abandoned our managed hosting in December and took the leap over to EC2 and its siblings. The presentation has a lot of detail on costs and other things to watch out for, so if you're currently planning your "cloud" architecture, you'll find some of this really helpful.
5 0.77344412 1651 high scalability-2014-05-20-It's Networking. In Space! Or How E.T. Will Phone Home.
Introduction: What will the version of the Internet that follows us to the stars look like? Yes, people are really thinking seriously about this sort of thing. Specifically the InterPlanetary Networking Special Interest Group (IPNSIG). Ansible-like faster-than-light communication it isn't. There's no magical warp drive. Nor is a network of telepaths acting as a 'verse spanning telegraph system. It's more mundane than that. And in many ways more interesting as it's sort of like the old Internet on steroids, the one that was based on UUCP and dial-up connections, but over vastly longer distances and with much longer delays: The Interplanetary Internet (based on IPN, also called InterPlaNet) is a conceived computer network in space, consisting of a set of network nodes which can communicate with each other.[1][2] Communication would be greatly delayed by the great interplanetary distances, so the IPN needs a new set of protocols and technology that are tolerant to large delays and
6 0.71009517 785 high scalability-2010-02-26-MySQL and Memcached: End of an Era?
7 0.70641816 702 high scalability-2009-09-11-The interactive cloud
8 0.691908 1255 high scalability-2012-06-01-Stuff The Internet Says On Scalability For June 1, 2012
9 0.68420279 294 high scalability-2008-04-01-How to update video views count effectively?
10 0.68086064 888 high scalability-2010-08-27-OpenStack - The Answer to: How do We Compete with Amazon?
11 0.66668004 892 high scalability-2010-09-02-Distributed Hashing Algorithms by Example: Consistent Hashing
12 0.64392924 1516 high scalability-2013-09-13-Stuff The Internet Says On Scalability For September 13, 2013
13 0.64373213 266 high scalability-2008-03-04-Manage Downtime Risk by Connecting Multiple Data Centers into a Secure Virtual LAN
14 0.6420573 229 high scalability-2008-01-29-Building scalable storage into application - Instead of MogileFS OpenAFS etc.
15 0.64159548 719 high scalability-2009-10-09-Have you collectl'd yet? If not, maybe collectl-utils will make it easier to do so
16 0.6408782 863 high scalability-2010-07-22-How can we spark the movement of research out of the Ivory Tower and into production?
17 0.64012778 1174 high scalability-2012-01-13-Stuff The Internet Says On Scalability For January 13, 2012
18 0.63845819 976 high scalability-2011-01-20-75% Chance of Scale - Leveraging the New Scaleogenic Environment for Growth
19 0.63712204 303 high scalability-2008-04-18-Scaling Mania at MySQL Conference 2008
20 0.63492715 645 high scalability-2009-06-30-Hot New Trend: Linking Clouds Through Cheap IP VPNs Instead of Private Lines