Introduction: If you are trying to create highly available file systems, especially across data centers, then ChironFS is one potential solution. It's relatively new, so there aren't many experience reports yet, but it looks worth considering. What is ChironFS and how does it work? Adapted from the ChironFS website: The Chiron Filesystem is a FUSE-based filesystem that frees you from single points of failure. Its main purpose is to guarantee filesystem availability using replication. But it isn't a RAID implementation: RAID replicates DEVICES, not FILESYSTEMS. Why not just use RAID over some network block device? Because it is a block device, and if one server mounts that device in RW mode, no other server will be able to mount it in RW mode. Any real network may have many servers and offer a variety of services. Keeping everything running can become a real nightmare!
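The difference between replicating filesystems and replicating devices can be illustrated with a small sketch. This is not ChironFS code and does not use FUSE; it only mimics the idea in user space, assuming each replica is an ordinary directory (for example a local disk and a network mount). The function names replicated_write and replicated_read are hypothetical and chosen for illustration only.

```python
import os
import tempfile


def replicated_write(replicas, relpath, data):
    """Write data to the same relative path under every replica directory.

    Hypothetical helper, not part of ChironFS. Returns the replicas that
    accepted the write; raises only if every replica failed.
    """
    succeeded = []
    for root in replicas:
        target = os.path.join(root, relpath)
        try:
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as fh:
                fh.write(data)
            succeeded.append(root)
        except OSError:
            # A failed replica is skipped; the others keep the data available.
            continue
    if not succeeded:
        raise OSError("write failed on every replica")
    return succeeded


def replicated_read(replicas, relpath):
    """Return the file contents from the first replica that can serve them."""
    for root in replicas:
        try:
            with open(os.path.join(root, relpath), "rb") as fh:
                return fh.read()
        except OSError:
            continue
    raise FileNotFoundError(relpath)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "replica1")
        second = os.path.join(tmp, "replica2")
        os.mkdir(first)
        os.mkdir(second)
        replicated_write([first, second], "data/report.txt", b"payload")
        # Simulate losing the copy on the first replica.
        os.remove(os.path.join(first, "data", "report.txt"))
        print(replicated_read([first, second], "data/report.txt"))
```

Because each replica here is a plain directory rather than a block device, nothing in the sketch requires exclusive RW ownership of a device, which is the limitation of the network-RAID approach that the introduction describes. A real implementation would also need to handle resynchronizing a replica that missed writes, which this sketch ignores.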
sentIndex sentText sentNum sentScore
1 If you are trying to create highly available file systems, especially across data centers, then ChironFS is one potential solution. [sent-1, score-0.46]
2 It's relatively new, so there aren't lots of experience reports, but it looks worth considering. [sent-2, score-0.311]
3 Adapted from the ChironFS website: The Chiron Filesystem is a FUSE-based filesystem that frees you from single points of failure. [sent-4, score-0.641]
4 Its main purpose is to guarantee filesystem availability using replication. [sent-5, score-0.663]
5 Why not just use RAID over some network block device? [sent-8, score-0.242]
6 Because it is a block device and if one server mounts that device in RW mode, no other server will be able to mount it in RW mode. [sent-9, score-1.208]
7 Any real network may have many servers and offer a variety of services. [sent-10, score-0.338]
wordName wordTfidf (topN-words)
[('chironfs', 0.448), ('rw', 0.406), ('filesystem', 0.35), ('raid', 0.286), ('device', 0.256), ('fuse', 0.182), ('mounts', 0.182), ('block', 0.177), ('frees', 0.154), ('nightmare', 0.152), ('mount', 0.149), ('adapted', 0.144), ('replicates', 0.133), ('guarantee', 0.104), ('mode', 0.1), ('reports', 0.096), ('purpose', 0.09), ('variety', 0.088), ('real', 0.083), ('devices', 0.082), ('centers', 0.082), ('potential', 0.078), ('relatively', 0.074), ('keeping', 0.073), ('points', 0.069), ('main', 0.069), ('offer', 0.068), ('worth', 0.068), ('looks', 0.066), ('network', 0.065), ('trying', 0.063), ('especially', 0.063), ('lots', 0.057), ('become', 0.055), ('server', 0.054), ('website', 0.053), ('file', 0.05), ('availability', 0.05), ('highly', 0.049), ('everything', 0.046), ('experience', 0.046), ('able', 0.044), ('available', 0.043), ('create', 0.042), ('running', 0.038), ('across', 0.036), ('one', 0.036), ('based', 0.034), ('may', 0.034), ('single', 0.034)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 143 high scalability-2007-11-06-Product: ChironFS
2 0.24085411 271 high scalability-2008-03-08-Product: DRBD - Distributed Replicated Block Device
Introduction: From their website: DRBD is a block device which is designed to build high availability clusters. This is done by mirroring a whole block device via (a dedicated) network. You could see it as a network raid-1. DRBD takes over the data, writes it to the local disk and sends it to the other host. On the other host, it takes it to the disk there. The other components needed are a cluster membership service, which is supposed to be heartbeat, and some kind of application that works on top of a block device. Examples: A filesystem & fsck. A journaling FS. A database with recovery capabilities. Each device (DRBD provides more than one of these devices) has a state, which can be 'primary' or 'secondary'. On the node with the primary device the application is supposed to run and to access the device (/dev/drbdX). Every write is sent to the local 'lower level block device' and to the node with the device in 'secondary' state. The secondary device simply writes the data to its lowe
3 0.24054761 53 high scalability-2007-08-01-Product: MogileFS
Introduction: MogileFS is an open source distributed filesystem. Its properties and features include: Application level, No single point of failure, Automatic file replication, Better than RAID, Flat Namespace, Shared-Nothing, No RAID required, Local filesystem agnostic.
4 0.10226515 1279 high scalability-2012-07-09-Data Replication in NoSQL Databases
Introduction: This is the third guest post ( part 1 , part 2 ) of a series by Greg Lindahl, CTO of blekko, the spam free search engine. Previously, Greg was Founder and Distinguished Engineer at PathScale, at which he was the architect of the InfiniPath low-latency InfiniBand HCA, used to build tightly-coupled supercomputing clusters. blekko's home-grown NoSQL database was designed from the start to support a web-scale search engine, with 1,000s of servers and petabytes of disk. Data replication is a very important part of keeping the database up and serving queries. Like many NoSQL database authors, we decided to keep R=3 copies of each piece of data in the database, and not use RAID to improve reliability. The key goal we were shooting for was a database which degrades gracefully when there are many small failures over time, without needing human intervention. Why don't we like RAID for big NoSQL databases? Most big storage systems use RAID levels like 3, 4, 5, or 10 to improve relia
5 0.09983135 1348 high scalability-2012-10-26-Stuff The Internet Says On Scalability For October 26, 2012
Introduction: It's HighScalability Time: 1.5 Billion Pageviews : Etsy in September; 200 dedicated database servers : Tumblr Quotable Quotes: @rbranson : Datadog stays available where it counts (metrics injest) by using Cassandra, combined with an RDBMS for queries. Nice. @jmhodges : Few engineers know what modern hw is capable of, in part, because the only people that see the numbers are in orgs that had to care or die. @tinagroves : Storing the brain in the cloud might cost $38/month asserts Jim Adelius at #strataconf in talk on #bigdata and thought crimes. Why is it hard to scale a database, in layman’s terms? on Quora. Some really good answers. My answer would involve a cookie jar filled with all different kinds of cookies and a motley crew of kindergartners all trying to get cookies at the same time while keebler elves are trying to fill up the jar, all at the same time. Rackspace now has their own cloud block storage prod
6 0.097492114 585 high scalability-2009-04-29-How to choice and build perfect server
7 0.083249293 1177 high scalability-2012-01-19-Is it time to get rid of the Linux OS model in the cloud?
8 0.082118466 283 high scalability-2008-03-18-Shared filesystem on EC2
9 0.079476759 1508 high scalability-2013-08-28-Sean Hull's 20 Biggest Bottlenecks that Reduce and Slow Down Scalability
10 0.076940112 1473 high scalability-2013-06-10-The 10 Deadly Sins Against Scalability
11 0.076669231 1017 high scalability-2011-04-06-Netflix: Run Consistency Checkers All the time to Fixup Transactions
12 0.074977376 1511 high scalability-2013-09-04-Wide Fast SATA: the Recipe for Hot Performance
13 0.072691716 1053 high scalability-2011-06-06-Apple iCloud: Syncing and Distributed Storage Over Streaming and Centralized Storage
14 0.072502188 98 high scalability-2007-09-18-Sync data on all servers
15 0.071618535 1114 high scalability-2011-09-13-Must see: 5 Steps to Scaling MongoDB (Or Any DB) in 8 Minutes
16 0.06631317 1000 high scalability-2011-03-08-Medialets Architecture - Defeating the Daunting Mobile Device Data Deluge
17 0.06466525 821 high scalability-2010-05-03-MocoSpace Architecture - 3 Billion Mobile Page Views a Month
18 0.064517781 881 high scalability-2010-08-16-Scaling an AWS infrastructure - Tools and Patterns
19 0.059398532 68 high scalability-2007-08-20-TypePad Architecture
20 0.059319898 961 high scalability-2010-12-21-SQL + NoSQL = Yes !
topicId topicWeight
[(0, 0.083), (1, 0.037), (2, -0.007), (3, -0.022), (4, -0.036), (5, 0.018), (6, 0.038), (7, -0.012), (8, -0.0), (9, 0.005), (10, -0.015), (11, -0.017), (12, -0.001), (13, -0.02), (14, 0.03), (15, 0.04), (16, 0.009), (17, 0.04), (18, -0.034), (19, 0.007), (20, 0.017), (21, 0.012), (22, -0.034), (23, 0.03), (24, -0.011), (25, -0.01), (26, 0.051), (27, -0.022), (28, -0.068), (29, 0.01), (30, -0.051), (31, -0.005), (32, 0.05), (33, -0.043), (34, 0.01), (35, 0.019), (36, 0.021), (37, 0.006), (38, -0.026), (39, -0.023), (40, -0.022), (41, -0.052), (42, -0.054), (43, 0.033), (44, -0.005), (45, -0.044), (46, -0.002), (47, 0.056), (48, -0.045), (49, 0.06)]
simIndex simValue blogId blogTitle
same-blog 1 0.93650019 143 high scalability-2007-11-06-Product: ChironFS
2 0.78874385 53 high scalability-2007-08-01-Product: MogileFS
Introduction: MogileFS is an open source distributed filesystem. Its properties and features include: Application level, No single point of failure, Automatic file replication, Better than RAID, Flat Namespace, Shared-Nothing, No RAID required, Local filesystem agnostic.
3 0.73362112 271 high scalability-2008-03-08-Product: DRBD - Distributed Replicated Block Device
Introduction: From their website: DRBD is a block device which is designed to build high availability clusters. This is done by mirroring a whole block device via (a dedicated) network. You could see it as a network raid-1. DRBD takes over the data, writes it to the local disk and sends it to the other host. On the other host, it takes it to the disk there. The other components needed are a cluster membership service, which is supposed to be heartbeat, and some kind of application that works on top of a block device. Examples: A filesystem & fsck. A journaling FS. A database with recovery capabilities. Each device (DRBD provides more than one of these devices) has a state, which can be 'primary' or 'secondary'. On the node with the primary device the application is supposed to run and to access the device (/dev/drbdX). Every write is sent to the local 'lower level block device' and to the node with the device in 'secondary' state. The secondary device simply writes the data to its lowe
4 0.68987024 98 high scalability-2007-09-18-Sync data on all servers
Introduction: I have a few Apache servers (around 11 at the moment) serving a small amount of data (around 44 GB right now). For some time I have been using rsync to keep the content identical on all servers, but the amount of data has been growing, and rsync takes too long to compare all the data between source and destination and generates a lot of I/O. I have been looking at MogileFS. It seems a good and reliable option, but because the FUSE module is not finished we would have to rewrite all our apps, which is not an option at the moment. Any ideas? I just want a "real time, non resource-hungry" alternative to rsync. If I get more features along the way, they are welcome :) Why do I prefer a distributed file system to NAS + NFS? - I would need 2 NAS units to avoid a single point of failure, and NAS hardware is expensive. - Non-shared hardware: every server has its own local disks. - As files are replicated, I can save a lot of money; RAID is not a MUST. Thn
5 0.68389887 103 high scalability-2007-09-28-Kosmos File System (KFS) is a New High End Google File System Option
Introduction: There's a new clustered file system on the spindle: Kosmos File System (KFS) . Thanks to Rich Skrenta for turning me on to KFS and I think his blog post says it all. KFS is an open source project written in C++ by search startup Kosmix . The team members have a good pedigree so there's a better than average chance this software will be worth considering. After you stop trying to turn KFS into "Kentucky Fried File System" in your mind, take a look at KFS' intriguing feature set: Incremental scalability: New chunkserver nodes can be added as storage needs increase; the system automatically adapts to the new nodes. Availability: Replication is used to provide availability due to chunk server failures. Typically, files are replicated 3-way. Per file degree of replication: The degree of replication is configurable on a per file basis, with a max. limit of 64. Re-replication: Whenever the degree of replication for a file drops below the configured amount (
6 0.67803884 1279 high scalability-2012-07-09-Data Replication in NoSQL Databases
7 0.66359466 112 high scalability-2007-10-04-You Can Now Store All Your Stuff on Your Own Google Like File System
8 0.64295042 278 high scalability-2008-03-16-Product: GlusterFS
9 0.61561829 488 high scalability-2009-01-08-file synchronization solutions
10 0.6124779 889 high scalability-2010-08-30-Pomegranate - Storing Billions and Billions of Tiny Little Files
11 0.6100651 229 high scalability-2008-01-29-Building scalable storage into application - Instead of MogileFS OpenAFS etc.
12 0.59580666 128 high scalability-2007-10-21-Paper: Standardizing Storage Clusters (with pNFS)
13 0.58506775 430 high scalability-2008-10-26-Should you use a SAN to scale your architecture?
14 0.57657117 1442 high scalability-2013-04-17-Tachyon - Fault Tolerant Distributed File System with 300 Times Higher Throughput than HDFS
15 0.57456899 653 high scalability-2009-07-08-Servers Component - How to choice and build perfect server
16 0.57310098 50 high scalability-2007-07-31-BerkeleyDB & other distributed high performance key-value databases
17 0.57213336 585 high scalability-2009-04-29-How to choice and build perfect server
18 0.56535321 283 high scalability-2008-03-18-Shared filesystem on EC2
19 0.56357181 1035 high scalability-2011-05-05-Paper: A Study of Practical Deduplication
topicId topicWeight
[(2, 0.162), (61, 0.043), (79, 0.114), (85, 0.533)]
simIndex simValue blogId blogTitle
1 0.93849874 59 high scalability-2007-08-04-Try Squid as a Reverse Proxy
Introduction: This scalability strategy is brought to you by Erik Osterman: My recommendation for anyone dealing with explosive growth on a limited budget with lots of cacheable content (e.g. content capable of returning valid expiration headers) is to employ a reverse proxy, as mentioned in this article. In the last week, we had a site get AP'd, triggering 100K unique visitors to a single IIS server in under 5 hours. It took out the IIS server. Placing a single Squid in front of the server handled the entire onslaught with a max server load of 0.10 on a modest Intel IV 3 GHz. It's trivial to implement for anyone interested...
2 0.91332507 1049 high scalability-2011-05-31-Awesome List of Advanced Distributed Systems Papers
Introduction: As part of Dr. Indranil Gupta 's CS 525 Spring 2011 Advanced Distributed Systems class, he has collected an incredible list of resources on distributed systems . His research group is also doing some interesting work. The various topics include: Before there Were Clouds, Cloud Computing, P2P Systems, Basic Distributed Computing Concepts, Sensor Networks, Overlays and DHTs, Cloud Programming, Cloud Scheduling, Key-Value Stores, Storage, Sensor Net Routing, Geo-Distribution, P2P Apps, In-network processing, Epidemics, Probabilistic Membership Protocols, Distributed Monitoring and Management, Publish-Subscribe/CDNs, Measurement Studies, Old Wine: Stale or Vintage?, In Byzantium, Cloud Pricing, Other Industrial Systems, Structure of Networks, Completing the Circle, Green Clouds, Distributed Debugging, Flash!, The Middle or the End?, Availability-Aware Systems, Design Methodologies, Handling Stress, Sources of unreliability in networks, Handling Stress, Selfish algorithms, Securi
same-blog 3 0.88215566 143 high scalability-2007-11-06-Product: ChironFS
4 0.87024516 191 high scalability-2007-12-23-Synchronizing Memcached application
Introduction: I have an application with a couple of web servers that uses memcached. How can I synchronize concurrent puts to the cache? The value of the entry is a list. An atomic append operation would have been helpful, but unfortunately memcached doesn't support atomic append.
5 0.86701465 447 high scalability-2008-11-19-High Definition Video Delivery on the Web?
Introduction: How would you architect and implement an SD and HD internet video delivery system such as the BBC iPlayer or Recast Digital's RDV1? What do you need to consider on top of the Lessons Learned section in the YouTube Architecture post? How is it possible to compete with the big players like Google? Can you just use a CDN and scale efficiently? Would Amazon's cloud services be a viable platform for high-definition video streaming?
6 0.85284168 820 high scalability-2010-05-03-100 Node Hazelcast cluster on Amazon EC2
7 0.82039905 102 high scalability-2007-09-27-Product: Sequoia Database Clustering Technology
8 0.8044765 1039 high scalability-2011-05-12-Paper: Mind the Gap: Reconnecting Architecture and OS Research
9 0.77821559 1032 high scalability-2011-05-02-Stack Overflow Makes Slow Pages 100x Faster by Simple SQL Tuning
10 0.76461887 1164 high scalability-2011-12-27-PlentyOfFish Update - 6 Billion Pageviews and 32 Billion Images a Month
11 0.74680495 1500 high scalability-2013-08-12-100 Curse Free Lessons from Gordon Ramsay on Building Great Software
12 0.74671376 1577 high scalability-2014-01-13-NYTimes Architecture: No Head, No Master, No Single Point of Failure
13 0.73875326 1239 high scalability-2012-05-04-Stuff The Internet Says On Scalability For May 4, 2012
14 0.71202075 492 high scalability-2009-01-16-Database Sharding for startups
15 0.6868881 1024 high scalability-2011-04-15-Stuff The Internet Says On Scalability For April 15, 2011
16 0.68618834 118 high scalability-2007-10-09-High Load on production Webservers after Sourcecode sync
17 0.67310655 1592 high scalability-2014-02-07-Stuff The Internet Says On Scalability For February 7th, 2014
18 0.67074096 638 high scalability-2009-06-26-PlentyOfFish Architecture
19 0.66982645 1361 high scalability-2012-11-22-Gone Fishin': PlentyOfFish Architecture
20 0.6657185 1327 high scalability-2012-09-21-Stuff The Internet Says On Scalability For September 21, 2012