high_scalability high_scalability-2008 high_scalability-2008-308 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: I've been trying to find a high-availability file storage solution without success. I tried GlusterFS, which looks very promising, but I ran into stability problems, and I don't want something I can't easily control and rely on. Other solutions are too complicated or have a SPOF (single point of failure). So I'm thinking of the following setup: two NFS servers, a primary and a warm backup. The primary server will be rsynced to the warm backup every minute or two. I can sync that frequently because a PHP script will know from a database which directories have changed recently and will rsync only those. Both servers will be NFS-mounted on a cluster of web servers as /mnt/nfs-primary (symlinked as /home/websites) and /mnt/nfs-backup. I'll then use Ucarp (http://www.ucarp.org/project/ucarp) to monitor both NFS servers' availability every couple of seconds, and when one goes down, the Ucarp up script will change the symbolic link for the /home/websites dir on all web servers from /mnt/nfs-primary to /mn
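The symlink switch described above can be sketched as a small handler the Ucarp up/down script would invoke on each web server. This is only a sketch under the assumptions of the setup described here; the function name and paths are illustrative, and `mv -T` assumes GNU coreutils.

```shell
#!/bin/sh
set -eu

# switch_link TARGET LINK: repoint LINK at TARGET atomically.
# Building the new symlink beside the old one and renaming it into
# place uses rename(2) under the hood (GNU mv -T), so web server
# processes never observe a moment where LINK is missing.
switch_link() {
    target=$1
    link=$2
    ln -sfn "$target" "$link.tmp"
    mv -T "$link.tmp" "$link"
}

# In the setup above, the Ucarp handler on each web server would run
# something like (illustrative paths):
#   switch_link /mnt/nfs-backup /home/websites
```

One caveat: flipping the symlink does not rebind file handles that processes already hold open on the dead mount, so requests blocked on hung NFS I/O against the old mount may still need attention (e.g. soft mounts or timeouts).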
sentIndex sentText sentNum sentScore
1 I've been trying to find a high availability file storage solution without success. [sent-1, score-0.207]
2 I tried GlusterFS which looks very promising but experienced problems with stability and don't want something I can't easily control and rely on. [sent-2, score-0.569]
3 Other solutions are too complicated or have a SPOF. [sent-3, score-0.116]
4 So I'm thinking of the following setup: Two NFS servers, a primary and a warm backup. [sent-4, score-0.586]
5 The primary server will be rsynced with the warm backup every minute or two. [sent-5, score-0.792]
6 I can do it so frequently as a PHP script will know which directories have changed recently from a database and only rsync those. [sent-6, score-0.917]
7 Both servers will be NFS mounted on a cluster of web servers as /mnt/nfs-primary (sym linked as /home/websites) and /mnt/nfs-backup. [sent-7, score-0.465]
8 Can it really be this simple or am I missing something? [sent-11, score-0.081]
9 Just setting up a trial system now but would be interested in feedback. [sent-12, score-0.219]
10 :) Also, I can't find out whether it's best to use NFS V3 or V4 these days? [sent-13, score-0.15]
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 308 high scalability-2008-04-22-Simple NFS failover solution with symbolic link?
2 0.34011191 310 high scalability-2008-04-29-High performance file server
Introduction: We have a bunch of applications which run on Debian servers and process a huge amount of data stored in a shared NFS drive. We have 3 applications working as a pipeline, which process data stored in the NFS drive. The first application processes the data and stores the output in some folder on the NFS drive, the second app in the pipeline processes the data from the previous step, and so on. The data load to the pipeline is about 1 GByte per minute. I think the NFS drive is the bottleneck here. Would buying a specialized file server improve the performance of data reads and writes from the disk?
3 0.18540061 128 high scalability-2007-10-21-Paper: Standardizing Storage Clusters (with pNFS)
Introduction: pNFS (parallel NFS) is the next generation of NFS and its main claim to fame is that it's clustered, which "enables clients to directly access file data spread over multiple storage servers in parallel. As a result, each client can leverage the full aggregate bandwidth of a clustered storage service at the granularity of an individual file." About pNFS StorageMojo says: pNFS is going to commoditize parallel data access. In 5 years we won’t know how we got along without it . Something to watch.
4 0.17531599 63 high scalability-2007-08-09-Lots of questions for high scalability - high availability
Introduction: Hey, I have a website that I would like to scale. Right now we have 10 servers, but this does not scale well. I know how to deal with my Apache web servers but have problems with the SQL servers. I would like to use a "scale out" system and add servers when we need them. We have over 100GB of data for MySQL, and we tried to have around 20GB per server. It works well, except that if a server goes down then 1/5 of the users can't access the website. We could use replication, but we would need at least to double the SQL servers to replicate each one. And maybe in the future that won't be enough; we might need 3 slaves per master... well, I don't really like this idea. I would prefer to have 8 servers that all deal with data from the 5 servers we have right now, and then we could add new servers when we need them. I looked at NFS, but that does not seem to be a good idea for SQL servers? Can you confirm?
5 0.17443193 1386 high scalability-2013-01-14-MongoDB and GridFS for Inter and Intra Datacenter Data Replication
Introduction: This is a guest post by Jeff Behl , VP Ops @ LogicMonitor. Jeff has been a bit herder for the last 20 years, architecting and overseeing the infrastructure for a number of SaaS based companies. Data Replication for Disaster Recovery An inevitable part of disaster recovery planning is making sure customer data exists in multiple locations. In the case of LogicMonitor, a SaaS-based monitoring solution for physical, virtual, and cloud environments, we wanted copies of customer data files both within a data center and outside of it. The former was to protect against the loss of individual servers within a facility, and the latter for recovery in the event of the complete loss of a data center. Where we were: Rsync Like most everyone who starts off in a Linux environment, we used our trusty friend rsync to copy data around. Rsync is tried, true and tested, and works well when the number of servers, the amount of data, and the number of files is not horrendous.
6 0.16353579 1501 high scalability-2013-08-13-In Memoriam: Lavabit Architecture - Creating a Scalable Email Service
7 0.15449926 98 high scalability-2007-09-18-Sync data on all servers
8 0.13639967 1041 high scalability-2011-05-15-Building a Database remote availability site
9 0.13358539 516 high scalability-2009-02-19-Heavy upload server scalability
10 0.13037865 1521 high scalability-2013-09-23-Salesforce Architecture - How they Handle 1.3 Billion Transactions a Day
11 0.11212905 68 high scalability-2007-08-20-TypePad Architecture
12 0.10161661 150 high scalability-2007-11-12-Slashdot Architecture - How the Old Man of the Internet Learned to Scale
13 0.09414281 488 high scalability-2009-01-08-file synchronization solutions
14 0.090846039 233 high scalability-2008-01-30-How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data
15 0.090314686 182 high scalability-2007-12-12-Oracle Can Do Read-Write Splitting Too
16 0.086706847 840 high scalability-2010-06-10-The Four Meta Secrets of Scaling at Facebook
17 0.085414723 1421 high scalability-2013-03-11-Low Level Scalability Solutions - The Conditioning Collection
18 0.084115364 383 high scalability-2008-09-10-Shard servers -- go big or small?
19 0.082726657 278 high scalability-2008-03-16-Product: GlusterFS
simIndex simValue blogId blogTitle
same-blog 1 0.94665486 308 high scalability-2008-04-22-Simple NFS failover solution with symbolic link?
2 0.74977601 229 high scalability-2008-01-29-Building scalable storage into application - Instead of MogileFS OpenAFS etc.
Introduction: I am planning the scaling of a hosted service, similar to TypePad etc., and would appreciate feedback on my plan so far. Looking into scaling storage, I have come across MogileFS and OpenAFS. My concern with these is that I am not at all experienced with them, and as the sole tech guy I don't want to build something into this hosting service that proves complex to update and administer. So I'm thinking of building replication and scalability right into the application, in a similar but simplified way to how MogileFS works (I think). So, for our database table of uploaded files, here's how it currently looks (simplified): fileid (pkey), filename, ownerid. For adding the replication and scalability, I would add a few more columns: serveroneid, servertwoid, serverthreeid, s3. At the time the user uploads a file, it will go to a specific server (managed by the application) and the id of that server will be placed in the "serverone" column. Then hourly or so, a cro
3 0.73159385 98 high scalability-2007-09-18-Sync data on all servers
Introduction: I have a few Apache servers (around 11 atm) serving a small amount of data (around 44 gigs right now). For some time I have been using rsync to keep all the content equal on all servers, but the amount of data has been growing, and rsync takes too much time to "compare" all data from source to destination and creates a lot of I/O. I have been taking a look at MogileFS; it seems a good and reliable option, but as the FUSE module is not finished, we would have to rewrite all our apps, and that's not an option atm. Any ideas? I just want a "real time, non resource-hungry" alternative to rsync. If I get more features along the way, they are welcome :) Why do I prefer a Distributed File System instead of NAS + NFS? - I need 2 NAS if I don't want a point of failure, and NAS hardware is expensive. - Non-shared hardware: each server has its own local disks. - As files are replicated, I can save a lot of money; RAID is not a MUST. Thn
4 0.71153671 63 high scalability-2007-08-09-Lots of questions for high scalability - high availability
5 0.70393854 516 high scalability-2009-02-19-Heavy upload server scalability
Introduction: Hi, we are running a backup solution that uploads every night the files our clients worked on during the day (Carbonite-like). We currently have about 10GB of data per night, via HTTP PUT requests (1 per file), and the files are written as-is on a NAS. Our architecture is basically composed of a load balancer (hardware, sticky sessions), 5 servers (Tomcat under RHEL4/5), and a NAS (NFS 3). Since our number of clients is rising (as is our system load), how would you recommend we scale our infrastructure, hardware and software? Should we go towards NAS sharding, more servers, NIO on Tomcat...? Thanks for your inputs!
6 0.70142227 1386 high scalability-2013-01-14-MongoDB and GridFS for Inter and Intra Datacenter Data Replication
7 0.6847555 283 high scalability-2008-03-18-Shared filesystem on EC2
8 0.68156266 1279 high scalability-2012-07-09-Data Replication in NoSQL Databases
9 0.67741901 103 high scalability-2007-09-28-Kosmos File System (KFS) is a New High End Google File System Option
10 0.67294538 884 high scalability-2010-08-23-6 Ways to Kill Your Servers - Learning How to Scale the Hard Way
11 0.66814911 112 high scalability-2007-10-04-You Can Now Store All Your Stuff on Your Own Google Like File System
12 0.66769814 128 high scalability-2007-10-21-Paper: Standardizing Storage Clusters (with pNFS)
13 0.66610754 605 high scalability-2009-05-22-Distributed content system with bandwidth balancing
14 0.6540938 430 high scalability-2008-10-26-Should you use a SAN to scale your architecture?
15 0.64556515 1521 high scalability-2013-09-23-Salesforce Architecture - How they Handle 1.3 Billion Transactions a Day
16 0.64436603 143 high scalability-2007-11-06-Product: ChironFS
17 0.6424641 488 high scalability-2009-01-08-file synchronization solutions
18 0.63004768 140 high scalability-2007-11-02-How WordPress.com Tracks 300 Servers Handling 10 Million Pageviews
19 0.61798376 278 high scalability-2008-03-16-Product: GlusterFS
20 0.6160481 81 high scalability-2007-09-06-Scaling IMAP and POP3
simIndex simValue blogId blogTitle
1 0.97818893 131 high scalability-2007-10-25-Should JSPs be avoided for high scalability?
Introduction: I just heard about some web sites where Velocity templates are used to render HTML instead of JSPs, and all the processing is performed in servlets. Can JSPs cause issues with scalability? Thanks, Unmesh
2 0.95727319 991 high scalability-2011-02-16-Paper: An Experimental Investigation of the Akamai Adaptive Video Streaming
Introduction: Video is hot on the Internet, and people are really interested in knowing how to make it work. Dan Rayburn has a post pointing to a fascinating paper, An Experimental Investigation of the Akamai Adaptive Video Streaming , which talks in some detail about the protocols big players like YouTube, Skype and Akamai use to serve video over an inherently video-unfriendly medium like the Internet. For Akamai they found: Each video is encoded in five versions at different bit rates and stored in separate files. The client sends commands to the server with an average inter-departure time of about 2 s, i.e. the control algorithm is executed on average every 2 seconds. Akamai uses only the video level to adapt the video source to the available bandwidth, whereas the frame rate of the video is kept constant. When a sudden drop in the available bandwidth occurs, short interruptions of the video playback can occur due to the large actuation delay. For a sudden increase of the avai
3 0.95023662 1016 high scalability-2011-04-04-Scaling Social Ecommerce Architecture Case study
Introduction: A recent study showed that over 92 percent of executives from leading retailers are focusing their marketing efforts on Facebook and subsequent applications. Furthermore, over 71 percent of users have confirmed they are more likely to make a purchase after "liking" a brand they find online. ( source ) Sears architect Tomer Gabel provides an insightful overview of how they built a Social Ecommerce solution for Sears.com that can handle complex relationship queries in real time. Tomer goes through: the architectural considerations behind their solution; why they chose memory over disk; how they partitioned the data to gain scalability; why they chose to execute code with the data using the GigaSpaces Map/Reduce execution framework; how they integrated with Facebook; and why they chose GigaSpaces over Coherence and Terracotta for in-memory caching and scale. In this post I tried to summarize the main takeaways from the interview. You can also watch the full interview (highly reco
4 0.95014894 14 high scalability-2007-07-15-Web Analytics: An Hour a Day
Introduction: Web Analytics: An Hour A Day is the first book by an in-the-trenches practitioner of web analytics. It provides a unique insider’s perspective of the challenges and opportunities that web analytics presents to each person who touches the Web in your organization. Rather than spamming you with metrics and definitions, Web Analytics: An Hour A Day will enhance your mindset and teach you how to fish for yourself. Avinash Kaushik is a expert in web analytics and author of the top-rated blog Occam’s Razor (http://www.kaushik.net/avinash). In this book, he goes beyond web analytics concepts and definitions to provide a step-by-step guide to implementing a successful web analytics strategy. His revolutionary approach to web analytics challenges prevalent thinking about the field and guides readers to a solution that will provide truly informed and actionable insights.
5 0.93225491 1459 high scalability-2013-05-16-Paper: Warp: Multi-Key Transactions for Key-Value Stores
Introduction: Looks like an interesting take on "a completely asynchronous, low-latency transaction management protocol, in line with the fully distributed NoSQL architecture." Warp: Multi-Key Transactions for Key-Value Stores  overview: Implementing ACID transactions has been a longstanding challenge for NoSQL systems. Because these systems are based on a sharded architecture, transactions necessarily require coordination across multiple servers. Past work in this space has relied either on heavyweight protocols such as Paxos or clock synchronization for this coordination. This paper presents a novel protocol for coordinating distributed transactions with ACID semantics on top of a sharded data store. Called linear transactions, this protocol achieves scalability by distributing the coordination task to only those servers that hold relevant data for each transaction. It achieves high performance by serializing only those transactions whose concurrent execution could potentially yield a vio
6 0.91108334 16 high scalability-2007-07-16-Book: High Performance MySQL
same-blog 7 0.87615073 308 high scalability-2008-04-22-Simple NFS failover solution with symbolic link?
8 0.87314731 182 high scalability-2007-12-12-Oracle Can Do Read-Write Splitting Too
9 0.86184084 500 high scalability-2009-01-22-Heterogeneous vs. Homogeneous System Architectures
10 0.82381272 261 high scalability-2008-02-25-Make Your Site Run 10 Times Faster
11 0.79173595 334 high scalability-2008-05-29-Amazon Improves Diagonal Scaling Support with High-CPU Instances
12 0.78630805 783 high scalability-2010-02-24-Hot Scalability Links for February 24, 2010
13 0.78117543 43 high scalability-2007-07-30-Product: ImageShack
14 0.77808177 263 high scalability-2008-02-27-Product: System Imager - Automate Deployment and Installs
15 0.77778798 831 high scalability-2010-05-26-End-To-End Performance Study of Cloud Services
16 0.76440895 336 high scalability-2008-05-31-Biggest Under Reported Story: Google's BigTable Costs 10 Times Less than Amazon's SimpleDB
17 0.74913085 1284 high scalability-2012-07-16-Cinchcast Architecture - Producing 1,500 Hours of Audio Every Day
18 0.69251299 44 high scalability-2007-07-30-Product: Photobucket
19 0.68404377 291 high scalability-2008-03-29-20 New Rules for Faster Web Pages