high_scalability high_scalability-2009 high_scalability-2009-605 knowledge-graph by maker-knowledge-mining
Introduction: I am looking for a way to distribute files over servers in different physical locations. My main concern is that I have bandwidth limitations at each location, and I wish to spread the bandwidth load evenly. At the moment I just have 1:1 copies of the files on all servers, and the application picks a random server to serve the file as a temporary fix. It's a small video streaming service. I want to spoon-feed the stream to the client with a maximum bandwidth output, and support seeking. At present I use PHP to limit the network stream, and read the file at a given offset sent as a GET parameter from the player for seeking. It's pseudo-streaming, but it works. I have been looking at MogileFS, which would solve the storage part. With MogileFS I can make use of my current PHP solution, as it supports lighttpd and Apache (with mod_rewrite or similar). However, I don't see how I can use MogileFS to check each server's percentage bandwidth usage. Any recommendations for how I can solve this?
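One way to replace the random server pick described above is to weight the choice by each location's spare bandwidth. The following is only a minimal sketch under stated assumptions: the `Server` fields, the `pick_server` name and the Mbit/s figures are hypothetical, and the code assumes some separate monitoring step (not shown, and not something MogileFS is known to provide here) reports each server's current outbound rate.

```python
import random
from dataclasses import dataclass


@dataclass
class Server:
    name: str             # hypothetical host label
    capacity_mbps: float  # bandwidth cap for this location (assumed known)
    used_mbps: float      # latest measured outbound rate (assumed reported elsewhere)

    @property
    def headroom(self) -> float:
        # Spare bandwidth; never negative even if the measurement overshoots the cap.
        return max(self.capacity_mbps - self.used_mbps, 0.0)


def pick_server(servers, rng=random):
    """Pick a server with probability proportional to its spare bandwidth."""
    weights = [s.headroom for s in servers]
    if sum(weights) <= 0:
        raise RuntimeError("all servers are at their bandwidth cap")
    # A server with zero headroom gets weight 0 and is not selected.
    return rng.choices(servers, weights=weights, k=1)[0]
```

With this kind of weighting, a location that is close to its cap receives proportionally fewer new streams, which tends to even out usage without needing exact coordination between servers.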
sentIndex sentText sentNum sentScore
1 I am looking for a way to distribute files over servers in different physical locations. [sent-1, score-0.52]
2 My main concern is that I have bandwidth limitations on each location, and wish to spread the bandwidth load evenly. [sent-2, score-1.134]
3 I just have 1:1 copies of the files on all servers, and have the application pick a random server to serve the file as a temp fix. [sent-4, score-0.888]
4 I want to spoonfeed the stream to the client with a max bandwidth output, and support seek. [sent-8, score-0.752]
5 At present I use PHP to limit the network stream, and read the file at a given offset sent as a GET parameter from the player for seeking. [sent-9, score-1.087]
6 I have been looking at MogileFS, which would solve the storage part. [sent-11, score-0.295]
7 With MogileFS I can make use of my current PHP solution, as it supports lighttpd and Apache (with mod_rewrite or similar). [sent-12, score-0.635]
8 However, I don't see how I can use MogileFS to check each server's percentage bandwidth usage. [sent-13, score-0.431]
9 Any recommendations for how I can solve this? [sent-14, score-0.143]
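Sentences 4 and 5 above describe rate-limited pseudo-streaming from a byte offset. The poster's implementation is in PHP and is not shown, so the following Python generator is only an illustrative sketch of the same idea: seek to the requested offset, then yield chunks no faster than a maximum byte rate. The function name, parameters and default chunk size are assumptions made for this example.

```python
import io
import time


def throttled_chunks(fileobj, offset, max_bytes_per_sec,
                     chunk_size=64 * 1024,
                     sleep=time.sleep, clock=time.monotonic):
    """Yield chunks of fileobj starting at offset, capped at max_bytes_per_sec."""
    fileobj.seek(offset)  # seek support: start at the offset the player asked for
    start = clock()
    sent = 0
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk
        sent += len(chunk)
        # Earliest moment at which `sent` bytes may have gone out at the capped rate.
        earliest = start + sent / max_bytes_per_sec
        delay = earliest - clock()
        if delay > 0:
            sleep(delay)


if __name__ == "__main__":
    # Example: stream an in-memory "file" from byte 10 at roughly 1 MB/s.
    source = io.BytesIO(b"x" * 200_000)
    total = sum(len(c) for c in throttled_chunks(source, 10, 1_000_000))
    print(total)
```

Pacing against the total bytes sent, rather than sleeping a fixed time per chunk, keeps the average rate near the cap even when individual reads or writes take uneven amounts of time.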
wordName wordTfidf (topN-words)
[('mogilefs', 0.581), ('bandwidth', 0.262), ('temp', 0.206), ('streaming', 0.195), ('stream', 0.192), ('offset', 0.172), ('parameter', 0.162), ('php', 0.161), ('files', 0.155), ('concern', 0.15), ('solve', 0.143), ('max', 0.141), ('lighttpd', 0.14), ('player', 0.129), ('wish', 0.126), ('copies', 0.12), ('limitations', 0.118), ('file', 0.117), ('output', 0.115), ('looking', 0.113), ('distribute', 0.111), ('pick', 0.107), ('location', 0.102), ('spread', 0.096), ('limit', 0.096), ('present', 0.095), ('random', 0.095), ('serve', 0.088), ('supports', 0.086), ('apply', 0.086), ('check', 0.083), ('however', 0.082), ('apache', 0.081), ('main', 0.081), ('video', 0.075), ('usage', 0.075), ('similar', 0.074), ('current', 0.073), ('physical', 0.073), ('client', 0.07), ('given', 0.069), ('servers', 0.068), ('solution', 0.052), ('small', 0.051), ('support', 0.048), ('read', 0.044), ('use', 0.042), ('want', 0.039), ('storage', 0.039), ('load', 0.039)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 605 high scalability-2009-05-22-Distributed content system with bandwidth balancing
2 0.35920262 229 high scalability-2008-01-29-Building scalable storage into application - Instead of MogileFS OpenAFS etc.
Introduction: I am planning the scaling of a hosted service, similar to TypePad etc., and would appreciate feedback on my plan so far. Looking into scaling storage, I have come across MogileFS and OpenAFS. My concern with these is that I am not at all experienced with them, and as the sole tech guy I don't want to build something into this hosting service that proves complex to update and administer. So, I'm thinking of building replication and scalability right into the application, in a similar but simplified way to how MogileFS works (I think). So, for our database table of uploaded files, here's how it currently looks (simplified): fileid (pkey), filename, ownerid. For adding the replication and scalability, I would add a few more columns: serveroneid, servertwoid, serverthreeid, s3. At the time the user uploads a file, it will go to a specific server (managed by the application) and the id of that server will be placed in the "serverone" column. Then hourly or so, a cro
3 0.1514799 44 high scalability-2007-07-30-Product: Photobucket
Introduction: Photobucket's free account has a storage limit and a download bandwidth limit of 10 GB per month. There's no bandwidth limit on the $25 Pro account.
4 0.13149229 98 high scalability-2007-09-18-Sync data on all servers
Introduction: I have a few Apache servers (around 11 at the moment) serving a small amount of data (around 44 GB right now). For some time I have been using rsync to keep all the content equal on all servers, but the amount of data has been growing, and rsync takes too much time to "compare" all data from source to destination, and creates a lot of I/O. I have been taking a look at MogileFS; it seems a good and reliable option, but as the FUSE module is not finished, we would have to rewrite all our apps, and that's not an option at the moment. Any ideas? I just want a "real time, non resource-hungry" alternative to rsync. If I get more features along the way, they are welcome :) Why do I prefer a distributed file system over NAS + NFS? - I need 2 NAS units if I don't want a single point of failure, and NAS hardware is expensive. - Non-shared hardware: every server has its own local disks. - As files are replicated, I can save a lot of money; RAID is not a MUST. Thn
5 0.12972313 239 high scalability-2008-02-04-Streaming Video on Amazon EC2?
Introduction: An Amazon EC2 Flash Video Streaming solution has been announced by Wowza Media. What do you think about the future of similar solutions? Is Amazon EC2 and S3 ready for video streaming? I have found threads on their forums related to the performance, scalability and high availability of the hosted streaming solution. How would you make it scalable? Is it really cheaper than traditional hosting? Looking forward to your thoughts!
6 0.1288242 991 high scalability-2011-02-16-Paper: An Experimental Investigation of the Akamai Adaptive Video Streaming
7 0.1253286 554 high scalability-2009-04-04-Digg Architecture
8 0.12476985 177 high scalability-2007-12-08-thesimsonstage.ea.com
9 0.11698319 889 high scalability-2010-08-30-Pomegranate - Storing Billions and Billions of Tiny Little Files
10 0.1158131 796 high scalability-2010-03-16-Justin.tv's Live Video Broadcasting Architecture
11 0.11497907 1359 high scalability-2012-11-15-Gone Fishin': Justin.Tv's Live Video Broadcasting Architecture
12 0.11184599 434 high scalability-2008-10-30-Olio Web2.0 Toolkit - Evaluate Web Technologies and Tools
13 0.10890231 532 high scalability-2009-03-11-Sharding and Connection Pools
14 0.099564962 1402 high scalability-2013-02-07-Ask HighScalability: Web asset server concept - 3rd party software available?
15 0.09954419 43 high scalability-2007-07-30-Product: ImageShack
16 0.096637592 304 high scalability-2008-04-19-How to build a real-time analytics system?
17 0.094925508 274 high scalability-2008-03-12-YouTube Architecture
18 0.086212292 30 high scalability-2007-07-26-Product: AWStats a Log Analyzer
19 0.084788136 722 high scalability-2009-10-15-Hot Scalability Links for Oct 15 2009
20 0.082849741 112 high scalability-2007-10-04-You Can Now Store All Your Stuff on Your Own Google Like File System
topicId topicWeight
[(0, 0.12), (1, 0.034), (2, -0.021), (3, -0.079), (4, -0.002), (5, -0.022), (6, 0.059), (7, -0.006), (8, 0.015), (9, 0.081), (10, 0.001), (11, -0.076), (12, -0.033), (13, -0.04), (14, 0.068), (15, 0.054), (16, -0.001), (17, 0.055), (18, -0.055), (19, -0.04), (20, -0.017), (21, -0.016), (22, -0.015), (23, 0.04), (24, 0.077), (25, 0.009), (26, 0.106), (27, -0.074), (28, -0.0), (29, -0.034), (30, 0.002), (31, -0.065), (32, 0.018), (33, 0.061), (34, -0.019), (35, 0.012), (36, -0.02), (37, -0.054), (38, -0.003), (39, -0.03), (40, -0.035), (41, -0.044), (42, -0.029), (43, -0.021), (44, 0.064), (45, -0.032), (46, 0.005), (47, 0.045), (48, 0.074), (49, -0.05)]
simIndex simValue blogId blogTitle
same-blog 1 0.97419924 605 high scalability-2009-05-22-Distributed content system with bandwidth balancing
Introduction: I am looking for a way to distribute files over servers in different physical locations. My main concern is that I have bandwidth limitations at each location, and I wish to spread the bandwidth load evenly. At the moment I just have 1:1 copies of the files on all servers, and the application picks a random server to serve the file as a temporary fix. It's a small video streaming service. I want to spoon-feed the stream to the client with a maximum bandwidth output, and support seeking. At present I use PHP to limit the network stream, and read the file at a given offset sent as a GET parameter from the player for seeking. It's pseudo-streaming, but it works. I have been looking at MogileFS, which would solve the storage part. With MogileFS I can make use of my current PHP solution, as it supports lighttpd and Apache (with mod_rewrite or similar). However, I don't see how I can use MogileFS to check each server's percentage bandwidth usage. Any recommendations for how I can solve this?
Introduction: I am planning the scaling of a hosted service, similar to TypePad etc., and would appreciate feedback on my plan so far. Looking into scaling storage, I have come across MogileFS and OpenAFS. My concern with these is that I am not at all experienced with them, and as the sole tech guy I don't want to build something into this hosting service that proves complex to update and administer. So, I'm thinking of building replication and scalability right into the application, in a similar but simplified way to how MogileFS works (I think). So, for our database table of uploaded files, here's how it currently looks (simplified): fileid (pkey), filename, ownerid. For adding the replication and scalability, I would add a few more columns: serveroneid, servertwoid, serverthreeid, s3. At the time the user uploads a file, it will go to a specific server (managed by the application) and the id of that server will be placed in the "serverone" column. Then hourly or so, a cro
3 0.72308242 1402 high scalability-2013-02-07-Ask HighScalability: Web asset server concept - 3rd party software available?
Introduction: We are serving dynamic (PHP) websites and their assets (JS/CSS/images/videos/binary downloads) via the same apache hosts. The static files are only being used as origins for CDN services used to distribute those files. Yet, in the current development-deploy pipeline, these files are checked into the same version control repositories as the code is. This is what we would like to change, for several reasons (decouple asset deployment from development & developers, lessen size of code repositories, etc.) My idea is to do the following: Set up a media server (cluster) which serves as an API (REST e.g.). You can PUT files to it, and get back the URL the file is available through from the public. In between input and output, the media service deals with everything that's necessary to serve the files: Upload them to the CDN, create the public URL, write the meta data to a (relational?) database, assign a version number... This API can be used by a) the application/website directly to provid
4 0.68495041 283 high scalability-2008-03-18-Shared filesystem on EC2
Introduction: Hi. I'm looking for a way to share files between EC2 nodes. Currently we are using GlusterFS to do this. It has been reliable recently, but in the past it has crashed under high load and we've had trouble starting it up again. We've only been able to restart it by removing the files, restarting the cluster, and filling it up again with our files from backup. This takes ages, and will take even longer the more files we get. What worries me is that it seems to make each node a point of failure for the entire system. One node crashes and soon the entire cluster has crashed. The other problem is adding another node. It seems like you have to take down the whole thing, reconfigure to include the new node, and restart. This kind of defeats the horizontal scaling strategy. We are using 2 EC2 instances as web servers, 1 as a DB master, and 1 as a slave. GlusterFS is installed on the web server machines as well as the DB slave machine (we back up files to S3 from this machine). The files
5 0.65663368 98 high scalability-2007-09-18-Sync data on all servers
Introduction: I have a few Apache servers (around 11 at the moment) serving a small amount of data (around 44 GB right now). For some time I have been using rsync to keep all the content equal on all servers, but the amount of data has been growing, and rsync takes too much time to "compare" all data from source to destination, and creates a lot of I/O. I have been taking a look at MogileFS; it seems a good and reliable option, but as the FUSE module is not finished, we would have to rewrite all our apps, and that's not an option at the moment. Any ideas? I just want a "real time, non resource-hungry" alternative to rsync. If I get more features along the way, they are welcome :) Why do I prefer a distributed file system over NAS + NFS? - I need 2 NAS units if I don't want a single point of failure, and NAS hardware is expensive. - Non-shared hardware: every server has its own local disks. - As files are replicated, I can save a lot of money; RAID is not a MUST. Thn
6 0.60807163 516 high scalability-2009-02-19-Heavy upload server scalability
7 0.5979445 308 high scalability-2008-04-22-Simple NFS failover solution with symbolic link?
8 0.58370531 181 high scalability-2007-12-11-Hosting and CDN for startup video sharing site
9 0.57275158 198 high scalability-2008-01-01-HOW CDN works
10 0.56798089 488 high scalability-2009-01-08-file synchronization solutions
11 0.56326872 251 high scalability-2008-02-18-How to deal with an I-O bottleneck to disk?
12 0.54908407 889 high scalability-2010-08-30-Pomegranate - Storing Billions and Billions of Tiny Little Files
13 0.54069841 796 high scalability-2010-03-16-Justin.tv's Live Video Broadcasting Architecture
14 0.54022765 1359 high scalability-2012-11-15-Gone Fishin': Justin.Tv's Live Video Broadcasting Architecture
15 0.53958446 1037 high scalability-2011-05-10-Viddler Architecture - 7 Million Embeds a Day and 1500 Req-Sec Peak
16 0.53521901 274 high scalability-2008-03-12-YouTube Architecture
17 0.53275394 884 high scalability-2010-08-23-6 Ways to Kill Your Servers - Learning How to Scale the Hard Way
18 0.53142416 369 high scalability-2008-08-18-Code deployment tools
19 0.53058851 63 high scalability-2007-08-09-Lots of questions for high scalability - high availability
20 0.52587271 294 high scalability-2008-04-01-How to update video views count effectively?
topicId topicWeight
[(1, 0.09), (2, 0.129), (94, 0.643)]
simIndex simValue blogId blogTitle
1 0.99060792 559 high scalability-2009-04-07-Six Lessons Learned Deploying a Large-scale Infrastructure in Amazon EC2
Introduction: Lessons learned from OpenX's large-scale deployment to Amazon EC2: Expect failures; what's more, embrace them. Fully automate your infrastructure deployments. Design your infrastructure so that it scales horizontally. Establish clear, measurable goals. Be prepared to quickly identify and eliminate bottlenecks. Play whack-a-mole for a while, until things get stable.
Introduction: The reference configurations described in this blueprint are starting points for building Sun Customer Ready HPC Clusters configured with Sun Fire X2100 M2 and X2200 M2 servers. The configurations define how Sun Systems Group products can be configured in a typical grid rack deployment. This document describes configurations in detail using Sun Fire X2100 M2 and X2200 M2 servers with a Gigabit Ethernet data fabric, as well as configurations using Sun Fire X2200 M2 servers with a high-speed InfiniBand fabric. These configurations focus on single rack solutions, with external connections through uplink ports of the switches. These reference configurations have been architected using Sun's expertise gained in actual, real-world installations. Within certain constraints, as described in the later sections, the system can be tailored to the customer needs. Certain system components described in this document are only available through Sun's factory integration. Although the information
same-blog 3 0.95462596 605 high scalability-2009-05-22-Distributed content system with bandwidth balancing
4 0.92393267 115 high scalability-2007-10-07-Using ThreadLocal to pass context information around in web applications
Introduction: Hi, In Java web servers, each HTTP request is handled by a thread from a thread pool. So for a Servlet handling the request, a thread is assigned. It is tempting (and very convenient) to keep context information in a ThreadLocal variable. I recently had a requirement where we needed to assign the logged-in user id and a timestamp to requests sent to web services. Because we already had the code in place, it was extremely difficult to change the method signatures to pass the user id everywhere. The solution I thought of is class ReferenceIdGenerator { public static void setReferenceId(String login) { threadLocal.set(login + System.currentTimeMillis()); } public static String getReferenceId() { return threadLocal.get(); } private static ThreadLocal<String> threadLocal = new ThreadLocal<String>(); } class MyServlet { void service(.....) { HttpSession session = request.getSession(false); String userId = (String) session.getAttribute("userId"); ReferenceIdGenerator.setReferenceId(userId
5 0.90211183 91 high scalability-2007-09-13-Design Preparations for Scaling
Introduction: Hi there, what do you think is crucial in the code designing of a scalable site? How does one prepare for webfarms and clusters (e.g. in PHP)? Thanks, Stephan
6 0.90163577 1601 high scalability-2014-02-25-Peter Norvig's 9 Master Steps to Improving a Program
7 0.89412516 1305 high scalability-2012-08-16-Paper: A Provably Correct Scalable Concurrent Skip List
8 0.80913264 834 high scalability-2010-06-01-Web Speed Can Push You Off of Google Search Rankings! What Can You Do?
10 0.77241945 241 high scalability-2008-02-05-SLA monitoring
11 0.72798622 970 high scalability-2011-01-06-BankSimple Mini-Architecture - Using a Next Generation Toolchain
12 0.71484447 1025 high scalability-2011-04-16-The NewSQL Market Breakdown
13 0.71243936 1412 high scalability-2013-02-25-SongPop Scales to 1 Million Active Users on GAE, Showing PaaS is not Passé
14 0.70507765 39 high scalability-2007-07-30-Product: Akamai
15 0.70183718 827 high scalability-2010-05-14-Hot Scalability Links for May 14, 2010
16 0.7002691 78 high scalability-2007-09-01-2 tier switch selection for colocation
17 0.69758433 1084 high scalability-2011-07-22-Stuff The Internet Says On Scalability For July 22, 2011
18 0.69337851 44 high scalability-2007-07-30-Product: Photobucket
19 0.68204802 1223 high scalability-2012-04-06-Stuff The Internet Says On Scalability For April 6, 2012
20 0.66519749 1023 high scalability-2011-04-14-Strategy: Cache Application Start State to Reduce Spin-up Times