high_scalability high_scalability-2008 high_scalability-2008-262 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Hi, I was wondering if anyone has found any information on how to architect a system to support high availability file uploads. My scenario: I have an Apache server proxying requests to a bunch of Tomcat Java application servers. When I need to upgrade my site, I stop and upgrade each of the Tomcat servers one at a time. This seems to work well as Apache automatically routes subsequent requests for the stopped app server to the remaining app servers that are up. The problem is that if a user is uploading a file when the app server is stopped, the upload fails and the user has to upload the file again. This is problematic as uploading files is an integral feature of the site and it's frustrating for the users to have to restart their uploads every time I upgrade the site (which I want to be able to do frequently). Has anyone seen any information on how this can be done or have ideas on how this can be architected? I imagine sites like Flickr must have a solution to this problem
sentIndex sentText sentNum sentScore
1 Hi, I was wondering if anyone has found any information on how to architect a system to support high availability file uploads. [sent-1, score-0.658]
2 My scenario: I have an Apache server proxying requests to a bunch of Tomcat Java application servers. [sent-2, score-0.504]
3 When I need to upgrade my site, I stop and upgrade each of the Tomcat servers one at a time. [sent-3, score-0.809]
4 This seems to work well as Apache automatically routes subsequent requests for the stopped app server to the remaining app servers that are up. [sent-4, score-1.393]
5 The problem is that if a user is uploading a file when the app server is stopped, the upload fails and the user has to upload the file again. [sent-5, score-1.666]
6 This is problematic as uploading files is an integral feature of the site and it's frustrating for the users to have to restart their uploads every time I upgrade the site (which I want to be able to do frequently). [sent-6, score-2.038]
7 Has anyone seen any information on how this can be done or have ideas on how this can be architected? [sent-7, score-0.516]
8 I imagine sites like Flickr must have a solution to this problem, as I have seen presentations in which they say that they are able to upgrade their site several times a day without the users noticing. [sent-8, score-1.297]
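The failure described in sentences 4 and 5 comes from stopping an app server while an upload is still in flight. One direction often suggested for this kind of problem is to drain a server first: stop handing it new requests, wait until its in-flight uploads finish, and only then stop it. The sketch below is a toy model of that idea only, not Apache or Tomcat configuration; the class `DrainableServer` and its methods are hypothetical names introduced for illustration.

```python
import threading


class DrainableServer:
    """Toy model of one app server that can be drained before an upgrade.

    Hypothetical illustration: a real deployment would do this in the
    proxy or load balancer rather than in application code.
    """

    def __init__(self, name):
        self.name = name
        self._cond = threading.Condition()
        self._accepting = True
        self._in_flight = 0

    def begin_upload(self):
        """Return True if the upload was accepted, False if the server is draining."""
        with self._cond:
            if not self._accepting:
                return False
            self._in_flight += 1
            return True

    def end_upload(self):
        """Mark one upload as finished and wake up any waiting drain() call."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def drain(self, timeout=None):
        """Refuse new uploads, then wait for in-flight uploads to finish.

        Returns True if the server is idle (safe to stop), False if the
        timeout expired while uploads were still running.
        """
        with self._cond:
            self._accepting = False
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout)
```

In this model an upgrade script would call `drain()` on one server, stop and upgrade it only if the call returns `True`, and then move on to the next server. New uploads that arrive during the drain are refused by that server and would have to be routed to the remaining ones.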
wordName wordTfidf (topN-words)
[('upgrade', 0.329), ('uploading', 0.288), ('tomcat', 0.263), ('stopped', 0.259), ('upload', 0.23), ('proxying', 0.204), ('integral', 0.181), ('site', 0.181), ('app', 0.165), ('file', 0.16), ('problematic', 0.159), ('anyone', 0.155), ('frustrating', 0.154), ('apache', 0.148), ('subsequent', 0.148), ('seen', 0.146), ('architected', 0.138), ('uploads', 0.136), ('presentations', 0.134), ('remaining', 0.128), ('restart', 0.127), ('routes', 0.126), ('flickr', 0.113), ('requests', 0.11), ('wondering', 0.109), ('fails', 0.109), ('scenario', 0.109), ('bunch', 0.103), ('frequently', 0.102), ('able', 0.093), ('stop', 0.089), ('information', 0.089), ('architect', 0.087), ('server', 0.087), ('imagine', 0.082), ('user', 0.08), ('automatically', 0.077), ('problem', 0.077), ('users', 0.074), ('files', 0.07), ('ideas', 0.07), ('seems', 0.066), ('sites', 0.066), ('feature', 0.065), ('servers', 0.062), ('several', 0.059), ('found', 0.058), ('say', 0.056), ('done', 0.056), ('java', 0.055)]
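The page does not say how the weights above were computed. As a rough illustration of how a top-N TF-IDF word list of this shape can be produced, here is a minimal sketch using plain term frequency times log inverse document frequency. The function name `top_tfidf_words` and the whitespace tokenizer are assumptions made for the example, and the numbers it yields will not match the list above.

```python
import math
from collections import Counter


def top_tfidf_words(docs, doc_index, top_n=5):
    """Return the top_n (word, weight) pairs for docs[doc_index].

    Weight = (term count / document length) * ln(N / document frequency).
    This is one common TF-IDF variant, chosen here only for illustration.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    target = tokenized[doc_index]
    tf = Counter(target)
    scores = {
        word: (count / len(target)) * math.log(n_docs / df[word])
        for word, count in tf.items()
    }

    # Highest weight first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(word, round(score, 3)) for word, score in ranked[:top_n]]
```

Words that are frequent in the chosen document but rare across the collection rank highest, which is consistent with terms such as 'upgrade' and 'uploading' leading the list for this post.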
simIndex simValue blogId blogTitle
same-blog 1 0.99999988 262 high scalability-2008-02-26-Architecture to Allow High Availability File Upload
2 0.14124906 41 high scalability-2007-07-30-Product: Flickr
Introduction: Flickr offers a free basic account with limited upload bandwidth and limited storage. Download bandwidth is unlimited. Upgrading to a paid Pro account for $25/year removes all upload and storage restrictions. Flickr's terms of use warn that "professional or corporate uses of Flickr are prohibited", and all external images require a link back to Flickr.
Introduction: This is a guest post by Dave Hagler, Systems Architect at AOL. The AOL homepages receive more than 8 million visitors per day. That’s more daily viewers than Good Morning America or the Today Show on television. Over a billion page views are served each month. AOL.com has been a major internet destination since 1996, and still has a strong following of loyal users. The architecture for AOL.com is in its 5th generation. It has essentially been rebuilt from scratch 5 times over two decades. The current architecture was designed 6 years ago. Pieces have been upgraded and new components have been added along the way, but the overall design remains largely intact. The code, tools, development and deployment processes are highly tuned over 6 years of continual improvement, making the AOL.com architecture battle-tested and very stable. The engineering team is made up of developers, testers, and operations and totals around 25 people. The majority are in Dulles, Virginia
4 0.12476418 372 high scalability-2008-08-27-Updating distributed web applications
Introduction: Hi, we've got a web application which runs without the common standalone application servers like Tomcat or JBoss; rather, it runs with an embedded Jetty server. Now we are planning to run instances of this application on multiple machines, with a load balancer serving the requests. The big question is: is there a common scenario for how to update these applications? Let's think of 10 instances on 10 machines (one instance per machine), where we want to update the version of each of these applications. The brute-force approach would be to stop all instances, update them, and then restart them. This is a lot of manual work ;) Another problem is downtime: to avoid it, one would have to shut down one server after another, but then there are multiple application versions around. Can someone please provide us with a hint for this problem? Perhaps papers, tools or something like that? Thanks a lot :)
5 0.11938266 516 high scalability-2009-02-19-Heavy upload server scalability
Introduction: Hi, We are running a backup solution that uploads every night the files our clients worked on during the day (Carbonite-like). We currently have about 10 GB of data per night, via HTTP PUT requests (one per file), and the files are written as-is to a NAS. Our architecture is basically composed of a load balancer (hardware, sticky sessions), 5 servers (Tomcat under RHEL4/5) and a NAS (NFS 3). Since our number of clients is rising (as is our system load), how would you recommend we scale our infrastructure? Hardware and software? Should we go towards NAS sharding, more servers, NIO on Tomcat...? Thanks for your input!
6 0.11832695 1402 high scalability-2013-02-07-Ask HighScalability: Web asset server concept - 3rd party software available?
7 0.11571358 532 high scalability-2009-03-11-Sharding and Connection Pools
8 0.11554503 126 high scalability-2007-10-20-Should you build your next website using 3tera's grid OS?
9 0.10996655 232 high scalability-2008-01-29-When things aren't scalable
10 0.10427602 319 high scalability-2008-05-14-Scaling an image upload service
11 0.10381992 691 high scalability-2009-08-31-Squarespace Architecture - A Grid Handles Hundreds of Millions of Requests a Month
12 0.099551797 1102 high scalability-2011-08-22-Strategy: Run a Scalable, Available, and Cheap Static Site on S3 or GitHub
13 0.099320509 152 high scalability-2007-11-13-Flickr Architecture
14 0.098530784 106 high scalability-2007-10-02-Secrets to Fotolog's Scaling Success
15 0.096446455 621 high scalability-2009-06-06-Graph server
16 0.09582819 808 high scalability-2010-04-12-Poppen.de Architecture
17 0.095602356 276 high scalability-2008-03-15-New Website Design Considerations
18 0.09533909 176 high scalability-2007-12-07-Synchronizing databases in different geographic locations
19 0.090276726 517 high scalability-2009-02-21-Google AppEngine - A Second Look
20 0.090097442 1288 high scalability-2012-07-23-Ask HighScalability: How Do I Build My MegaUpload + Itunes + YouTube Startup?
topicId topicWeight
[(0, 0.141), (1, 0.036), (2, -0.014), (3, -0.162), (4, -0.01), (5, -0.057), (6, 0.005), (7, -0.017), (8, 0.02), (9, 0.052), (10, -0.036), (11, -0.01), (12, -0.0), (13, -0.035), (14, 0.094), (15, -0.043), (16, 0.009), (17, 0.015), (18, -0.005), (19, -0.0), (20, 0.015), (21, -0.009), (22, -0.014), (23, 0.005), (24, -0.01), (25, -0.046), (26, 0.029), (27, -0.067), (28, -0.004), (29, -0.062), (30, 0.019), (31, -0.018), (32, 0.006), (33, -0.064), (34, -0.026), (35, 0.053), (36, 0.044), (37, -0.053), (38, -0.004), (39, 0.009), (40, -0.026), (41, -0.001), (42, -0.035), (43, -0.034), (44, -0.029), (45, -0.016), (46, -0.017), (47, 0.063), (48, -0.01), (49, 0.08)]
simIndex simValue blogId blogTitle
same-blog 1 0.97412044 262 high scalability-2008-02-26-Architecture to Allow High Availability File Upload
2 0.78537399 251 high scalability-2008-02-18-How to deal with an I-O bottleneck to disk?
Introduction: A site I'm working with has an I/O bottleneck. They're using a static server to deliver all of the pictures, video content, zip downloads, et cetera, but now that the bandwidth out of that server is approaching 50 Mbit/second, the latency on serving small files has increased to an unacceptable level. I'm curious how other people have dealt with this situation. Separating into two different servers would require a significant change to the site's architecture (because the premise is that all uploads go into one server, all subdirectories are created in one directory, etc.) and may not really solve the problem.
3 0.72605115 1288 high scalability-2012-07-23-Ask HighScalability: How Do I Build My MegaUpload + Itunes + YouTube Startup?
Introduction: This question was sent in by Val, who is asking for a little help in creating the next big thing. Any ideas? I'm planning to run my own, first startup website and have been surfing the web for relevant info to plan the technology I will use for it (the frontend and the backend, including the software and the hardware). The website will be something like a combination of: MegaUpload (users will upload their files), iTunes (users will be paid for their uploads) and YouTube (in the future I'm planning to let users watch/listen to the content online, without downloading). I don't have any investors yet, nor the budget - I'm still preparing the idea and I'm going to create a first implementation (an "alpha version") before I show it to potential investors. Hence the initial technologies have to be extremely cheap *but* also highly scalable in the future so that I don't have to redo anything when the website grows. Unfortunately I don't have much experience in running
4 0.71184766 379 high scalability-2008-09-04-Database question for upcoming project
Introduction: We will be developing an RIA that will have a lot of database access. Think something like a QuickBooks but with about 50 transactions entered per hour per user. Users will be in the system for 7 to 9 hours a day and there will be around 20,000 users, all logged in at the same time. Reporting will be done just like a QuickBooks style app plus a lot of extra things you don't do in QuickBooks. Our operations is familiar with W2003 Server and MS SQL Server so they are recommending we stick with that. I originally requested Linux and PostgreSQL. How far can a single database server get me? If we have a 4 processor, 8 core, 128gb server, how far am I going to get before I need to shard or do something else? I know there are a lot of factors involved but in general for this size of a site, what should the strategy be? I've read almost all articles on this website but most of the applications are not RIA type of apps with this type of usage or they are architectures for
5 0.71052247 611 high scalability-2009-05-31-Need help on Site loading & database optimization - URGENT
Introduction: Hi Friends, I need some help making site access fast. On average my site gets 2,500 hits per day, and on 16th May it had 60,000 hits. On that day the site was loading very slowly and was even timing out. I also checked the running processes using the "top" command, which indicated that MySQL was taking too much load. There are around 166 tables (including the phpBB forum) in my database. All content on the site is displayed by fetching it from the database. I have also added indexes to the respective tables where required. Plain PHP/HTML coding is used. Technology: PHP -- 5.2 MYSQL -- 5.0 Apache -- 2.0 Linux Following is all the server details of my site: CPU : Single Socket Dual Core AMD Opteron 1212HE Memory: 2GB DDR RAM Hard Drive: 250GB SATA Ethernet: 100Mb Primary Ethernet Card (/var/log) # uname -a Linux 2.6.9-67.0.15.ELsmp #1 SMP Tue Apr 22 13:50:33 EDT 2008 i686 athlon i386 GNU/Linux kernel version: 2.6.9-67.0.15.ELsmp (/var/log) # free -m total used
6 0.6977492 232 high scalability-2008-01-29-When things aren't scalable
7 0.69550097 59 high scalability-2007-08-04-Try Squid as a Reverse Proxy
8 0.69544798 1268 high scalability-2012-06-20-Ask HighScalability: How do I organize millions of images?
9 0.68416178 256 high scalability-2008-02-21-Tracking usage of public resources - throttling accesses per hour
10 0.6804415 229 high scalability-2008-01-29-Building scalable storage into application - Instead of MogileFS OpenAFS etc.
11 0.68000674 329 high scalability-2008-05-27-Secure Remote Administration for Large-Scale Networks
12 0.67151093 8 high scalability-2007-07-12-Should I use LAMP or Windows?
13 0.66805154 488 high scalability-2009-01-08-file synchronization solutions
14 0.66651458 1402 high scalability-2013-02-07-Ask HighScalability: Web asset server concept - 3rd party software available?
15 0.66284275 167 high scalability-2007-11-27-Starting a website from scratch - what technologies should I use?
16 0.66225678 598 high scalability-2009-05-12-P2P server technology?
17 0.66219765 319 high scalability-2008-05-14-Scaling an image upload service
18 0.66217238 283 high scalability-2008-03-18-Shared filesystem on EC2
19 0.65303832 176 high scalability-2007-12-07-Synchronizing databases in different geographic locations
20 0.65297216 321 high scalability-2008-05-17-WebSphere Commerce High Availability and Performance Configurations
topicId topicWeight
[(1, 0.078), (2, 0.265), (5, 0.209), (61, 0.183), (79, 0.144)]
simIndex simValue blogId blogTitle
same-blog 1 0.94126093 262 high scalability-2008-02-26-Architecture to Allow High Availability File Upload
2 0.91583627 1113 high scalability-2011-09-09-Stuff The Internet Says On Scalability For September 9, 2011
Introduction: Scale the modern way / No brush / No lather / No rub-in / Big tube 35 cents - Drug stores / HighScalability: GAE Serves 1.5 Billion Pages a Day Potent quotables: @kendallmiller: The code changes I'm most proud of are the ones few people will ever see - like I just tripled the scalability of our session analysis. @Kellblog: Heard: "Cassandra is more a system on which you build a DBMS than a DBMS itself." @DDevine_au: Ah dammit. I'm thinking of using a #NoSQL database. Down the rabbit hole I go. A comprehensive guide to parallel video decoding. Emeric Grange with a sweet explanation of the decoding process. Node.js vs. Scala - "Scaling in the large". tedsuo tldrs it: in node, there is only one concurrency model. A number of other platforms offer multiple concurrency models. If you want access to one of those other models down the line, you will have to carve off that part of your application and rewrite i
3 0.90202463 153 high scalability-2007-11-13-Friendster Lost Lead Because of a Failure to Scale
Introduction: Hey, this scaling stuff might just be important. Jim Scheinman, former Bebo and Friendster exec, puts the blame squarely on Friendster's inability to scale as why they lost the social networking race: VB : Can you tell me a bit about what you learned in your time at Friendster? JS : For me, it basically came down to failed execution on the technology side — we had millions of Friendster members begging us to get the site working faster so they could log in and spend hours social networking with their friends. I remember coming in to the office for months reading thousands of customer service emails telling us that if we didn’t get our site working better soon, they’d be ‘forced to join’ a new social networking site that had just launched called MySpace…the rest is history. To be fair to Friendster’s technology team at the time, they were on the forefront of many new scaling and database issues that web sites simply hadn’t had to deal with prior to Friendster. As is often
4 0.89724278 1523 high scalability-2013-09-27-Stuff The Internet Says On Scalability For September 27, 2013
Introduction: Hey, it's HighScalability time: ( The WINLAB at Rutgers, with software defined radios tied into GENI. ) 384 cores & 32TB of RAM : Oracle's SPARC M6 Quotable Quotes: @jennyinc : 2003: "I replaced you with a set of very small shell scripts." 2013: "I replaced your scripts with a six-figure enterprise DevOps platform." @tomdale : OH: “Redis is so fast, why don’t we replace RAM with Redis?” @petrillic : OH "Promises/futures are the one-night stands of architectural constructs" nice #strangeloop @TwitterEng : "Java and Scala let Twitter readily share and modify its enormous codebase across a team of hundreds of developers." Lots of juicy numbers revealed at Structure:Europe : Netflix streams 114,000 years of video every month; Custom build Netflix boxes for its content-delivery network that contain between 100 and 150 terabytes of stor
5 0.89428896 534 high scalability-2009-03-12-Google TechTalk: Amdahl's Law in the Multicore Era
Introduction: Over the last several decades computer architects have been phenomenally successful turning the transistor bounty provided by Moore's Law into chips with ever increasing single-threaded performance. During many of these successful years, however, many researchers paid scant attention to multiprocessor work. Now as vendors turn to multicore chips, researchers are reacting with more papers on multi-threaded systems. While this is good, we are concerned that further work on single-thread performance will be squashed. To help understand future high-level trade-offs, we develop a corollary to Amdahl's Law for multicore chips [Hill & Marty, IEEE Computer 2008]. It models fixed chip resources for alternative designs that use symmetric cores, asymmetric cores, or dynamic techniques that allow cores to work together on sequential execution. Our results encourage multicore designers to view performance of the entire chip rather than focus on core efficiencies. Moreover, we observe that obtai
6 0.88479829 899 high scalability-2010-09-09-How did Google Instant become Faster with 5-7X More Results Pages?
7 0.87945068 1273 high scalability-2012-06-27-Paper: Logic and Lattices for Distributed Programming
8 0.87799537 485 high scalability-2009-01-05-Messaging is not just for investment banks
9 0.87308413 595 high scalability-2009-05-08-Publish-subscribe model does not scale?
10 0.86857563 1487 high scalability-2013-07-05-Stuff The Internet Says On Scalability For July 5, 2013
11 0.85336387 341 high scalability-2008-06-06-GigaOm Structure 08 Conference on June 25th in San Francisco
13 0.85061544 1247 high scalability-2012-05-18-Stuff The Internet Says On Scalability For May 18, 2012
14 0.84962648 1535 high scalability-2013-10-21-Google's Sanjay Ghemawat on What Made Google Google and Great Big Data Career Advice
15 0.84807396 1242 high scalability-2012-05-09-Cell Architectures
16 0.84724563 1142 high scalability-2011-11-14-Using Gossip Protocols for Failure Detection, Monitoring, Messaging and Other Good Things
17 0.84404647 1337 high scalability-2012-10-10-Antirez: You Need to Think in Terms of Organizing Your Data for Fetching
18 0.83917934 169 high scalability-2007-12-01-many website, one setup, many databases
19 0.83858377 1153 high scalability-2011-12-08-Update on Scalable Causal Consistency For Wide-Area Storage With COPS
20 0.83498001 1461 high scalability-2013-05-20-The Tumblr Architecture Yahoo Bought for a Cool Billion Dollars