high_scalability high_scalability-2008 high_scalability-2008-319 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Hi, First of all, I want to say that this is an extremely interesting and informative website. I have enjoyed reading the various posts on how the big sites scale to meet the needs of their customers. The service we are developing is a webcam service. The client application sends images to the server via HTTP POST, and they are saved in a folder specified by the user's ID. When a new image is sent to the server, it overwrites the current image. Users can then view the images via our web server. Ideally, we want the images to upload as quickly as possible and allow users to view them as quickly as possible. Would I be correct to assume that when the number of uploading clients exceeds the capability of the server, the only way to scale is to add more hardware? Also, I assume that using HTTP accelerator caches will not speed up viewing the images, as the new images will invalidate the cache. I appreciate any input on the subject.
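To make the overwrite-and-view flow described above concrete, here is a minimal sketch in Python. It assumes one file per user under a per-user folder; the names `save_latest_image`, `conditional_get`, `etag_for`, and `current.jpg` are hypothetical and do not come from the original post. It writes each upload to a temporary file and renames it over the old image, so viewers never read a half-written file. It also derives an ETag from the image bytes, so a cache or browser can revalidate with `If-None-Match` and receive a small 304 response while the image is unchanged. Whether that helps in practice depends on how often each camera uploads compared with how often its image is viewed.

```python
import hashlib
import os
import tempfile


def etag_for(data: bytes) -> str:
    """Derive a validator from the image bytes (hypothetical scheme)."""
    return '"' + hashlib.sha256(data).hexdigest()[:16] + '"'


def save_latest_image(root: str, user_id: str, data: bytes) -> str:
    """Atomically replace the user's current image and return its ETag.

    Assumes user_id has already been validated (no path separators).
    """
    user_dir = os.path.join(root, user_id)
    os.makedirs(user_dir, exist_ok=True)
    # Write to a temp file in the same directory, then rename it over the
    # old image so readers see either the old file or the new one.
    fd, tmp_path = tempfile.mkstemp(dir=user_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_path, os.path.join(user_dir, "current.jpg"))
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return etag_for(data)


def conditional_get(root: str, user_id: str, if_none_match):
    """Return (status, body): 304 when the client's ETag is still current."""
    with open(os.path.join(root, user_id, "current.jpg"), "rb") as f:
        data = f.read()
    if if_none_match == etag_for(data):
        return 304, b""
    return 200, data
```

A viewer that sends back the ETag it last received gets the full image again only after a new upload has replaced the file.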
sentIndex sentText sentNum sentScore
1 Hi, First of all, I want to say that this is an extremely interesting and informative website. [sent-1, score-0.382]
2 I have enjoyed reading the various posts on how the big sites scale to meet the needs of their customers. [sent-2, score-0.758]
3 The service we are developing is a webcam service. [sent-3, score-0.096]
4 The client application sends images to the server via HTTP POST, and they are saved in a folder specified by the user's ID. [sent-4, score-1.508]
5 When a new image is sent to the server, it overwrites the current image. [sent-5, score-0.669]
6 Users can then view the images via our web server. [sent-6, score-0.781]
7 Ideally we want the images to upload as quickly as possible and allow users to view them as quickly as possible. [sent-7, score-1.372]
8 Would I be correct to assume that when the number of uploading clients exceeds the capability of the server, the only way to scale is to add more hardware? [sent-8, score-1.142]
9 Also, I assume that using HTTP accelerator caches will not speed up viewing the images, as the new images will invalidate the cache. [sent-9, score-1.953]
wordName wordTfidf (topN-words)
[('images', 0.497), ('overwrite', 0.246), ('assume', 0.237), ('folder', 0.2), ('enjoyed', 0.184), ('accelerator', 0.172), ('exceeds', 0.168), ('invalidate', 0.168), ('specified', 0.166), ('uploading', 0.164), ('viewing', 0.157), ('appreciate', 0.152), ('view', 0.151), ('quickly', 0.147), ('sends', 0.136), ('via', 0.133), ('upload', 0.131), ('capability', 0.128), ('input', 0.122), ('saved', 0.122), ('correct', 0.12), ('subject', 0.12), ('http', 0.111), ('caches', 0.11), ('posts', 0.107), ('sent', 0.102), ('server', 0.098), ('image', 0.098), ('developing', 0.096), ('extremely', 0.095), ('informative', 0.094), ('meet', 0.094), ('clients', 0.092), ('reading', 0.087), ('users', 0.084), ('want', 0.081), ('various', 0.08), ('current', 0.076), ('allow', 0.076), ('sites', 0.075), ('client', 0.072), ('scale', 0.069), ('speed', 0.066), ('add', 0.066), ('say', 0.064), ('needs', 0.062), ('possible', 0.058), ('post', 0.05), ('new', 0.049), ('interesting', 0.048)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 319 high scalability-2008-05-14-Scaling an image upload service
2 0.25904316 1268 high scalability-2012-06-20-Ask HighScalability: How do I organize millions of images?
Introduction: Does anyone have any advice or suggestions on how to store millions of images? Currently images are stored in an MS SQL database, which performance-wise isn't ideal. We'd like to migrate the images over to a file system structure, but I'd assume we don't just want to dump millions of images into a single directory. Besides having to contend with naming collisions, the Windows filesystem might not perform optimally with that many files. I'm assuming one approach may be to assign each user a unique CSLID, create a folder based on the CSLID, and then place one user's files in that particular folder. Even so, this could result in hundreds of thousands of folders. What's the best organizational scheme/hierarchy for doing this?
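One common way to keep any single directory small is to fan the per-user folders out under a few levels of hash-derived subdirectories. The sketch below is only an illustration of that idea, not something proposed in the post. The function name `sharded_path`, the use of SHA-256, and the choice of two levels of two hex characters are all assumptions. With those defaults there are 256 × 256 = 65,536 buckets, so a few hundred thousand user folders would average only a handful per bucket.

```python
import hashlib
import os


def sharded_path(root: str, user_id: str, filename: str, levels: int = 2) -> str:
    """Build root/ab/cd/<user_id>/<filename> from a hash of the user ID.

    The hash only spreads folders evenly; the user ID stays in the path,
    so a user's files remain together and easy to locate.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    # Take successive two-character slices of the digest as directory names.
    shards = [digest[2 * i: 2 * i + 2] for i in range(levels)]
    return os.path.join(root, *shards, user_id, filename)
```

Because the path is a pure function of the user ID, no lookup table is needed to find a user's folder again.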
3 0.20525019 5 high scalability-2007-07-10-mixi.jp Architecture
Introduction: Mixi is a fast-growing social networking site in Japan. They provide services like: diary, community, message, review, and photo album. Having a lot in common with LiveJournal they also developed many of the same approaches. Their write-up on how they scaled their system is easily one of the best out there. Site: http://mixi.jp Information Sources mixi.jp - scaling out with open source Platform Linux Apache MySQL Perl Memcached Squid Shard What's Inside? They grew to approximately 4 million users in two years and add over 15,000 new users/day. Ranks 35th on Alexa and 3rd in Japan. More than 100 MySQL servers Add more than 10 servers/month Use non-persistent connections. Diary traffic is 85% read and 15% write. Message traffic is 75% read and 25% write. Ran into replication performance problems so they had to split the database. Considered splitting vertically by user or splitting horizontally by table type. The ende
4 0.20296265 291 high scalability-2008-03-29-20 New Rules for Faster Web Pages
Introduction: Update: Nice explanation in The importance of bandwidth versus latency of how long latencies cause cascading delays in resource loading. Doloto tries to optimize how resources are loaded. Twenty new rules have been added to the original 14 rules for sizzling web performance. Part of scalability is worrying about performance too. The front-end is where 80-90% of end-user response time is spent and following these best practices improved the performance of Yahoo! properties by 25-50%. The rules are divided into server, content, cookie, JavaScript, CSS, images, and mobile categories. The new rules are: Flush the buffer early [server] Use GET for AJAX requests [server] Post-load components [content] Preload components [content] Reduce the number of DOM elements [content] Split components across domains [content] Minimize the number of iframes [content] No 404s [content] Reduce cookie size [cookie] Use cookie-free domains for components [coo
5 0.14754501 261 high scalability-2008-02-25-Make Your Site Run 10 Times Faster
Introduction: This is what Mike Peters says he can do : make your site run 10 times faster. His test bed is "half a dozen servers parsing 200,000 pages per hour over 40 IP addresses, 24 hours a day." Before optimization CPU spiked to 90% with 50 concurrent connections. After optimization each machine "was effectively handling 500 concurrent connections per second with CPU at 8% and no degradation in performance." Mike identifies six major bottlenecks: Database write access (read is cheaper) Database read access PHP, ASP, JSP and any other server side scripting Client side JavaScript Multiple/Fat Images, scripts or css files from different domains on your page Slow keep-alive client connections, clogging your available sockets Mike's solutions: Switch all database writes to offline processing Minimize number of database read access to the bare minimum. No more than two queries per page. Denormalize your database and Optimize MySQL tables Implement MemCached and cha
6 0.11784653 1164 high scalability-2011-12-27-PlentyOfFish Update - 6 Billion Pageviews and 32 Billion Images a Month
7 0.11416443 41 high scalability-2007-07-30-Product: Flickr
8 0.10806181 638 high scalability-2009-06-26-PlentyOfFish Architecture
9 0.10687011 1361 high scalability-2012-11-22-Gone Fishin': PlentyOfFish Architecture
10 0.10516986 399 high scalability-2008-10-01-Joyent - Cloud Computing Built on Accelerators
11 0.10467443 245 high scalability-2008-02-12-Product: rPath - Creating and Managing Virtual Appliances
12 0.10427602 262 high scalability-2008-02-26-Architecture to Allow High Availability File Upload
13 0.10123631 1333 high scalability-2012-10-04-LinkedIn Moved from Rails to Node: 27 Servers Cut and Up to 20x Faster
14 0.094372734 1402 high scalability-2013-02-07-Ask HighScalability: Web asset server concept - 3rd party software available?
15 0.091318242 978 high scalability-2011-01-26-Google Pro Tip: Use Back-of-the-envelope-calculations to Choose the Best Design
16 0.08551608 26 high scalability-2007-07-25-Paper: Lightweight Web servers
17 0.084614798 106 high scalability-2007-10-02-Secrets to Fotolog's Scaling Success
18 0.084148183 941 high scalability-2010-11-15-How Google's Instant Previews Reduces HTTP Requests
19 0.08378724 286 high scalability-2008-03-20-Paper: Asynchronous HTTP and Comet architectures
20 0.083085172 1259 high scalability-2012-06-07-3 Secrets to Lightning Fast Mobile Design at Instagram
topicId topicWeight
[(0, 0.111), (1, 0.043), (2, -0.02), (3, -0.109), (4, 0.01), (5, -0.052), (6, 0.006), (7, -0.019), (8, -0.016), (9, 0.051), (10, -0.02), (11, 0.005), (12, 0.004), (13, -0.038), (14, 0.031), (15, 0.026), (16, 0.01), (17, 0.005), (18, 0.034), (19, -0.054), (20, -0.031), (21, -0.006), (22, -0.035), (23, -0.027), (24, 0.036), (25, -0.005), (26, 0.014), (27, -0.016), (28, -0.001), (29, -0.029), (30, 0.016), (31, -0.014), (32, 0.021), (33, -0.01), (34, -0.011), (35, 0.028), (36, 0.091), (37, 0.054), (38, 0.024), (39, -0.11), (40, -0.024), (41, -0.014), (42, -0.013), (43, 0.006), (44, -0.034), (45, 0.024), (46, 0.022), (47, -0.021), (48, -0.03), (49, 0.104)]
simIndex simValue blogId blogTitle
same-blog 1 0.96685565 319 high scalability-2008-05-14-Scaling an image upload service
2 0.81339657 1268 high scalability-2012-06-20-Ask HighScalability: How do I organize millions of images?
Introduction: Does anyone have any advice or suggestions on how to store millions of images? Currently images are stored in an MS SQL database, which performance-wise isn't ideal. We'd like to migrate the images over to a file system structure, but I'd assume we don't just want to dump millions of images into a single directory. Besides having to contend with naming collisions, the Windows filesystem might not perform optimally with that many files. I'm assuming one approach may be to assign each user a unique CSLID, create a folder based on the CSLID, and then place one user's files in that particular folder. Even so, this could result in hundreds of thousands of folders. What's the best organizational scheme/hierarchy for doing this?
3 0.76597667 135 high scalability-2007-10-27-.Net2 and AJAX scalability?
Introduction: Am I mad to consider using .Net2 and AJAX for a high-scalability application? In case you wonder why, it's the legacy of a website built on IIS and .Net 1.1, and we're looking for ways to make the content more attractive and interactive. In this case, it's a medical image library being shared by a few Wikis and online coursework for medical students (< 15K users) and doctors (< 150K users). But I'm worried about the performance overhead. We already have a performance problem because of personalising the content for users according to their type (student or doctor), and for doctors, their grade and speciality.
4 0.7510938 291 high scalability-2008-03-29-20 New Rules for Faster Web Pages
Introduction: Update: Nice explanation in The importance of bandwidth versus latency of how long latencies cause cascading delays in resource loading. Doloto tries to optimize how resources are loaded. Twenty new rules have been added to the original 14 rules for sizzling web performance. Part of scalability is worrying about performance too. The front-end is where 80-90% of end-user response time is spent and following these best practices improved the performance of Yahoo! properties by 25-50%. The rules are divided into server, content, cookie, JavaScript, CSS, images, and mobile categories. The new rules are: Flush the buffer early [server] Use GET for AJAX requests [server] Post-load components [content] Preload components [content] Reduce the number of DOM elements [content] Split components across domains [content] Minimize the number of iframes [content] No 404s [content] Reduce cookie size [cookie] Use cookie-free domains for components [coo
5 0.65303779 251 high scalability-2008-02-18-How to deal with an I-O bottleneck to disk?
Introduction: A site I'm working with has an I/O bottleneck. They're using a static server to deliver all of the pictures, video content, zip downloads, et cetera, but now that the bandwidth out of that server is approaching 50 Mbit/second, the latency on serving small files has increased to become unacceptable. I'm curious how other people have dealt with this situation. Separating into two different servers would require a significant change to the site's architecture (because the premise is that all uploads go into one server, all subdirectories are created in one directory, etc.) and may not really solve the problem.
6 0.65119159 59 high scalability-2007-08-04-Try Squid as a Reverse Proxy
7 0.64067304 176 high scalability-2007-12-07-Synchronizing databases in different geographic locations
8 0.63096339 598 high scalability-2009-05-12-P2P server technology?
9 0.60504794 262 high scalability-2008-02-26-Architecture to Allow High Availability File Upload
10 0.60051352 942 high scalability-2010-11-15-Strategy: Biggest Performance Impact is to Reduce the Number of HTTP Requests
11 0.59382486 150 high scalability-2007-11-12-Slashdot Architecture - How the Old Man of the Internet Learned to Scale
12 0.59178901 26 high scalability-2007-07-25-Paper: Lightweight Web servers
13 0.59020019 1333 high scalability-2012-10-04-LinkedIn Moved from Rails to Node: 27 Servers Cut and Up to 20x Faster
14 0.58354992 67 high scalability-2007-08-17-What is the best hosting option?
15 0.5814181 100 high scalability-2007-09-26-Use a CDN to Instantly Improve Your Website's Performance by 20% or More
16 0.57893741 1402 high scalability-2013-02-07-Ask HighScalability: Web asset server concept - 3rd party software available?
17 0.5785985 856 high scalability-2010-07-12-Creating Scalable Digital Libraries
18 0.57795417 1579 high scalability-2014-01-14-SharePoint VPS solution
19 0.57778013 204 high scalability-2008-01-08-Virus Scanning for Uploaded content
20 0.57753766 573 high scalability-2009-04-16-Serving 250M quotes-day at CNBC.com with aiCache
topicId topicWeight
[(1, 0.181), (2, 0.217), (30, 0.126), (49, 0.275), (79, 0.011), (94, 0.063)]
simIndex simValue blogId blogTitle
1 0.89632863 400 high scalability-2008-10-01-The Pattern Bible for Distributed Computing
Introduction: Software design patterns are an emerging tool for guiding and documenting system design. Patterns usually describe software abstractions used by advanced designers and programmers in their software. Patterns can provide guidance for designing highly scalable distributed systems. Let's see how! Patterns are in essence solutions to problems. Most of them are expressed in a format called Alexandrian form which draws on constructs used by Christopher Alexander. There are variants but most look like this: The pattern name The problem the pattern is trying to solve Context Solution Examples Design rationale: This tells where the pattern came from, why it works, and why experts use it Patterns rarely stand alone. Each pattern works on a context, and transforms the system in that context to produce a new system in a new context. New problems arise in the new system and context, and the next ‘‘layer’’ of patterns can be applied. A pattern language is a structured col
same-blog 2 0.82907963 319 high scalability-2008-05-14-Scaling an image upload service
3 0.80899072 843 high scalability-2010-06-16-WTF is Elastic Data Grid? (By Example)
Introduction: Forrester released their new wave report: The Forrester Wave™: Elastic Caching Platforms, Q2 2010, where they listed GigaSpaces, IBM, Oracle, and Terracotta as leading vendors in the field. In this post I'd like to take some time to explain what some of these terms mean, and why they’re important to you. I’ll start with a definition of Elastic Data Grid (Elastic Caching), how it is different than other caching and NoSQL alternatives, and more importantly -- I'll illustrate how it works through some real code examples. You can read the full story here.
4 0.80777478 321 high scalability-2008-05-17-WebSphere Commerce High Availability and Performance Configurations
Introduction: Nobody came up with an example of a website powered by a Websphere product (which has a community edition) and backed up by a DB2 database. I guess you all know about usopen.org so here's the story: While the re-emergence of 35-year-old Andre Agassi and the continued dominance of wunderkind Maria Sharapova have highlighted the on-court headlines at this year's U.S. Open Tennis Championships in Flushing Meadows, N.Y., IBM is hoping its new Power5 chip-based IT support for USOpen.org can make news among those more interested in .NET than tennis nets. Big Blue has partnered with the U.S. Tennis Association and the U.S. Open -- the most prestigious tennis tournament in the U.S. -- since 1992. Together, they launched USOpen.org in 1995 so racket heads could follow the matches online. The iSeries' role this year is in powering a Web-based end-user application called "Point Tracker," a graphics tool using autonomic technology that recreates the trajectory of every shot. On-c
5 0.79751533 737 high scalability-2009-11-05-A Yes for a NoSQL Taxonomy
Introduction: NorthScale's Steven Yen in his highly entertaining NoSQL is a Horseless Carriage presentation has come up with a NoSQL taxonomy that thankfully focuses a little more on what NoSQL is than what it isn't: key‐value‐cache memcached, repcached, coherence, infinispan, eXtreme scale, jboss cache, velocity, terracotta key‐value‐store keyspace, flare, schema‐free, RAMCloud eventually‐consistent key‐value‐store dynamo, voldemort, Dynomite, SubRecord, MotionDb, Dovetaildb ordered‐key‐value‐store tokyo tyrant, lightcloud, NMDB, luxio, memcachedb, actord data‐structures server redis tuple‐store gigaspaces, coord, apache river object database ZopeDB, db4o, Shoal document store CouchDB, Mongo, Jackrabbit, XML Databases, ThruDB, CloudKit, Persevere, Riak Basho, Scalaris wide columnar store BigTable, Hbase, Cassandra, Hypertable, KAI, OpenNeptune, Qbase, KDI "Who will win?"
6 0.76162046 35 high scalability-2007-07-28-Product: FastStats Log Analyzer
7 0.75978094 735 high scalability-2009-11-01-Squeeze more performance from Parallelism
8 0.75676626 1311 high scalability-2012-08-24-Stuff The Internet Says On Scalability For August 24, 2012
9 0.73941302 310 high scalability-2008-04-29-High performance file server
10 0.73212886 1081 high scalability-2011-07-18-Building your own Facebook Realtime Analytics System
11 0.73025852 263 high scalability-2008-02-27-Product: System Imager - Automate Deployment and Installs
12 0.72514158 399 high scalability-2008-10-01-Joyent - Cloud Computing Built on Accelerators
13 0.7231409 783 high scalability-2010-02-24-Hot Scalability Links for February 24, 2010
14 0.72193569 1051 high scalability-2011-06-01-Why is your network so slow? Your switch should tell you.
15 0.71920186 699 high scalability-2009-09-10-How to handle so many socket connection
16 0.7145381 1646 high scalability-2014-05-12-4 Architecture Issues When Scaling Web Applications: Bottlenecks, Database, CPU, IO
17 0.71400028 788 high scalability-2010-03-04-How MySpace Tested Their Live Site with 1 Million Concurrent Users
18 0.71372396 1258 high scalability-2012-06-05-Thesis: Concurrent Programming for Scalable Web Architectures
19 0.70679206 1482 high scalability-2013-06-26-Leveraging Cloud Computing at Yelp - 102 Million Monthly Vistors and 39 Million Reviews
20 0.70048225 1114 high scalability-2011-09-13-Must see: 5 Steps to Scaling MongoDB (Or Any DB) in 8 Minutes