high_scalability high_scalability-2008 high_scalability-2008-261 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: This is what Mike Peters says he can do: make your site run 10 times faster. His test bed is "half a dozen servers parsing 200,000 pages per hour over 40 IP addresses, 24 hours a day." Before optimization, CPU spiked to 90% with 50 concurrent connections. After optimization each machine "was effectively handling 500 concurrent connections per second with CPU at 8% and no degradation in performance." Mike identifies six major bottlenecks: database write access (read is cheaper); database read access; PHP, ASP, JSP and any other server-side scripting; client-side JavaScript; multiple/fat images, scripts or CSS files from different domains on your page; slow keep-alive client connections, clogging your available sockets. Mike's solutions: switch all database writes to offline processing; minimize database read access to the bare minimum, with no more than two queries per page; denormalize your database and optimize MySQL tables; implement MemCached and cha
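The advice above to switch database writes to offline processing can be illustrated with a small write-behind queue. This is only a sketch under stated assumptions, not the setup described in the post: SQLite stands in for MySQL, and the names `page_hits`, `record_hit`, `writer_loop`, and `STOP` are hypothetical.

```python
import queue
import sqlite3
import threading

# Hypothetical names throughout; none of them come from the original post.
write_queue = queue.Queue()
STOP = object()  # sentinel that tells the writer thread to shut down


def record_hit(url):
    """Called on the request path: enqueue the write and return immediately."""
    write_queue.put(url)


def writer_loop(db_path, batch_size=100):
    """Background worker: drain the queue and commit the writes in batches."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS page_hits (url TEXT)")
    batch = []
    while True:
        item = write_queue.get()  # blocks until a write or STOP arrives
        if item is not STOP:
            batch.append((item,))
        # Flush when the batch is full, or when shutting down with rows pending.
        if batch and (item is STOP or len(batch) >= batch_size):
            conn.executemany("INSERT INTO page_hits (url) VALUES (?)", batch)
            conn.commit()
            batch.clear()
        if item is STOP:
            break
    conn.close()
```

The request handler would call `record_hit(...)` while `writer_loop` runs in a `threading.Thread`. A production version would probably also flush on a timer so that a partly filled batch does not wait indefinitely; that is left out here to keep the sketch short.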
sentIndex sentText sentNum sentScore
1 His test bed is "half a dozen servers parsing 200,000 pages per hour over 40 IP addresses, 24 hours a day." [sent-2, score-0.444]
2 Before optimization CPU spiked to 90% with 50 concurrent connections. [sent-3, score-0.393]
3 After optimization each machine "was effectively handling 500 concurrent connections per second with CPU at 8% and no degradation in performance." [sent-4, score-0.528]
4 Denormalize your database and optimize MySQL tables; implement MemCached and change your database-access layer to fetch information from the in-memory database first. [sent-7, score-0.443]
5 If your system has high reads, keep MySQL tables as MyISAM. [sent-9, score-0.12]
6 If your system has high writes, switch MySQL tables to InnoDB. [sent-10, score-0.12]
7 Precompile all PHP scripts using eAccelerator; if you're using WordPress, implement WP-Cache; reduce the size of all images by using an image optimizer; merge multiple css/js files into one; minify your .js scripts. [sent-12, score-0.656]
8 Avoid hardlinking to images or scripts residing on other domains. [sent-13, score-0.957]
9 YSlow analyzes your web pages on the fly, giving you a performance grade and recommending the changes you need to make. [sent-18, score-0.359]
10 conf to kill connections after 5 seconds of inactivity, turn gzip compression on. [sent-20, score-0.37]
11 Configure Apache to add Expire and ETag headers, allowing client web browsers to cache images, . [sent-21, score-0.195]
12 js files Consider dumping Apache and replacing it with Lighttpd or Nginx. [sent-23, score-0.316]
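The recommendation in the sentences above to have the database-access layer consult the in-memory store first is commonly called a cache-aside read. A minimal sketch follows, assuming a plain dict as a stand-in for a Memcached client and SQLite as a stand-in for MySQL; the `CachedReader` class, the `pages` table, and the key format are hypothetical illustrations.

```python
import sqlite3


class CachedReader:
    """Cache-aside reads: check memory first, query the database on a miss."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}    # stand-in for a Memcached client (assumption)
        self.db_reads = 0  # counts how often the database is actually queried

    def get_title(self, page_id):
        key = f"page_title:{page_id}"
        if key in self.cache:  # cache hit: no database access at all
            return self.cache[key]
        self.db_reads += 1     # cache miss: fall back to the database
        row = self.conn.execute(
            "SELECT title FROM pages WHERE id = ?", (page_id,)
        ).fetchone()
        title = row[0] if row else None
        self.cache[key] = title  # store the result (even None) for later reads
        return title
```

Repeated calls to `get_title` for the same id reach the database only once. A real deployment would also need expiry or invalidation when a row changes, which this sketch leaves out.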
wordName wordTfidf (topN-words)
[('scripts', 0.302), ('images', 0.252), ('mike', 0.18), ('connections', 0.165), ('clogging', 0.156), ('etag', 0.156), ('inactivity', 0.156), ('side', 0.147), ('firebug', 0.14), ('jsp', 0.14), ('bed', 0.14), ('minify', 0.14), ('optimization', 0.137), ('recommending', 0.135), ('spiked', 0.135), ('asp', 0.131), ('yslow', 0.131), ('dumping', 0.127), ('gzip', 0.124), ('identifies', 0.124), ('firefox', 0.121), ('wordpress', 0.121), ('concurrent', 0.121), ('tables', 0.12), ('expire', 0.115), ('database', 0.115), ('grade', 0.113), ('headers', 0.111), ('pages', 0.111), ('apache', 0.107), ('degradation', 0.105), ('browsers', 0.104), ('parsing', 0.103), ('mysql', 0.103), ('files', 0.102), ('references', 0.101), ('residing', 0.101), ('writes', 0.095), ('bare', 0.094), ('fetch', 0.093), ('lighttpd', 0.092), ('client', 0.091), ('dozen', 0.09), ('css', 0.089), ('replacing', 0.087), ('read', 0.086), ('offline', 0.086), ('fly', 0.086), ('domains', 0.085), ('kill', 0.081)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 261 high scalability-2008-02-25-Make Your Site Run 10 Times Faster
2 0.17791173 47 high scalability-2007-07-30-Product: Yslow to speed up your web pages
Introduction: Update : Speed up Apache - how I went from F to A in YSlow . Good example of using YSlow to speed up a website with solid code examples. Every layer in the multi-layer cake that is your website contributes to how long a page takes to display. YSlow , from Yahoo, is a cool tool for discovering how the ingredients of your site's top layer contribute to performance. YSlow analyzes web pages and tells you why they're slow based on the rules for high performance web sites. YSlow is a Firefox add-on integrated with the popular Firebug web development tool. YSlow gives you: Performance report card HTTP/HTML summary List of components in the page Tools including JSLint
3 0.15658796 291 high scalability-2008-03-29-20 New Rules for Faster Web Pages
Introduction: Update: Nice explanation in The importance of bandwidth versus latency of how long latencies cause cascading delays in resource loading. Doloto tries to optimize how resources are loaded. Twenty new rules have been added to the original 14 rules for sizzling web performance. Part of scalability is worrying about performance too. The front-end is where 80-90% of end-user response time is spent and following these best practices improved the performance of Yahoo! properties by 25-50%. The rules are divided into server, content, cookie, JavaScript, CSS, images, and mobile categories. The new rules are: Flush the buffer early [server] Use GET for AJAX requests [server] Post-load components [content] Preload components [content] Reduce the number of DOM elements [content] Split components across domains [content] Minimize the number of iframes [content] No 404s [content] Reduce cookie size [cookie] Use cookie-free domains for components [coo
4 0.14754501 319 high scalability-2008-05-14-Scaling an image upload service
Introduction: Hi, first of all I want to say that this is an extremely interesting and informative website. I have enjoyed reading the various posts on how the big sites scale to meet the needs of their customers. The service we are developing is a webcam service. The client application sends images to the server via HTTP POST and they are saved in a folder specified by the user's ID. When a new image is sent to the server it will overwrite the current image. Users can then view the images via our web server. Ideally we want the images to upload as quickly as possible and allow users to view them as quickly as possible. Would I be correct to assume that when the number of uploading clients exceeds the capability of the server, the only way to scale is to add more hardware? Also, I assume that HTTP accelerator caches will not speed up viewing the images, as the new images will invalidate the cache. I appreciate any input on the subject.
5 0.13448043 638 high scalability-2009-06-26-PlentyOfFish Architecture
Introduction: Update 5: PlentyOfFish Update - 6 Billion Pageviews And 32 Billion Images A Month. Update 4: Jeff Atwood costs out Markus' scale up approach against a scale out approach and finds scale up wanting. The discussion in the comments is as interesting as the article. My guess is Markus doesn't want to rewrite his software to work across a scale out cluster so even if it's more expensive scale up works better for his needs. Update 3: POF now has 200 million images and serves 10,000 images per second. They'll be moving to a 250,000 IOPS RamSan to handle the load. Also upgraded to a core database machine with 512 GB of RAM, 32 CPUs, SQLServer 2008 and Windows 2008. Update 2: This seems to be a POF Peer1 love fest infomercial. It's pretty content free, but the production values are high. Lots of quirky sounds and fish swimming on the screen. Update: by Facebook standards Read/WriteWeb says POF is worth a cool one billion dollars. It helps to talk like Dr. Evil whe
6 0.13299739 1361 high scalability-2012-11-22-Gone Fishin': PlentyOfFish Architecture
7 0.12693086 5 high scalability-2007-07-10-mixi.jp Architecture
8 0.12598448 360 high scalability-2008-08-04-A Bunch of Great Strategies for Using Memcached and MySQL Better Together
9 0.12572911 32 high scalability-2007-07-26-Product: eAccelerator a PHP Accelerator
10 0.12016861 1501 high scalability-2013-08-13-In Memoriam: Lavabit Architecture - Creating a Scalable Email Service
11 0.11729131 1456 high scalability-2013-05-13-The Secret to 10 Million Concurrent Connections -The Kernel is the Problem, Not the Solution
12 0.10817187 1268 high scalability-2012-06-20-Ask HighScalability: How do I organize millions of images?
13 0.10768361 1508 high scalability-2013-08-28-Sean Hull's 20 Biggest Bottlenecks that Reduce and Slow Down Scalability
14 0.10757221 942 high scalability-2010-11-15-Strategy: Biggest Performance Impact is to Reduce the Number of HTTP Requests
16 0.10320947 152 high scalability-2007-11-13-Flickr Architecture
17 0.10085272 554 high scalability-2009-04-04-Digg Architecture
18 0.097584136 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App
19 0.096550204 274 high scalability-2008-03-12-YouTube Architecture
20 0.096459471 808 high scalability-2010-04-12-Poppen.de Architecture
topicId topicWeight
[(0, 0.154), (1, 0.077), (2, -0.067), (3, -0.177), (4, 0.004), (5, 0.046), (6, -0.001), (7, -0.034), (8, 0.002), (9, 0.027), (10, -0.025), (11, -0.093), (12, 0.053), (13, 0.013), (14, -0.004), (15, -0.011), (16, -0.005), (17, 0.011), (18, 0.04), (19, -0.04), (20, 0.035), (21, 0.031), (22, -0.053), (23, 0.021), (24, 0.062), (25, 0.036), (26, -0.033), (27, -0.006), (28, 0.032), (29, -0.062), (30, -0.034), (31, 0.003), (32, 0.004), (33, 0.042), (34, 0.024), (35, -0.001), (36, 0.012), (37, 0.078), (38, 0.001), (39, -0.069), (40, -0.049), (41, -0.005), (42, -0.008), (43, 0.016), (44, 0.004), (45, 0.0), (46, 0.03), (47, -0.013), (48, 0.006), (49, 0.034)]
simIndex simValue blogId blogTitle
same-blog 1 0.96162045 261 high scalability-2008-02-25-Make Your Site Run 10 Times Faster
2 0.76725441 136 high scalability-2007-10-28-Scaling Early Stage Startups
Introduction: Mark Maunder of No VC Required --who advocates not taking VC money lest you be turned into a frog instead of the prince (or princess) you were dreaming of--has an excellent slide deck on how to scale an early stage startup. His blog also has some good SEO tips and a very spooky widget showing the geographical location of his readers. Perfect for Halloween! What are Mark's otherworldly scaling strategies for startups? Site: http://novcrequired.com/ Information Sources: Slides from Seattle Tech Startup Talk. Scaling Early Stage Startups blog post by Mark Maunder. The Platform: Linux; an ISAM-type data store; Perl; Httperf is used for benchmarking; Websitepulse.com is used for perf monitoring. The Architecture: Performance matters because being slow could cost you 20% of your revenue. The UIE guys disagree, saying this ain't necessarily so. They explain their reasoning in Usability Tools Podcast: The Truth About Page Download Time. The idea i
3 0.73427188 203 high scalability-2008-01-07-How Ruby on Rails Survived a 550k Pageview Digging
Introduction: Shanti Braford details how his Ruby on Rails based website survived a 24 hour 550,000+ pageview digg attack. His post cleanly lays out all the juicy setup details, so there's not much I can add. Hosting costs $370 a month for 1 web server, 1 database server, and sufficient bandwidth. The site is built on RoR, nginx, MySQL, and 7 mongrel servers. He thinks Rails 2.0 has improved performance and credits database avoidance and fragment caching for much of the performance boost. Keep in mind his system is relatively static, but it's a very interesting and useful experience report.
4 0.73373538 1486 high scalability-2013-07-03-5 Rockin' Tips for Scaling PHP to 30,000 Concurrent Users Per Server
Introduction: Jonathan Block, CTO at RockThePost.com, a crowdfunding company, has written a nice set of tips for smaller sites on how to scale a service on EC2 using a small two-person development team. Their service has a typical small scale structure: PHP's Zend Framework 2; two m1.medium for web servers; ELB to split the load; master/slave MySQL database; Siege for load testing. The very sensible tips that can handle 30,000 concurrent users per web server: Use PHP's APC feature. APC is an opcode cache that is "really a requirement in order for a website to have a chance at performing well." Put everything that's not a .php request on a CDN. Don't serve static files from your web server. They put everything on S3 and use CloudFront as their CDN. Recent CloudFront problems have caused them to serve directly from S3. Don't make connections to other servers in your PHP code. Making connections to other servers blocks the server and slows down processing. Use the APC k
5 0.69475907 1615 high scalability-2014-03-19-Strategy: Three Techniques to Survive Traffic Surges by Quickly Scaling Your Site
Introduction: Matthew Might, as a first responder to a surprise traffic surge on his inexpensive Linode-hosted blog, took emergency steps that you might find useful in a similar situation: Find the bottleneck. Reloading the page in Firebug showed the first page took 24 seconds to load and after that everything else loaded quickly. In retrospect this burst meant the site was thread limited as the CPU was idle. Cut image sizes in half with a shell script using ImageMagick's convert. Load time is now 12 seconds. Turn dynamic content into static content using a static index.html file copied using the browser's "view source" feature. Load time is now 6 seconds. Added threads to the Apache configuration file. Load time is now 2 seconds. Crisis averted. Because of this quick thinking and quick action the patient survived to serve pages another day. And in fine post-mortem tradition some of the future changes are: Run a cron job to trigger an earlier alert. Email when requ
6 0.68096113 884 high scalability-2010-08-23-6 Ways to Kill Your Servers - Learning How to Scale the Hard Way
7 0.67206258 554 high scalability-2009-04-04-Digg Architecture
8 0.6715737 150 high scalability-2007-11-12-Slashdot Architecture - How the Old Man of the Internet Learned to Scale
9 0.67039484 800 high scalability-2010-03-26-Strategy: Caching 404s Saved the Onion 66% on Server Time
10 0.65585572 47 high scalability-2007-07-30-Product: Yslow to speed up your web pages
11 0.65307343 1268 high scalability-2012-06-20-Ask HighScalability: How do I organize millions of images?
12 0.65097308 314 high scalability-2008-05-03-Product: nginx
13 0.64604443 72 high scalability-2007-08-22-Wikimedia architecture
14 0.64539814 7 high scalability-2007-07-12-FeedBurner Architecture
15 0.64222717 829 high scalability-2010-05-20-Strategy: Scale Writes to 734 Million Records Per Day Using Time Partitioning
16 0.64212209 365 high scalability-2008-08-16-Strategy: Serve Pre-generated Static Files Instead Of Dynamic Pages
17 0.64085037 52 high scalability-2007-08-01-Product: Memcached
18 0.63690233 808 high scalability-2010-04-12-Poppen.de Architecture
19 0.63401252 821 high scalability-2010-05-03-MocoSpace Architecture - 3 Billion Mobile Page Views a Month
20 0.6285125 437 high scalability-2008-11-03-How Sites are Scaling Up for the Election Night Crush
topicId topicWeight
[(1, 0.09), (2, 0.19), (30, 0.481), (61, 0.09), (79, 0.056)]
simIndex simValue blogId blogTitle
1 0.98379987 131 high scalability-2007-10-25-Should JSPs be avoided for high scalability?
Introduction: I just heard about some web sites where Velocity templates are used to render HTML instead of using JSPs and all the processing is performed in servlets. Can JSPs cause issues with scalability? Thanks, Unmesh
2 0.94982904 991 high scalability-2011-02-16-Paper: An Experimental Investigation of the Akamai Adaptive Video Streaming
Introduction: Video is hot on the Internet and people are really interested in knowing how to make it work. Dan Rayburn has a post pointing to a fascinating paper: An Experimental Investigation of the Akamai Adaptive Video Streaming, which talks in some detail about the protocols big players like YouTube, Skype and Akamai use to serve video over an inherently video-unfriendly medium like the Internet. For Akamai they found: Each video is encoded in five versions at different bit rates and stored in separate files. The client sends commands to the server with an average inter-departure time of about 2 s, i.e. the control algorithm is executed on average every 2 seconds. Akamai uses only the video level to adapt the video source to the available bandwidth, whereas the frame rate of the video is kept constant. When a sudden drop in the available bandwidth occurs, short interruptions of the video playback can occur due to a large actuation delay. For a sudden increase of the avai
3 0.94615316 1016 high scalability-2011-04-04-Scaling Social Ecommerce Architecture Case study
Introduction: A recent study showed that over 92 percent of executives from leading retailers are focusing their marketing efforts on Facebook and subsequent applications. Furthermore, over 71 percent of users have confirmed they are more likely to make a purchase after “liking” a brand they find online. (source) Sears Architect Tomer Gabel provides an insightful overview on how they built a Social Ecommerce solution for Sears.com that can handle complex relationship queries in real time. Tomer goes through: the architectural considerations behind their solution; why they chose memory over disk; how they partitioned the data to gain scalability; why they chose to execute code with the data using GigaSpaces Map/Reduce execution framework; how they integrated with Facebook; why they chose GigaSpaces over Coherence and Terracotta for in-memory caching and scale. In this post I tried to summarize the main takeaway from the interview. You can also watch the full interview (highly reco
4 0.94598919 1459 high scalability-2013-05-16-Paper: Warp: Multi-Key Transactions for Key-Value Stores
Introduction: Looks like an interesting take on "a completely asynchronous, low-latency transaction management protocol, in line with the fully distributed NoSQL architecture." Warp: Multi-Key Transactions for Key-Value Stores overview: Implementing ACID transactions has been a longstanding challenge for NoSQL systems. Because these systems are based on a sharded architecture, transactions necessarily require coordination across multiple servers. Past work in this space has relied either on heavyweight protocols such as Paxos or clock synchronization for this coordination. This paper presents a novel protocol for coordinating distributed transactions with ACID semantics on top of a sharded data store. Called linear transactions, this protocol achieves scalability by distributing the coordination task to only those servers that hold relevant data for each transaction. It achieves high performance by serializing only those transactions whose concurrent execution could potentially yield a vio
5 0.88503081 16 high scalability-2007-07-16-Book: High Performance MySQL
Introduction: As users come to depend on MySQL, they find that they have to deal with issues of reliability, scalability, and performance--issues that are not well documented but are critical to a smoothly functioning site. This book is an insider's guide to these little understood topics. Author Jeremy Zawodny has managed large numbers of MySQL servers for mission-critical work at Yahoo!, maintained years of contacts with the MySQL AB team, and presents regularly at conferences. Jeremy and Derek have spent months experimenting, interviewing major users of MySQL, talking to MySQL AB, benchmarking, and writing some of their own tools in order to produce the information in this book. In High Performance MySQL you will learn about MySQL indexing and optimization in depth so you can make better use of these key features. You will learn practical replication, backup, and load-balancing strategies with information that goes beyond available tools to discuss their effects in real-life environments. And you
same-blog 6 0.86987084 261 high scalability-2008-02-25-Make Your Site Run 10 Times Faster
7 0.86557204 14 high scalability-2007-07-15-Web Analytics: An Hour a Day
8 0.84683645 308 high scalability-2008-04-22-Simple NFS failover solution with symbolic link?
9 0.84678119 500 high scalability-2009-01-22-Heterogeneous vs. Homogeneous System Architectures
10 0.81119245 831 high scalability-2010-05-26-End-To-End Performance Study of Cloud Services
11 0.80901194 182 high scalability-2007-12-12-Oracle Can Do Read-Write Splitting Too
12 0.80495024 783 high scalability-2010-02-24-Hot Scalability Links for February 24, 2010
13 0.8001706 263 high scalability-2008-02-27-Product: System Imager - Automate Deployment and Installs
14 0.77847201 336 high scalability-2008-05-31-Biggest Under Reported Story: Google's BigTable Costs 10 Times Less than Amazon's SimpleDB
15 0.74077481 1284 high scalability-2012-07-16-Cinchcast Architecture - Producing 1,500 Hours of Audio Every Day
16 0.72554082 334 high scalability-2008-05-29-Amazon Improves Diagonal Scaling Support with High-CPU Instances
17 0.72526228 917 high scalability-2010-10-08-4 Scalability Themes from Surgecon
18 0.70235163 43 high scalability-2007-07-30-Product: ImageShack
19 0.69046444 291 high scalability-2008-03-29-20 New Rules for Faster Web Pages
20 0.68145603 464 high scalability-2008-12-13-Strategy: Facebook Tweaks to Handle 6 Time as Many Memcached Requests