high_scalability high_scalability-2008 high_scalability-2008-353 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Given S3's recent failure (Cloud Status tells the tale), Kevin Burton makes the excellent suggestion of fronting S3 with a caching proxy server. A caching proxy server can reply to service requests without contacting the specified server by returning content saved from a previous request, whether that request was made by the same client or by another one. This is called caching. Caching proxies keep local copies of frequently requested resources. In normal operation, when an asset (a user's avatar, for example) is requested, the cache is tried first. If the asset is found in the cache, it's returned. If the asset is not in the cache, it's retrieved from S3 (or wherever) and cached. So when S3 goes down, it's likely you can ride out the downtime by serving assets out of the cache. This strategy only works when using S3 as a CDN. If you are using S3 for its "real" purpose, as a storage service, then a caching proxy can't help you... Amazon doesn't use S3 as a CDN either Amazon Not
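The cache-first flow described in the introduction can be sketched in a few lines of Python. This is a minimal illustration, not how Squid, Nginx, or Varnish implement it: the class name ReadThroughCache, the OriginUnavailable exception, the ttl_seconds freshness window, and the injectable clock are all assumptions made up for the example. The one behavior that matters for riding out an outage is the last branch: when the origin cannot be reached, a stale copy is served instead of an error.

```python
import time
from typing import Callable, Dict, Tuple


class OriginUnavailable(Exception):
    """Hypothetical error raised by the fetch function when the origin is down."""


class ReadThroughCache:
    """Cache-first lookup that falls back to stale copies during an origin outage."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float = 300.0,  # assumed freshness window, not from the article
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (body, time the body was stored)
        self._entries: Dict[str, Tuple[bytes, float]] = {}

    def get(self, key: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(key)

        # 1. The cache is tried first: a fresh copy is returned directly.
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]

        # 2. Miss or expired: retrieve from the origin (S3 or wherever) and cache it.
        try:
            body = self._fetch(key)
        except OriginUnavailable:
            # 3. Origin is down: serve the stale copy if one exists.
            if entry is not None:
                return entry[0]
            raise  # never cached, so there is nothing to serve
        self._entries[key] = (body, now)
        return body
```

Note the limit this makes visible: an asset that was never requested before the outage has no cached copy, so the request still fails. That is why the approach suits frequently requested, CDN-style assets rather than general storage.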
sentIndex sentText sentNum sentScore
1 Given S3's recent failure (Cloud Status tells the tale), Kevin Burton makes the excellent suggestion of fronting S3 with a caching proxy server. [sent-1, score-0.767]
2 A caching proxy server can reply to service requests without contacting the specified server, by retrieving content saved from a previous request, made by the same client or even other clients. [sent-2, score-1.15]
3 Caching proxies keep local copies of frequently requested resources. [sent-4, score-0.528]
4 In normal operation when an asset (a user's avatar, for example) is requested the cache is tried first. [sent-5, score-0.864]
5 If the asset is found in the cache then it's returned. [sent-6, score-0.465]
6 If the asset is not in the cache it's retrieved from S3 (or wherever) and cached. [sent-7, score-0.585]
7 So when S3 goes down it's likely you can ride out the down time by serving assets out of the cache. [sent-8, score-0.298]
8 If you are using S3 for its "real" purpose, as a storage service, then a caching proxy can't help you. [sent-10, score-0.464]
9 Some proxy options are: Squid, Nginx, Varnish. [sent-15, score-0.408]
10 Planaroo shares how a small startup responds to an S3 outage (summarized): Up-to-date backups are a good thing. [sent-16, score-0.391]
11 Keep current backups such that you can switch to a new URL for your assets. [sent-17, score-0.251]
12 Switch to your backup rather than wait for the system to come up quickly, because it may not. [sent-20, score-0.135]
13 Serve CSS, JavaScript, icons, and Google AJAX libraries from alternate sources. [sent-21, score-0.129]
14 Don't rely on S3 or Google to always be able to serve your crown jewels. [sent-22, score-0.25]
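The backup advice in sentences 10 to 14 (keep a current copy of your assets at a second URL and switch to it rather than wait) can be sketched as an ordered list of asset sources. This is only an illustration under stated assumptions: the function names first_reachable and asset_url, the example.com and example.net hosts, and the is_reachable health check are hypothetical and do not come from the Planaroo write-up.

```python
from typing import Callable, Iterable, Optional


def first_reachable(
    sources: Iterable[str], is_reachable: Callable[[str], bool]
) -> Optional[str]:
    """Return the first base URL that the health check reports as up, else None."""
    for base_url in sources:
        if is_reachable(base_url):
            return base_url
    return None


def asset_url(
    path: str, sources: Iterable[str], is_reachable: Callable[[str], bool]
) -> str:
    """Build an asset URL against the first reachable source, in priority order."""
    base_url = first_reachable(sources, is_reachable)
    if base_url is None:
        raise RuntimeError("no asset source is reachable")
    # Normalize slashes so "css/site.css" and "/css/site.css" behave the same.
    return base_url.rstrip("/") + "/" + path.lstrip("/")


# Hypothetical sources: the primary (for example, S3-backed) host first, backup second.
ASSET_SOURCES = [
    "https://assets.example.com",
    "https://backup.example.net",
]
```

In use, is_reachable would be whatever health signal you trust, such as a periodic probe or a manually flipped flag. The ordering of ASSET_SOURCES encodes the preference, so switching to the backup needs no code change once the primary is marked down.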
wordName wordTfidf (topN-words)
[('asset', 0.344), ('proxy', 0.341), ('requested', 0.24), ('backups', 0.179), ('jewels', 0.176), ('acaching', 0.176), ('fronting', 0.176), ('crown', 0.166), ('contacting', 0.166), ('icons', 0.152), ('avatar', 0.152), ('limelight', 0.143), ('alternate', 0.129), ('suggestion', 0.127), ('ride', 0.125), ('retrieving', 0.123), ('caching', 0.123), ('cache', 0.121), ('summarized', 0.12), ('retrieved', 0.12), ('wherever', 0.12), ('responds', 0.119), ('tale', 0.119), ('specified', 0.119), ('proxies', 0.116), ('reply', 0.115), ('assets', 0.106), ('kevin', 0.103), ('css', 0.1), ('ajax', 0.097), ('compete', 0.094), ('outage', 0.093), ('copies', 0.089), ('saved', 0.087), ('rely', 0.084), ('frequently', 0.083), ('url', 0.081), ('tried', 0.081), ('purpose', 0.078), ('normal', 0.078), ('previous', 0.076), ('google', 0.074), ('switch', 0.072), ('javascript', 0.07), ('cdn', 0.069), ('wait', 0.068), ('said', 0.068), ('serving', 0.067), ('options', 0.067), ('backup', 0.067)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 353 high scalability-2008-07-20-Strategy: Front S3 with a Caching Proxy
2 0.16280687 456 high scalability-2008-12-01-Sun's High-Performance and Reliable Web Proxy Solution
Introduction: As individuals and businesses depend on the Web more than ever to conduct business, rapid and reliable content retrieval is critical. Reducing wait time improves productivity and increases user satisfaction. Web proxy technology has emerged as an effective solution to improve performance, help ensure content availability and enhance network security by caching and filtering Web content. The combination of Sun SPARC Enterprise servers with CoolThreads technology and the Sun Java System Web Proxy Server software provides a compelling foundation for a robust Web proxy solution. Sun SPARC Enterprise T1000 and T2000 servers include the UltraSPARC T1 processor with CoolThreads technology, offering six or eight cores with four threads per core. The Sun Java System Web Proxy Server software is highly threaded and takes advantage of the large number of threads supported by Sun UltraSPARC T1 processors with CoolThreads technology. Together, these products provide a highly scalable solution that
3 0.16260307 297 high scalability-2008-04-05-Skype Plans for PostgreSQL to Scale to 1 Billion Users
Introduction: Skype uses PostgreSQL as their backend database . PostgreSQL doesn't get enough run in the database world so I was excited to see how PostgreSQL is used "as the main DB for most of [Skype's] business needs." Their approach is to use a traditional stored procedure interface for accessing data and on top of that layer proxy servers which hash SQL requests to a set of database servers that actually carry out queries. The result is a horizontally partitioned system that they think will scale to handle 1 billion users. Skype's goal is an architecture that can handle 1 billion plus users. This level of scale isn't practically solvable with one really big computer, so our masked superhero horizontal scaling comes to the rescue. Hardware is dual or quad Opterons with SCSI RAID. Followed common database progression: Start with one DB. Add new databases partitioned by functionality. Replicate read-mostly data for better read access. Then horizontally partition data across multiple nod
4 0.14071955 172 high scalability-2007-12-02-nginx: high performance smpt-pop-imap proxy
Introduction: nginx is a high-performance SMTP/POP/IMAP proxy that lets you do custom authorization and lookups, and it is very scalable (just add nodes). Nginx by default is a reverse proxy, and this is what it is doing here for POP/IMAP connections. It is also an excellent reverse proxy for web servers. Advantage: You don't have to have a special database or LDAP schema, just a URL to do auth and lookup with, which may be accessed by a Unix or a TCP socket. Write your own auth handler according to your own policy. For example: a user called atif tries to log in with the pass testxyz. You pass this information to a URL such as socket:/var/tmp/xyz.sock or http://auth.corp.mailserver.net:someport/someurl The auth server replies with either a FAILURE such as Auth-Status: Invalid Login or password or with a success such as Auth-Status: OK Auth-Server: OneOfThe100Servers Auth-Port: optionalyAPort We have implemented it at our ISP and it has saved us a
Introduction: Update 2: A HSCALE benchmark finds HSCALE "adds a maximum overhead of about 0.24 ms per query (against a partitioned table)." Future releases promise much improved results. Update: A new presentation at An Introduction to HSCALE . After writing Skype Plans for PostgreSQL to Scale to 1 Billion Users , which shows how Skype smartly uses a proxy architecture for scaling, I'm now seeing MySQL Proxy articles all over the place. It's like those "get rich quick" books that say all you have to do is visualize a giraffe with a big yellow dot superimposed over it and by sympathetic magic giraffes will suddenly stampede into your life. Without realizing it I must have visualized transparent proxies smothered in yellow dots. One of the brightest images is a wonderful series of articles by Peter Romianowski describing the evolution of their proxy architecture. Their application is an OLTP system executing 200 million transaction per month, tables with more than 1.5 billion rows, and a 6
6 0.12958737 1402 high scalability-2013-02-07-Ask HighScalability: Web asset server concept - 3rd party software available?
7 0.1232553 431 high scalability-2008-10-27-Notify.me Architecture - Synchronicity Kills
8 0.120621 662 high scalability-2009-07-27-Handle 700 Percent More Requests Using Squid and APC Cache
9 0.097985007 124 high scalability-2007-10-16-How Scalable are Single Page Ajax Apps?
10 0.097314917 1396 high scalability-2013-01-30-Better Browser Caching is More Important than No Javascript or Fast Networks for HTTP Performance
11 0.093354046 194 high scalability-2007-12-26-Golden rule of web caching
12 0.091006249 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App
13 0.087159701 360 high scalability-2008-08-04-A Bunch of Great Strategies for Using Memcached and MySQL Better Together
14 0.084435932 192 high scalability-2007-12-25-IBMer Says LAMP Can't Scale
15 0.083234228 565 high scalability-2009-04-13-Benchmark for keeping data in browser in AJAX projects
16 0.082167022 1589 high scalability-2014-02-03-How Google Backs Up the Internet Along With Exabytes of Other Data
17 0.08169733 665 high scalability-2009-07-29-Strategy: Let Google and Yahoo Host Your Ajax Library - For Free
18 0.079851873 60 high scalability-2007-08-07-Can you profit from the coming Content Delivery Network wars?
19 0.079102576 1033 high scalability-2011-05-02-The Updated Big List of Articles on the Amazon Outage
20 0.079041317 1266 high scalability-2012-06-18-Google on Latency Tolerant Systems: Making a Predictable Whole Out of Unpredictable Parts
topicId topicWeight
[(0, 0.117), (1, 0.077), (2, -0.041), (3, -0.074), (4, -0.024), (5, -0.039), (6, 0.014), (7, -0.032), (8, -0.018), (9, 0.017), (10, -0.025), (11, -0.037), (12, -0.023), (13, -0.002), (14, -0.025), (15, -0.005), (16, -0.024), (17, -0.011), (18, 0.06), (19, -0.068), (20, -0.015), (21, 0.007), (22, 0.063), (23, 0.001), (24, -0.052), (25, 0.088), (26, 0.034), (27, 0.058), (28, -0.02), (29, 0.027), (30, -0.016), (31, -0.006), (32, -0.008), (33, -0.039), (34, -0.003), (35, -0.003), (36, -0.004), (37, 0.045), (38, 0.023), (39, -0.022), (40, 0.026), (41, -0.047), (42, -0.013), (43, -0.067), (44, -0.054), (45, -0.059), (46, -0.022), (47, -0.011), (48, -0.032), (49, -0.011)]
simIndex simValue blogId blogTitle
same-blog 1 0.97325635 353 high scalability-2008-07-20-Strategy: Front S3 with a Caching Proxy
Introduction: Performance guru Steve Souders gave his keynote presentation, Cache is King! (slides), at HTML5DevCon. Besides giving an extremely clear explanation of how caching works on the Internet and how to optimize your use of HTTP to get the best performance, Steve ran experiments that found some surprising results on what gave the best web site performance improvements. In his baseline test, page loads took 7.65 seconds (median of three runs). What change (Fast Network, No JavaScript, or Primed Cache) would make the biggest performance improvement? It was Primed Cache. Fast Network - Using a fast FIOS network the load time was 4.13 seconds. Steve was surprised how big a difference this made, given how much work must happen in the browser. No JavaScript - 4.74 seconds after disabling JavaScript. This both reduces transfers and skips parsing by the browser. Steve thought the effect would have been larger. Primed Cache - 3.46 seconds using a warm cache, less than half than the
3 0.68656272 800 high scalability-2010-03-26-Strategy: Caching 404s Saved the Onion 66% on Server Time
Introduction: In the article The Onion Uses Django, And Why It Matters To Us , a lot of interesting points are made about their ambitious infrastructure move from Drupal/PHP to Django/Python: the move wasn't that hard, it just took time and work because of their previous experience moving the A.V. Club website; churn in core framework APIs make it more attractive to move than stay; supporting the structure of older versions of the site is an unsolved problem; the built-in Django admin saved a lot of work; group development is easier with "fewer specialized or hacked together pieces"; they use IRC for distributed development; sphinx for full-text search; nginx is the media server and reverse proxy; haproxy made the launch process a 5 second procedure; capistrano for deployment; clean component separation makes moving easier; Git for version control; ORM with complicated querysets is a performance problem; memcached for caching rendered pages; the CDN checks for updates every 10 minutes; videos, ar
4 0.6842081 1321 high scalability-2012-09-12-Using Varnish for Paywalls: Moving Logic to the Edge
Introduction: This is a guest post from Per Buer, founder and CEO of Varnish Software, provider of Varnish Cache, an open source web application accelerator freely available at varnish-cache.org. Varnish powers a lot of really big websites worldwide. We at Varnish Software are all about speed. Varnish Cache is built for speed. It executes its policy code more or less a thousand times faster than your typical Java or PHP based application servers, mostly due to the fact that the configuration is compiled into system-call-free machine code. System calls require expensive context switches, stall the CPU and wreak havoc in the CPU cache, so avoiding them makes the code fly. There are strong limitations on what kind of logic you can move into Varnish Cache, but the logic that you do move there will run very fast. An example is using Varnish for access control to serve access controlled content from the caching edge layer. The Varnish Paywall Who gets to access your content? In a tradi
5 0.67425019 996 high scalability-2011-02-28-A Practical Guide to Varnish - Why Varnish Matters
Introduction: This is a guest post by Jeff Su from Factual. What is Varnish? Varnish is an open source, high performance http accelerator that sits in front of a web stack and caches pages. This caching layer is very configurable and can be used for both static and dynamic content. One great thing about Varnish is that it can improve the performance of your website without requiring any code changes. If you haven’t heard of Varnish (or have heard of it, but haven’t used it), please read on. Adding Varnish to your stack can be completely noninvasive, but if you tweak your stack to play along with some of varnish’s more advanced features, you’ll be able to increase performance by orders of magnitude. Some of the high profile companies using Varnish include: Twitter , Facebook , Heroku and LinkedIn . Our Use Case One of Factual’s first high profile projects was Newsweek’s “America’s Best High Schools: The List” . After realizing that we had only a few weeks to increase our
6 0.63954639 836 high scalability-2010-06-04-Strategy: Cache Larger Chunks - Cache Hit Rate is a Bad Indicator
7 0.63521713 1401 high scalability-2013-02-06-Super Bowl Advertisers Ready for the Traffic? Nope..It's Lights Out.
8 0.63085085 703 high scalability-2009-09-12-How Google Taught Me to Cache and Cash-In
9 0.6203987 673 high scalability-2009-08-07-Strategy: Break Up the Memcache Dog Pile
10 0.61928713 230 high scalability-2008-01-29-Speed up (Oracle) database code with result caching
11 0.61748374 359 high scalability-2008-07-29-Ehcache - A Java Distributed Cache
12 0.61486793 662 high scalability-2009-07-27-Handle 700 Percent More Requests Using Squid and APC Cache
13 0.60989994 1333 high scalability-2012-10-04-LinkedIn Moved from Rails to Node: 27 Servers Cut and Up to 20x Faster
14 0.60815293 247 high scalability-2008-02-12-We want to cache a lot :) How do we go about it ?
15 0.60759538 124 high scalability-2007-10-16-How Scalable are Single Page Ajax Apps?
16 0.60303509 1346 high scalability-2012-10-24-Saving Cash Using Less Cache - 90% Savings in the Caching Tier
17 0.59384066 360 high scalability-2008-08-04-A Bunch of Great Strategies for Using Memcached and MySQL Better Together
18 0.59290266 665 high scalability-2009-07-29-Strategy: Let Google and Yahoo Host Your Ajax Library - For Free
19 0.59282011 573 high scalability-2009-04-16-Serving 250M quotes-day at CNBC.com with aiCache
20 0.5913524 911 high scalability-2010-09-30-More Troubles with Caching
topicId topicWeight
[(2, 0.22), (3, 0.262), (10, 0.131), (40, 0.047), (61, 0.045), (79, 0.144), (94, 0.049)]
simIndex simValue blogId blogTitle
Introduction: Awesome paper on how particular synchronization mechanisms scale on multi-core architectures: Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask. The goal is to pick a locking approach that doesn't degrade as the number of cores increases. Like everything else in life, that doesn't appear to be generically possible: None of the nine locking schemes we consider consistently outperforms any other one, on all target architectures or workloads. Strictly speaking, to seek optimality, a lock algorithm should thus be selected based on the hardware platform and the expected workload. Abstract: This paper presents the most exhaustive study of synchronization to date. We span multiple layers, from hardware cache-coherence protocols up to high-level concurrent software. We do so on different types of architectures, from single-socket (uniform and nonuniform) to multi-socket (directory and broadcast-based) many-cores. We draw a set of observations t
same-blog 2 0.82976782 353 high scalability-2008-07-20-Strategy: Front S3 with a Caching Proxy
3 0.76452863 926 high scalability-2010-10-24-Hot Scalability Links For Oct 24, 2010
Introduction: On a cold and rainy Fall day, a day stolen from winter rather than our usual gorgeous Indian Summers , a day not even the SF Giants winning the pennant can help warm, here are some hot links to read by a digital flame: Using MySQL as a NoSQL - A story for exceeding 750,000 qps on a commodity server by Yoshinori Matsunobu. Wonderfully detailed post on how you can lookup a row by ID really fast if you bypass all the typical MySQL query parsing overhead. Minecraftwiki.net and minecraftforum.net now serve more traffic than Slashdot and Stackoverflow! 1 million pageviews and 100k uniques per day, per site; 10TB of bandwidth a month; 4+ machines running Varnish, HAProxy, PHP, MySQL, Nginx. Stuff the Internet Says: @ old_sound : Somebody make me a t-shirt that says "I've read the CAP theorem and I liked it" @dscape : How relevant do I think the CAP theorem is? Not at all. I honestly hate conversations where anyone talks about crap.. cap, sorry. @humidbei
4 0.75085229 84 high scalability-2007-09-08-MP3.com Web Templating Architecture (March, 2000)
Introduction: In March, 2000, I did a talk about how we scaled with semi-static files while splitting data from presentation. For dynamic pages we used mod_perl doing an internal redirect with the XML on the style templates. Since then Apache 2.0 contains the concept of filters to allow for similar functionality.
5 0.73791206 189 high scalability-2007-12-21-Strategy: Limit Result Sets
Introduction: Release It! author Michael Nygard tells a tale of two web sites , both brought low by unexpectedly huge unbounded results sets that slowed down their sites to the speed of a Christmas checkout line. I've committed this error more than a few times. During testing the results sets are often small, so you don't see problems. Or when a product is new you don't have a lot of data so everything is fine, until some magic line is crossed and you get that dreaded 2AM fix it call. My most embarrassing bug of this type caused a rather spectacular failure at a customer site as the variance in response times was out of spec and this kicked in penalty clauses. What happened was the customer had a larger network than we could even test (customers always get the good stuff). I took a lock and went to get all the data. Because the result set was so much larger in their larger system I took the lock for many more milliseconds than I should have. Unknown to me a chunk of code on the criti
6 0.72524905 1178 high scalability-2012-01-20-Stuff The Internet Says On Scalability For January 20, 2012
7 0.71385628 365 high scalability-2008-08-16-Strategy: Serve Pre-generated Static Files Instead Of Dynamic Pages
8 0.69879973 1491 high scalability-2013-07-15-Ask HS: What's Wrong with Twitter, Why Isn't One Machine Enough?
9 0.6987431 862 high scalability-2010-07-20-Strategy: Consider When a Service Starts Billing in Your Algorithm Cost
10 0.69572604 213 high scalability-2008-01-15-Does Sun Buying MySQL Change Your Scaling Strategy?
11 0.69345808 20 high scalability-2007-07-16-Paper: The Clustered Storage Revolution
12 0.69161123 1165 high scalability-2011-12-28-Strategy: Guaranteed Availability Requires Reserving Instances in Specific Zones
13 0.68958026 1425 high scalability-2013-03-18-Beyond Threads and Callbacks - Application Architecture Pros and Cons
14 0.68700951 526 high scalability-2009-03-05-Strategy: In Cloud Computing Systematically Drive Load to the CPU
15 0.68480909 1098 high scalability-2011-08-15-Should any cloud be considered one availability zone? The Amazon experience says yes.
16 0.68160677 1371 high scalability-2012-12-12-Pinterest Cut Costs from $54 to $20 Per Hour by Automatically Shutting Down Systems
17 0.68133241 575 high scalability-2009-04-21-Thread Pool Engine in MS CLR 4, and Work-Stealing scheduling algorithm
18 0.68080223 1353 high scalability-2012-11-01-Cost Analysis: TripAdvisor and Pinterest costs on the AWS cloud
19 0.68011969 533 high scalability-2009-03-11-The Implications of Punctuated Scalabilium for Website Architecture
20 0.67995232 1266 high scalability-2012-06-18-Google on Latency Tolerant Systems: Making a Predictable Whole Out of Unpredictable Parts