high_scalability high_scalability-2009 high_scalability-2009-602 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Film buffs will recognize Django as a classic 1966 spaghetti western that spawned hundreds of imitators. Web heads will certainly first think of Django as the classic Python-based Web framework that has also spawned hundreds of imitators and has become the gold standard framework for the web. Mike Malone, who worked on Pownce, a blogging tool now owned by Six Apart, tells in this very informative EuroDjangoCon presentation how Pownce scaled using Django in the real world. I was surprised to learn how large Pownce was: hundreds of requests/sec, thousands of DB operations/sec, millions of user relationships, millions of notes, and terabytes of static data. Django has a lot of functionality in the box to help you scale, but if you want to scale large it turns out Django has some limitations, and Mike tells you what these are and also provides some code to get around them. Mike's talk, although Django-specific, will really help anyone creating applications on the web. There's
sentIndex sentText sentNum sentScore
1 Film buffs will recognize Django as a classic 1966 spaghetti western that spawned hundreds of imitators. [sent-1, score-0.482]
2 Web heads will certainly first think of Django as the classic Python-based Web framework that has also spawned hundreds of imitators and has become the gold standard framework for the web. [sent-2, score-0.407]
3 Mike Malone, who worked on Pownce, a blogging tool now owned by Six Apart, tells in this very informative EuroDjangoCon presentation how Pownce scaled using Django in the real world. [sent-3, score-0.147]
4 I was surprised to learn how large Pownce was: hundreds of requests/sec, thousands of DB operations/sec, millions of user relationships, millions of notes, and terabytes of static data. [sent-4, score-0.082]
5 Django has a lot of functionality in the box to help you scale, but if you want to scale large it turns out Django has some limitations, and Mike tells you what these are and also provides some code to get around them. [sent-5, score-0.086]
6 * Application servers are horizontally scalable because they are stateless. [sent-10, score-0.072]
7 Types of scalability: * Vertical - buy bigger hardware * Horizontal - the ability to increase a system’s capacity by adding more processing units (servers). Cache to remove load from the database server. [sent-14, score-0.156]
8 Built-in Django caching: per-site cache, per-view cache, and template fragment cache are not so effective on heavily personalized pages. The low-level cache API is used to cache at any level of granularity. [sent-15, score-0.429]
9 Pownce cached individual objects and lists of object IDs. [sent-16, score-0.089]
10 How do you know when a value changes such that the cache should be updated so readers see valid values? [sent-18, score-0.243]
11 Work is spread between multiple application servers using a load balancer. [sent-22, score-0.167]
12 Best way to reduce load on your app servers: don’t use them to do hard stuff. [sent-23, score-0.095]
13 Pownce used software load balancing * Hardware load balancers are expensive ($35K) and you need two for redundancy. [sent-24, score-0.277]
14 * The RDBMS’s consistency requirements get in our way * Most sharding / federation schemes are kludges that trade consistency * There are many non-relational databases (CouchDB, Cassandra, Tokyo Cabinet) but they aren't easy to use with Django. [sent-31, score-0.142]
15 Rules for denormalization: * Start with a normalized database * Selectively denormalize things as they become bottlenecks * Denormalized counts, copied fields, etc. [sent-32, score-0.183]
16 * Django doesn't support multiple database connections, but there's a library, linked to at the end of this document, to help. [sent-35, score-0.061]
17 When you write to the primary it takes time for the state to be transferred to the read slaves so readers may see an old value on the read. [sent-37, score-0.21]
18 * Vertical Partitioning: split tables that aren’t joined across database servers. [sent-41, score-0.233]
19 * Horizontal Partitioning: split a single table across databases (e. [sent-42, score-0.166]
20 The problem is that autoincrement no longer works, and Django uses autoincrement for primary keys. [sent-45, score-0.399]
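Sentences 8 to 10 above describe caching individual objects and lists of object IDs, and ask how the cache gets updated when a value changes. A minimal sketch of that idea in plain Python follows. Everything in it is an illustrative assumption rather than Pownce's actual code: DictCache is a hypothetical in-memory stand-in for a memcached-style cache, NOTES_DB stands in for the database, and the key names ("note:<id>", "note_ids:<author>") are made up for the example.

```python
import copy


class DictCache:
    """Hypothetical in-memory stand-in for a memcached-style cache."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value

    def delete(self, key):
        self._store.pop(key, None)


# Stand-in "database" and a counter so the effect of caching is visible.
NOTES_DB = {
    1: {"id": 1, "author": "mike", "body": "first note"},
    2: {"id": 2, "author": "mike", "body": "second note"},
    3: {"id": 3, "author": "leah", "body": "hello"},
}
db_reads = {"count": 0}


def db_get_note(note_id):
    db_reads["count"] += 1
    return copy.deepcopy(NOTES_DB[note_id])


def db_note_ids_for(author):
    db_reads["count"] += 1
    return sorted(i for i, n in NOTES_DB.items() if n["author"] == author)


def get_note(cache, note_id):
    """Cache each object under its own key."""
    key = f"note:{note_id}"
    note = cache.get(key)
    if note is None:
        note = db_get_note(note_id)
        cache.set(key, note)
    return note


def get_notes_by(cache, author):
    """Cache only the list of IDs; the objects come from the per-object cache."""
    key = f"note_ids:{author}"
    ids = cache.get(key)
    if ids is None:
        ids = db_note_ids_for(author)
        cache.set(key, ids)
    return [get_note(cache, i) for i in ids]


def update_note_body(cache, note_id, body):
    """Write to the database, then invalidate the one affected object key."""
    NOTES_DB[note_id]["body"] = body
    cache.delete(f"note:{note_id}")


def add_note(cache, note_id, author, body):
    """A new note changes list membership, so invalidate the author's ID list."""
    NOTES_DB[note_id] = {"id": note_id, "author": author, "body": body}
    cache.delete(f"note_ids:{author}")
```

Because lists hold only IDs, editing one note invalidates a single object key and every list that includes it stays valid; only inserts and deletes need to invalidate a list. In a Django project the same get/set/delete pattern would presumably go through the low-level cache API instead of DictCache.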
wordName wordTfidf (topN-words)
[('django', 0.555), ('perlbal', 0.278), ('autoincrement', 0.168), ('spawned', 0.158), ('cache', 0.153), ('pownce', 0.14), ('invalidate', 0.114), ('qps', 0.112), ('split', 0.108), ('classic', 0.102), ('load', 0.095), ('readers', 0.09), ('vertical', 0.09), ('objects', 0.089), ('balancers', 0.087), ('bound', 0.086), ('tells', 0.086), ('hundreds', 0.082), ('leah', 0.079), ('kludges', 0.079), ('evictions', 0.079), ('isa', 0.079), ('horizontal', 0.078), ('caching', 0.078), ('doesn', 0.075), ('malone', 0.075), ('spaghetti', 0.075), ('culver', 0.072), ('lengths', 0.072), ('servers', 0.072), ('partitioning', 0.067), ('western', 0.065), ('heads', 0.065), ('tables', 0.064), ('denormalize', 0.064), ('federation', 0.063), ('pound', 0.063), ('primary', 0.063), ('fragment', 0.062), ('selectively', 0.062), ('film', 0.062), ('database', 0.061), ('personalized', 0.061), ('blogging', 0.061), ('denormalized', 0.061), ('deleting', 0.06), ('cabinet', 0.06), ('copied', 0.058), ('table', 0.058), ('transferred', 0.057)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999982 602 high scalability-2009-05-17-Scaling Django Web Apps by Mike Malone
2 0.17541867 1193 high scalability-2012-02-16-A Short on the Pinterest Stack for Handling 3+ Million Users
Introduction: Pinterest co-founder Paul Sciarra shared a bit about their stack on Quora: Python + heavily-modified Django at the application layer. Tornado and (very selectively) node.js as web-servers. Memcached and membase / redis for object- and logical-caching, respectively. RabbitMQ as a message queue. Nginx, HAproxy and Varnish for static-delivery and load-balancing. Persistent data storage using MySQL. MrJob on EMR for map-reduce. Git. Alex Popescu has created a cool diagram of the setup and provided some thoughtful analysis as well.
3 0.16540791 360 high scalability-2008-08-04-A Bunch of Great Strategies for Using Memcached and MySQL Better Together
Introduction: The primero recommendation for speeding up a website is almost always to add cache and more cache. And after that add a little more cache just in case. Memcached is almost always given as the recommended cache to use. What we don't often hear is how to effectively use a cache in our own products. MySQL hosted two excellent webinars (referenced below) on the subject of how to deploy and use memcached. The star of the show, other than MySQL of course, is Farhan Mashraqi of Fotolog. You may recall we did an earlier article on Fotolog in Secrets to Fotolog's Scaling Success , which was one of my personal favorites. Fotolog, as they themselves point out, is probably the largest site nobody has ever heard of, pulling in more page views than even Flickr. Fotolog has 51 instances of memcached on 21 servers with 175G in use and 254G available. As a large successful photo-blogging site they have very demanding performance and scaling requirements. To meet those requirements they've developed a
4 0.15642025 1646 high scalability-2014-05-12-4 Architecture Issues When Scaling Web Applications: Bottlenecks, Database, CPU, IO
Introduction: This is a guest repost by Venkatesh CM at Architecture Issues Scaling Web Applications. In this blog I will cover architecture issues that show up while scaling and performance tuning a large-scale web application. Let's start by defining a few terms to create a common understanding and vocabulary. Later on I will go through the different issues that pop up while scaling a web application: architecture bottlenecks, scaling the database, CPU-bound applications, and IO-bound applications. Determining the optimal thread pool size of a web application will be covered in the next blog. Performance: The term performance of a web application is used to mean several things. Most developers are primarily concerned with response time and scalability. Response Time: The time taken by a web application to process a request and return a response. Applications should respond to requests within an acceptable duration. If an application is taking beyond the acceptable time, it is said to
5 0.15586108 116 high scalability-2007-10-08-Lessons from Pownce - The Early Years
Introduction: Pownce is a new social messaging application competing micromessage to micromessage with the likes of Twitter and Jaiku. Still in closed beta, Pownce has generously shared some of what they've learned so far. Like going to a barrel tasting of a young wine and then tasting the same wine after some aging, I think what will be really interesting is to follow Pownce and compare the Pownce of today with the Pownce of tomorrow, after a few years spent in the barrel. What lessons lie in wait for Pownce as they grow? Site: http://www.pownce.com Information Sources Pownce Lessons Learned - FOWA 2007 Scoble on Twitter vs Pownce Founder Leah Culver's Blog The Platform Python Django for the website framework Amazon's S3 for file storage. Adobe AIR (Adobe Integrated Runtime) for desktop application Memcached Available on Facebook Timeplot for charts and graphs. The Stats Developed in 4 months and went to an invite-only launch in June. Beg
6 0.13762014 1440 high scalability-2013-04-15-Scaling Pinterest - From 0 to 10s of Billions of Page Views a Month in Two Years
7 0.13184494 359 high scalability-2008-07-29-Ehcache - A Java Distributed Cache
9 0.12725535 538 high scalability-2009-03-16-Are Cloud Based Memory Architectures the Next Big Thing?
10 0.1265977 928 high scalability-2010-10-26-Scaling DISQUS to 75 Million Comments and 17,000 RPS
11 0.12172665 351 high scalability-2008-07-16-The Mother of All Database Normalization Debates on Coding Horror
12 0.11939524 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App
13 0.11935689 1269 high scalability-2012-06-20-iDoneThis - Scaling an Email-based App from Scratch
14 0.11930739 511 high scalability-2009-02-12-MySpace Architecture
15 0.11844238 1346 high scalability-2012-10-24-Saving Cash Using Less Cache - 90% Savings in the Caching Tier
16 0.11690377 800 high scalability-2010-03-26-Strategy: Caching 404s Saved the Onion 66% on Server Time
17 0.11590435 467 high scalability-2008-12-16-[ANN] New Open Source Cache System
18 0.11553335 994 high scalability-2011-02-23-This stuff isn't taught, you learn it bit by bit as you solve each problem.
19 0.11498068 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
20 0.11456387 1633 high scalability-2014-04-16-Six Lessons Learned the Hard Way About Scaling a Million User System
topicId topicWeight
[(0, 0.208), (1, 0.118), (2, -0.045), (3, -0.141), (4, 0.034), (5, 0.076), (6, -0.01), (7, -0.06), (8, -0.041), (9, -0.03), (10, -0.018), (11, -0.022), (12, -0.014), (13, 0.076), (14, -0.081), (15, -0.042), (16, 0.005), (17, -0.018), (18, 0.019), (19, 0.002), (20, -0.047), (21, 0.057), (22, 0.013), (23, -0.034), (24, 0.004), (25, 0.018), (26, 0.045), (27, 0.074), (28, -0.021), (29, -0.001), (30, -0.009), (31, -0.032), (32, -0.024), (33, -0.018), (34, -0.008), (35, -0.049), (36, -0.063), (37, -0.024), (38, 0.043), (39, 0.001), (40, 0.036), (41, 0.032), (42, -0.019), (43, -0.045), (44, 0.024), (45, 0.029), (46, -0.012), (47, 0.029), (48, -0.044), (49, -0.002)]
simIndex simValue blogId blogTitle
same-blog 1 0.96614599 602 high scalability-2009-05-17-Scaling Django Web Apps by Mike Malone
2 0.84629744 360 high scalability-2008-08-04-A Bunch of Great Strategies for Using Memcached and MySQL Better Together
Introduction: The primero recommendation for speeding up a website is almost always to add cache and more cache. And after that add a little more cache just in case. Memcached is almost always given as the recommended cache to use. What we don't often hear is how to effectively use a cache in our own products. MySQL hosted two excellent webinars (referenced below) on the subject of how to deploy and use memcached. The star of the show, other than MySQL of course, is Farhan Mashraqi of Fotolog. You may recall we did an earlier article on Fotolog in Secrets to Fotolog's Scaling Success , which was one of my personal favorites. Fotolog, as they themselves point out, is probably the largest site nobody has ever heard of, pulling in more page views than even Flickr. Fotolog has 51 instances of memcached on 21 servers with 175G in use and 254G available. As a large successful photo-blogging site they have very demanding performance and scaling requirements. To meet those requirements they've developed a
3 0.81599557 248 high scalability-2008-02-13-What's your scalability plan?
Introduction: How do you plan to scale your system as you reach predictable milestones? This topic came up in another venue and it reminded me about a great comment an Anonymous wrote a while ago and I wanted to make sure that comment didn't get lost. The Anonymous scaling plan was relatively simple and direct: My two cents on what I'm using to start a website from scratch using a single server for now. Later, I'll scale out horizontally when the need arises. Phase 1 Single Server, Dual Quad-Core 2.66, 8gb RAM, 500gb Disk Raid 10 OS: Fedora 8. You could go with pretty much any Linux though. I like Fedora 8 best for servers. Proxy Cache: Varnish - it is way faster than Squid per my own benchmarks. Squid chokes bigtime. Web Server: Lighttpd - faster than Apache 2 and easier to configure for me. Object Cache: Memcached. Very scalable. PHP Cache: APC. Easy to configure and seems to work fine. Language: PHP 5 - no bloated frameworks, waste of time for me. You spend too mu
4 0.81247574 149 high scalability-2007-11-12-Scaling Using Cache Farms and Read Pooling
Introduction: Michael Nygard talks about Two Ways To Boost Your Flagging Web Site . The idea behind cache farms is to move memory devoted to the various caching layers into one large farm of caches, as with memcached. The idea behind read pools is to allocate your database read requests to a pool of dedicated read servers, thus offloading the write server. Using a combination of the strategies you aren't forced to scale up the database tier to scale your website.
5 0.79950112 1620 high scalability-2014-03-27-Strategy: Cache Stored Procedure Results
Introduction: Caching is not new of course, but I don't think I've heard of caching stored procedure results before. It's like memoization in the database. Brent Ozar covers this idea in How to Cache Stored Procedure Results. The benefits are the usual for doing work in the database: it doesn't take per-developer, per-app work; just code it once in the stored proc and it works for everyone, everywhere, for all of time. The disadvantage is the usual as well: it adds extra load to a probably already busy database, so it should only be applied to heavy computations. Brent positions this strategy as an emergency bandaid to apply when you need to take pressure off a database now. Developers can then work on moving the cache off the database and into its own tier. Interesting idea. And as the comments show, the implementation is never as simple as it seems.
6 0.79732525 684 high scalability-2009-08-18-Real World Web: Performance & Scalability
7 0.78067702 1346 high scalability-2012-10-24-Saving Cash Using Less Cache - 90% Savings in the Caching Tier
8 0.77970064 391 high scalability-2008-09-23-The 7 Stages of Scaling Web Apps
9 0.77489048 7 high scalability-2007-07-12-FeedBurner Architecture
10 0.77075386 928 high scalability-2010-10-26-Scaling DISQUS to 75 Million Comments and 17,000 RPS
11 0.77016795 1633 high scalability-2014-04-16-Six Lessons Learned the Hard Way About Scaling a Million User System
12 0.75738192 927 high scalability-2010-10-26-Marrying memcached and NoSQL
13 0.75557578 594 high scalability-2009-05-08-Eight Best Practices for Building Scalable Systems
14 0.75511056 389 high scalability-2008-09-23-How to Scale with Ruby on Rails
15 0.7532233 359 high scalability-2008-07-29-Ehcache - A Java Distributed Cache
16 0.75294036 673 high scalability-2009-08-07-Strategy: Break Up the Memcache Dog Pile
17 0.75178766 436 high scalability-2008-11-02-Strategy: How to Manage Sessions Using Memcached
18 0.74930382 911 high scalability-2010-09-30-More Troubles with Caching
19 0.7469449 192 high scalability-2007-12-25-IBMer Says LAMP Can't Scale
20 0.74632156 174 high scalability-2007-12-05-Product: Tugela Cache
topicId topicWeight
[(0, 0.118), (1, 0.075), (2, 0.231), (10, 0.074), (17, 0.013), (30, 0.053), (40, 0.031), (47, 0.043), (51, 0.012), (61, 0.104), (69, 0.023), (79, 0.072), (85, 0.024), (94, 0.039)]
simIndex simValue blogId blogTitle
same-blog 1 0.93613964 602 high scalability-2009-05-17-Scaling Django Web Apps by Mike Malone
2 0.9212774 1259 high scalability-2012-06-07-3 Secrets to Lightning Fast Mobile Design at Instagram
Introduction: In Secrets to Lightning Fast Mobile Design, Instagram co-founder Mike Krieger shares three strategies Instagram uses to outsmart high latency mobile networks and make mobile apps feel faster than they really are, which helped Instagram reach 12 million users in 12 months: The motivation: "mobile experiences fill gaps while we wait. no one wants to wait while they wait" Perform actions optimistically. Make the user feel productive. Likes, comments, and follows are registered on screen while the like operation is being sent in the background. Adaptively preload content. Load content before it's needed. Don't load images according to time, it will take too long; instead re-prioritize based on interest. Show what the social network thinks is the most popular. Listen to what your user's flicks and taps are telling you. Move bits when no one is watching. Picture uploading starts before you share them; most apps wait until after the share screen. The rule is send data as soo
3 0.91500872 1534 high scalability-2013-10-18-Stuff The Internet Says On Scalability For October 18th, 2013
Introduction: Hey, it's HighScalability time: Test your sense of scale. Is this image of something microscopic or macroscopic? Find out . $3.5 million : Per Episode Cost of Breaking Bad Quotable Quotes: @GammaCounter : "There are 400 billion trees in the Amazon River basin, close to the number of stars in the Milky Way galaxy." @rbranson : Virtualization has near-zero overhead, unless the VM spends most of it's time copying between RAM and network… like memcached or haproxy. @HackerNewsOnion : Programming is 1% inspiration, 99% trying to get your environment working. @aneel : "roundtrips, not bandwidth, is now often the bottleneck for most applications" @jamesurquhart : Not to mention the fact that auto-scaling should happen above IaaS layer. Think multi-cloud. Sheref Mansy : A machine keeps sort of chugging away, without worrying about its environment. But a living system has to. V.D. Veksler : it just came to my attention that Javascri
Introduction: Tachyon (github) is an interesting new filesystem brought to you by the folks at the UC Berkeley AMP Lab: Tachyon is a fault tolerant distributed file system enabling reliable file sharing at memory-speed across cluster frameworks, such as Spark and MapReduce. It offers up to 300 times higher throughput than HDFS, by leveraging lineage information and using memory aggressively. Tachyon caches working set files in memory, and enables different jobs/queries and frameworks to access cached files at memory speed. Thus, Tachyon avoids going to disk to load datasets that are frequently read. It has a Java-like File API, native support for raw tables, a pluggable file system, and it works with Hadoop with no modifications. It might work well for streaming media too as you wouldn't have to wait for the complete file to hit the disk before rendering. Discuss on Hacker News
5 0.90218842 703 high scalability-2009-09-12-How Google Taught Me to Cache and Cash-In
Introduction: A user named Apathy on how Reddit scales some of their features, shares some advice he learned while working at Google and other major companies. To be fair, I [Apathy] was working at Google at the time, and every job I held between 1995 and 2005 involved at least one of the largest websites on the planet. I didn't come up with any of these ideas, just watched other smart people I worked with who knew what they were doing and found (or wrote) tools that did the same things. But the theme is always the same: Cache everything you can and store the rest in some sort of database (not necessarily relational and not necessarily centralized). Cache everything that doesn't change rapidly. Most of the time you don't have to hit the database for anything other than checking whether the users' new message count has transitioned from 0 to (1 or more). Cache everything--templates, user message status, the front page components--and hit the database once a minute or so to update the fr
6 0.89990652 1443 high scalability-2013-04-19-Stuff The Internet Says On Scalability For April 19, 2013
8 0.89847517 1291 high scalability-2012-07-25-Vertical Scaling Ascendant - How are SSDs Changing Architectures?
9 0.89551628 1010 high scalability-2011-03-24-Strategy: Disk Backup for Speed, Tape Backup to Save Your Bacon, Just Ask Google
10 0.89509708 714 high scalability-2009-10-02-HighScalability has Moved to Squarespace.com!
11 0.89507288 1074 high scalability-2011-07-06-11 Common Web Use Cases Solved in Redis
12 0.89173603 619 high scalability-2009-06-05-HotPads Shows the True Cost of Hosting on Amazon
13 0.89159489 658 high scalability-2009-07-17-Against all the odds
14 0.89105535 1244 high scalability-2012-05-11-Stuff The Internet Says On Scalability For May 11, 2012
15 0.88857138 306 high scalability-2008-04-21-The Search for the Source of Data - How SimpleDB Differs from a RDBMS
16 0.88749307 1207 high scalability-2012-03-12-Google: Taming the Long Latency Tail - When More Machines Equals Worse Results
17 0.88657641 1609 high scalability-2014-03-11-Building a Social Music Service Using AWS, Scala, Akka, Play, MongoDB, and Elasticsearch
18 0.88606882 825 high scalability-2010-05-10-Sify.com Architecture - A Portal at 3900 Requests Per Second
19 0.88577664 1057 high scalability-2011-06-10-Stuff The Internet Says On Scalability For June 10, 2011
20 0.88562238 628 high scalability-2009-06-13-Neo4j - a Graph Database that Kicks Buttox