high_scalability high_scalability-2007 high_scalability-2007-192 knowledge-graph by maker-knowledge-mining
Introduction: A very entertaining and somewhat educational article: IBM Poopheads say LAMP Users Need to "grow up". The physical three-tier architecture turns out to be the root of all evil, and shared-nothing architectures bring simplicity and light. In the comments, Simon Willison makes an insightful comment on why fine-grained caching works for personalized pages and proxies don't: Great post, but I have to disagree with you on the finely grained caching part. If you look at big LAMP deployments such as Flickr, LiveJournal and Facebook the common technology component that enables them to scale is memcached - a tool for finely grained caching. That's not to say that they aren't doing shared-nothing, it's just that memcached is critical for helping the database layer scale. LiveJournal serves around 50% of its page views "permission controlled" (friends only) so an HTTP proxy on the front end isn't the right solution - but memcached reduces their database hits by 90%.
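The argument in the quoted comment can be made concrete with a small sketch. It is not LiveJournal's code: the `FakeCache` and `FakeDatabase` classes, the post fields, and the friends map are all hypothetical stand-ins, and the cache is a plain in-process dictionary rather than a real memcached client. The idea it illustrates is that each post is cached once as an object, while the permission check runs per viewer at render time, so one cached entry can serve viewers who see different pages. A front-end HTTP proxy that caches whole pages cannot do this for friends-only content.

```python
from typing import Dict, List, Optional, Set


class FakeCache:
    """In-process stand-in for a memcached client (get/set only)."""

    def __init__(self) -> None:
        self._store: Dict[str, dict] = {}

    def get(self, key: str) -> Optional[dict]:
        return self._store.get(key)

    def set(self, key: str, value: dict) -> None:
        self._store[key] = value


class FakeDatabase:
    """Hypothetical post store that counts how often it is queried."""

    def __init__(self, posts: Dict[int, dict]) -> None:
        self._posts = posts
        self.reads = 0

    def load_post(self, post_id: int) -> dict:
        self.reads += 1
        return self._posts[post_id]


def get_post(cache: FakeCache, db: FakeDatabase, post_id: int) -> dict:
    """Cache-aside lookup: try the cache, fall back to the database on a miss."""
    key = f"post:{post_id}"
    post = cache.get(key)
    if post is None:
        post = db.load_post(post_id)
        cache.set(key, post)
    return post


def can_view(viewer: str, post: dict, friends: Dict[str, Set[str]]) -> bool:
    """Permission check that runs per viewer, outside the cache."""
    if not post["friends_only"]:
        return True
    author = post["author"]
    return viewer == author or viewer in friends.get(author, set())


def render_page(cache: FakeCache, db: FakeDatabase, friends: Dict[str, Set[str]],
                viewer: str, post_ids: List[int]) -> List[str]:
    """Assemble a personalized page from shared, individually cached posts."""
    bodies = []
    for post_id in post_ids:
        post = get_post(cache, db, post_id)
        if can_view(viewer, post, friends):
            bodies.append(post["body"])
    return bodies


if __name__ == "__main__":
    db = FakeDatabase({
        1: {"author": "alice", "friends_only": True, "body": "secret plans"},
        2: {"author": "alice", "friends_only": False, "body": "hello world"},
    })
    cache = FakeCache()
    friends = {"alice": {"bob"}}
    print(render_page(cache, db, friends, "bob", [1, 2]))
    print(render_page(cache, db, friends, "carol", [1, 2]))
    print("database reads:", db.reads)
```

In this sketch the second viewer's page is built entirely from cached posts even though its content differs from the first viewer's, which is the kind of database offload the comment attributes to memcached.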
sentIndex sentText sentNum sentScore
1 A very entertaining and somewhat educational article: IBM Poopheads say LAMP Users Need to "grow up". [sent-1, score-0.58]
2 The physical three-tier architecture turns out to be the root of all evil, and shared-nothing architectures bring simplicity and light. [sent-2, score-1.022]
3 In the comments, Simon Willison makes an insightful comment on why fine-grained caching works for personalized pages and proxies don't: Great post, but I have to disagree with you on the finely grained caching part. [sent-3, score-2.115]
4 If you look at big LAMP deployments such as Flickr, LiveJournal and Facebook the common technology component that enables them to scale is memcached - a tool for finely grained caching. [sent-4, score-1.404]
5 That's not to say that they aren't doing shared-nothing, it's just that memcached is critical for helping the database layer scale. [sent-5, score-0.593]
6 LiveJournal serves around 50% of its page views "permission controlled" (friends only) so an HTTP proxy on the front end isn't the right solution - but memcached reduces their database hits by 90%. [sent-6, score-0.938]
wordName wordTfidf (topN-words)
[('grained', 0.411), ('finely', 0.4), ('livejournal', 0.324), ('proxy', 0.206), ('memcached', 0.183), ('lamp', 0.158), ('personalized', 0.154), ('entertaining', 0.154), ('permission', 0.149), ('educational', 0.147), ('disagree', 0.143), ('nothing', 0.139), ('insightful', 0.134), ('evil', 0.125), ('controlled', 0.121), ('deployments', 0.114), ('simplicity', 0.112), ('flickr', 0.112), ('say', 0.11), ('somewhat', 0.11), ('friends', 0.106), ('serves', 0.101), ('brings', 0.1), ('helping', 0.1), ('caching', 0.099), ('turns', 0.096), ('comment', 0.095), ('hits', 0.095), ('root', 0.095), ('fine', 0.094), ('tier', 0.093), ('enables', 0.093), ('reduces', 0.088), ('component', 0.085), ('views', 0.085), ('critical', 0.081), ('pages', 0.075), ('front', 0.072), ('architectures', 0.072), ('grow', 0.071), ('layer', 0.067), ('physical', 0.065), ('shared', 0.065), ('tool', 0.063), ('three', 0.06), ('article', 0.059), ('facebook', 0.057), ('page', 0.056), ('common', 0.055), ('database', 0.052)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000002 192 high scalability-2007-12-25-IBMer Says LAMP Can't Scale
2 0.17219992 1352 high scalability-2012-10-31-Gone Fishin': LiveJournal Architecture
Introduction: This was the first architecture profile on HighScalability. IMHO LiveJournal was really the start of the openness on how to build stuff at scale, setting the whole industry off with an excellent role model. They wrote about their architecture, they open sourced their tools, they showed that success wasn't based on keeping secrets, and they set forth principles still followed by our rather amazing industry. No other industry is so open and cooperative, with their eyes cast so far forward, intent on building cool stuff. When all around seems dark it would be good to keep this little bit of light in mind... A fascinating and detailed story of how LiveJournal evolved their system to scale. LiveJournal was an early player in the free blog service race and faced issues from quickly adding a large number of users. Blog posts come fast and furious which causes a lot of writes and writes are particularly hard to scale. Understanding how LiveJournal faced their scaling problems will help any
3 0.16002063 3 high scalability-2007-07-09-LiveJournal Architecture
Introduction: A fascinating and detailed story of how LiveJournal evolved their system to scale. LiveJournal was an early player in the free blog service race and faced issues from quickly adding a large number of users. Blog posts come fast and furious which causes a lot of writes and writes are particularly hard to scale. Understanding how LiveJournal faced their scaling problems will help any aspiring website builder. Site: http://www.livejournal.com/ Information Sources LiveJournal - Behind The Scenes Scaling Storytime Google Video Tokyo Video 2005 version Platform Linux MySql Perl Memcached MogileFS Apache What's Inside? Scaling from 1, 2, and 4 hosts to cluster of servers. Avoid single points of failure. Using MySQL replication only takes you so far. Becoming IO bound kills scaling. Spread out writes and reads for more parallelism. You can't keep adding read slaves and scale. Shard storage approach, using DRBD, for maxim
4 0.12634437 927 high scalability-2010-10-26-Marrying memcached and NoSQL
Introduction: Memcached is one of the most common in-memory cache implementations. It was originally developed by Danga Interactive for LiveJournal, but is now used by many other sites as a side cache to speed up read-mostly operations. It gained popularity in the non-Java world, too, especially since it’s a language-neutral side cache for which few alternatives existed. As a side cache, memcached clients rely on the database as the system of record; the database is still used for write, update, and complex query operations. Since the memcached specification includes no query operations, memcached is not a database alternative, unlike most of the NoSQL offerings. This also excludes memcached from being a real solution for write scalability. As a result, many of the heavy sites started to move away from memcached and replace it with other NoSQL alternatives, as noted in a recent highscalability post, MySQL And Memcached: End Of An Era? The transition away from memcached to NoSQL
5 0.11306486 360 high scalability-2008-08-04-A Bunch of Great Strategies for Using Memcached and MySQL Better Together
Introduction: The primero recommendation for speeding up a website is almost always to add cache and more cache. And after that add a little more cache just in case. Memcached is almost always given as the recommended cache to use. What we don't often hear is how to effectively use a cache in our own products. MySQL hosted two excellent webinars (referenced below) on the subject of how to deploy and use memcached. The star of the show, other than MySQL of course, is Farhan Mashraqi of Fotolog. You may recall we did an earlier article on Fotolog in Secrets to Fotolog's Scaling Success , which was one of my personal favorites. Fotolog, as they themselves point out, is probably the largest site nobody has ever heard of, pulling in more page views than even Flickr. Fotolog has 51 instances of memcached on 21 servers with 175G in use and 254G available. As a large successful photo-blogging site they have very demanding performance and scaling requirements. To meet those requirements they've developed a
6 0.11175738 729 high scalability-2009-10-28-And the winner is: MySQL or Memcached or Tokyo Tyrant?
7 0.110427 721 high scalability-2009-10-13-Why are Facebook, Digg, and Twitter so hard to scale?
8 0.10780866 297 high scalability-2008-04-05-Skype Plans for PostgreSQL to Scale to 1 Billion Users
9 0.10639631 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App
10 0.096956819 456 high scalability-2008-12-01-Sun's High-Performance and Reliable Web Proxy Solution
11 0.089277506 315 high scalability-2008-05-05-HSCALE - Handling 200 Million Transactions Per Month Using Transparent Partitioning With MySQL Proxy
12 0.085327372 728 high scalability-2009-10-26-Facebook's Memcached Multiget Hole: More machines != More Capacity
13 0.08500126 908 high scalability-2010-09-28-6 Strategies for Scaling BBC iPlayer
14 0.084864102 52 high scalability-2007-08-01-Product: Memcached
15 0.084435932 353 high scalability-2008-07-20-Strategy: Front S3 with a Caching Proxy
16 0.083955824 5 high scalability-2007-07-10-mixi.jp Architecture
17 0.076461665 636 high scalability-2009-06-23-Learn How to Exploit Multiple Cores for Better Performance and Scalability
18 0.07596159 1123 high scalability-2011-09-23-The Real News is Not that Facebook Serves Up 1 Trillion Pages a Month…
19 0.075446613 367 high scalability-2008-08-17-Strategy: Drop Memcached, Add More MySQL Servers
20 0.073349245 464 high scalability-2008-12-13-Strategy: Facebook Tweaks to Handle 6 Time as Many Memcached Requests
topicId topicWeight
[(0, 0.098), (1, 0.035), (2, -0.036), (3, -0.067), (4, 0.051), (5, 0.01), (6, -0.026), (7, -0.025), (8, -0.01), (9, 0.029), (10, -0.019), (11, 0.022), (12, 0.015), (13, 0.088), (14, -0.035), (15, -0.021), (16, 0.004), (17, -0.029), (18, 0.013), (19, 0.022), (20, -0.011), (21, 0.036), (22, 0.035), (23, -0.002), (24, -0.015), (25, 0.007), (26, 0.022), (27, 0.05), (28, 0.029), (29, 0.02), (30, -0.019), (31, -0.025), (32, 0.016), (33, 0.008), (34, -0.017), (35, -0.046), (36, 0.012), (37, -0.007), (38, 0.064), (39, -0.035), (40, 0.014), (41, 0.024), (42, -0.034), (43, -0.074), (44, -0.014), (45, -0.012), (46, 0.023), (47, -0.074), (48, -0.042), (49, 0.002)]
simIndex simValue blogId blogTitle
same-blog 1 0.96319562 192 high scalability-2007-12-25-IBMer Says LAMP Can't Scale
2 0.71775788 149 high scalability-2007-11-12-Scaling Using Cache Farms and Read Pooling
Introduction: Michael Nygard talks about Two Ways To Boost Your Flagging Web Site . The idea behind cache farms is to move memory devoted to the various caching layers into one large farm of caches, as with memcached. The idea behind read pools is to allocate your database read requests to a pool of dedicated read servers, thus offloading the write server. Using a combination of the strategies you aren't forced to scale up the database tier to scale your website.
3 0.67394978 602 high scalability-2009-05-17-Scaling Django Web Apps by Mike Malone
Introduction: Film buffs will recognize Django as a classic 1966 spaghetti western that spawned hundreds of imitators. Web heads will certainly first think of Django as the classic Python-based Web framework that has also spawned hundreds of imitators and has become the gold standard framework for the web. Mike Malone, who worked on Pownce, a blogging tool now owned by Six Apart, tells in this very informative EuroDjangoCon presentation how Pownce scaled using Django in the real world. I was surprised to learn how large Pownce was: hundreds of requests/sec, thousands of DB operations/sec, millions of user relationships, millions of notes, and terabytes of static data. Django has a lot of functionality in the box to help you scale, but if you want to scale large it turns out Django has some limitations, and Mike tells you what these are and also provides some code to get around them. Mike's talk, although Django-specific, will really help anyone creating applications on the web. There's
4 0.6569559 360 high scalability-2008-08-04-A Bunch of Great Strategies for Using Memcached and MySQL Better Together
Introduction: The primero recommendation for speeding up a website is almost always to add cache and more cache. And after that add a little more cache just in case. Memcached is almost always given as the recommended cache to use. What we don't often hear is how to effectively use a cache in our own products. MySQL hosted two excellent webinars (referenced below) on the subject of how to deploy and use memcached. The star of the show, other than MySQL of course, is Farhan Mashraqi of Fotolog. You may recall we did an earlier article on Fotolog in Secrets to Fotolog's Scaling Success , which was one of my personal favorites. Fotolog, as they themselves point out, is probably the largest site nobody has ever heard of, pulling in more page views than even Flickr. Fotolog has 51 instances of memcached on 21 servers with 175G in use and 254G available. As a large successful photo-blogging site they have very demanding performance and scaling requirements. To meet those requirements they've developed a
5 0.63936472 927 high scalability-2010-10-26-Marrying memcached and NoSQL
Introduction: Memcached is one of the most common in-memory cache implementations. It was originally developed by Danga Interactive for LiveJournal, but is now used by many other sites as a side cache to speed up read-mostly operations. It gained popularity in the non-Java world, too, especially since it’s a language-neutral side cache for which few alternatives existed. As a side cache, memcached clients rely on the database as the system of record; the database is still used for write, update, and complex query operations. Since the memcached specification includes no query operations, memcached is not a database alternative, unlike most of the NoSQL offerings. This also excludes memcached from being a real solution for write scalability. As a result, many of the heavy sites started to move away from memcached and replace it with other NoSQL alternatives, as noted in a recent highscalability post, MySQL And Memcached: End Of An Era? The transition away from memcached to NoSQL
6 0.62228876 785 high scalability-2010-02-26-MySQL and Memcached: End of an Era?
7 0.5926224 389 high scalability-2008-09-23-How to Scale with Ruby on Rails
8 0.59013277 5 high scalability-2007-07-10-mixi.jp Architecture
9 0.58695966 603 high scalability-2009-05-19-Scaling Memcached: 500,000+ Operations-Second with a Single-Socket UltraSPARC T2
10 0.58626097 703 high scalability-2009-09-12-How Google Taught Me to Cache and Cash-In
11 0.58578891 836 high scalability-2010-06-04-Strategy: Cache Larger Chunks - Cache Hit Rate is a Bad Indicator
12 0.58407712 481 high scalability-2009-01-02-Strategy: Understanding Your Data Leads to the Best Scalability Solutions
13 0.58271688 359 high scalability-2008-07-29-Ehcache - A Java Distributed Cache
14 0.57915634 367 high scalability-2008-08-17-Strategy: Drop Memcached, Add More MySQL Servers
15 0.57116216 337 high scalability-2008-05-31-memcached and Storage of Friend list
16 0.57082671 673 high scalability-2009-08-07-Strategy: Break Up the Memcache Dog Pile
17 0.56803077 729 high scalability-2009-10-28-And the winner is: MySQL or Memcached or Tokyo Tyrant?
18 0.56578004 391 high scalability-2008-09-23-The 7 Stages of Scaling Web Apps
19 0.56562209 721 high scalability-2009-10-13-Why are Facebook, Digg, and Twitter so hard to scale?
20 0.55948031 512 high scalability-2009-02-14-Scaling Digg and Other Web Applications
topicId topicWeight
[(1, 0.222), (2, 0.122), (10, 0.112), (61, 0.027), (73, 0.268), (79, 0.079), (85, 0.044)]
simIndex simValue blogId blogTitle
1 0.93411654 333 high scalability-2008-05-28-Webinar: Designing and Implementing Scalable Applications with Memcached and MySQL
Introduction: The following technical Webinar could be of interest to the community. WHO: Farhan "Frank" Mashraqi, Director of Business Operations and Technical Strategy, Fotolog Inc Monty Taylor, Senior Consultant, Sun Microsystems Jimmy Guerrero, Sr Product Marketing Manager, Sun Microsystems - Database Group WHAT: Designing and Implementing Scalable Applications with Memcached and MySQL web presentation. WHEN: Thursday, May 29, 2008, 10:00 am PST, 1:00 pm EST, 18:00 GMT The presentation will be approximately 45 minutes long followed by Q&A. Check out the details here!
2 0.89161605 471 high scalability-2008-12-19-Gigaspaces curbs latency outliers with Java Real Time
Introduction: Today, most banks have migrated their internal software development from C/C++ to the Java language because of well-known advantages in development productivity (Java Platform), robustness & reliability (Garbage Collector) and platform independence (Java Bytecode). They may even have gotten better throughput performance through the use of standard architectures and application servers (Java Enterprise Edition). Among the few banking applications that have not been able to benefit yet from the Java revolution, you find the latency-critical applications connected to the trading floor. Why? Because of the unpredictable pauses introduced by the garbage collector, which result in significant jitter (variance of execution time). In this post, Frederic Pariente, Engineering Manager at Sun Microsystems, summarizes a case study on how Sun Real Time JVM and GigaSpaces were used in the context of a customer proof-of-concept this summer to ensure guaranteed latency per m
same-blog 3 0.86302871 192 high scalability-2007-12-25-IBMer Says LAMP Can't Scale
Introduction: It's time to do something a little different and for me that doesn't mean cutting off my hair and joining a monastery, nor does it mean buying a cherry red convertible (yet), it means doing a webinar! On December 14th, 2:00 PM - 3:00 PM EST, I'll be hosting What Should I Do? Choosing SQL, NoSQL or Both for Scalable Web Applications . The webinar is sponsored by VoltDB, but it will be completely vendor independent, as that's the only honor preserving and technically accurate way of doing these things. The webinar will run about 60 minutes, with 40 minutes of speechifying and 20 minutes for questions. The hashtag for the event on Twitter will be SQLNoSQL . I'll be monitoring that hashtag if you have any suggestions for the webinar or if you would like to ask questions during the webinar. The motivation for me to do the webinar was a talk I had with another audience member at the NoSQL Evening in Palo Alto . He said he came from a Java background and was confused ab
6 0.81749833 9 high scalability-2007-07-15-Blog: Occam’s Razor by Avinash Kaushik
7 0.80646217 1175 high scalability-2012-01-17-Paper: Feeding Frenzy: Selectively Materializing Users’ Event Feeds
8 0.80456597 1587 high scalability-2014-01-29-10 Things Bitly Should Have Monitored
9 0.79431921 1196 high scalability-2012-02-20-Berkeley DB Architecture - NoSQL Before NoSQL was Cool
10 0.76322562 284 high scalability-2008-03-19-RAD Lab is Creating a Datacenter Operating System
11 0.72965944 986 high scalability-2011-02-10-Database Isolation Levels And Their Effects on Performance and Scalability
12 0.72878027 980 high scalability-2011-01-28-Stuff The Internet Says On Scalability For January 28, 2011
13 0.70564824 624 high scalability-2009-06-10-Hive - A Petabyte Scale Data Warehouse using Hadoop
14 0.70087868 1521 high scalability-2013-09-23-Salesforce Architecture - How they Handle 1.3 Billion Transactions a Day
15 0.70043397 688 high scalability-2009-08-26-Hot Links for 2009-8-26
16 0.69894063 1482 high scalability-2013-06-26-Leveraging Cloud Computing at Yelp - 102 Million Monthly Vistors and 39 Million Reviews
17 0.69802791 434 high scalability-2008-10-30-Olio Web2.0 Toolkit - Evaluate Web Technologies and Tools
18 0.69697332 1068 high scalability-2011-06-27-TripAdvisor Architecture - 40M Visitors, 200M Dynamic Page Views, 30TB Data
19 0.69677716 1642 high scalability-2014-05-02-Stuff The Internet Says On Scalability For May 2nd, 2014
20 0.69673985 181 high scalability-2007-12-11-Hosting and CDN for startup video sharing site