high_scalability high_scalability-2012 high_scalability-2012-1361 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Other than StackOverflow, PlentyOfFish is perhaps the most spectacular example of a scale-up architecture working for what your average sane person would consider a large system. It doesn't hurt that it's also a sexy story. Update 5: PlentyOfFish Update - 6 Billion Pageviews And 32 Billion Images A Month. Update 4: Jeff Atwood costs out Markus' scale up approach against a scale out approach and finds scale up wanting. The discussion in the comments is as interesting as the article. My guess is Markus doesn't want to rewrite his software to work across a scale out cluster, so even if it's more expensive, scale up works better for his needs. Update 3: POF now has 200 million images and serves 10,000 images per second. They'll be moving to a 250,000 IOPS RamSan to handle the load. They also upgraded to a core database machine with 512 GB of RAM, 32 CPUs, SQLServer 2008 and Windows 2008. Update 2: This seems to be a POF Peer1 love fest infomercial. It's pretty cont
sentIndex sentText sentNum sentScore
1 Update 3: POF now has 200 million images and serves 10,000 images per second. [sent-6, score-0.561]
2 PlentyOfFish is a hugely popular on-line dating system slammed by over 45 million visitors a month and 30+ million hits a day (500 - 600 pages per second). [sent-15, score-0.689]
3 All this is handled by one person, using a handful of servers, working a few hours a day, while making $6 million a year from Google ads. [sent-17, score-0.333]
4 2 billion page views/month, and 500,000 average unique logins per day. [sent-25, score-0.281]
5 Makes up to $10 million a year on Google ads working only two hours a day. [sent-28, score-0.332]
6 1 billion page views and 45 million visitors a month. [sent-31, score-0.561]
7 Approaching 64,000 simultaneous connections and 2 million page views per hour. [sent-38, score-0.677]
8 1 TB/day serving 171 million images through Akamai. [sent-40, score-0.34]
9 With 30 million page views a day you can make good money on advertising, even at 5 - 10 cents CPM. [sent-47, score-0.623]
10 And using ServerIron allowed advanced functionality like bot blocking and load balancing based on passed-on cookies, session data, and IP data. [sent-65, score-0.466]
11 - NLB has an affinity option so a user always maps to a certain server, thus no external storage is used for session state and if the server fails the user loses their state and must relogin. [sent-71, score-0.444]
12 If this state includes a shopping cart or other important data, this solution may be poor, but for a dating site it seems reasonable. [sent-72, score-0.322]
13 If you are doing over a million page views a day just write out the code to spit it out to the screen. [sent-84, score-0.56]
14 - If you call the database 20 times per page view you are screwed no matter what you do. [sent-101, score-0.274]
15 Going from one million to 12 million users was a big jump. [sent-126, score-0.394]
16 It's easy to sell a few million page views at high CPMs. [sent-160, score-0.577]
17 It's a LOT harder to sell billions of page views at high CPMs, as shown by Myspace and Facebook. [sent-161, score-0.38]
18 To generate 100 million a year as a free site is virtually impossible as you need too big a market. [sent-164, score-0.397]
19 Growing page views via Facebook for a dating site won't work. [sent-165, score-0.56]
20 Most of Facebook's page views are outside the US and you have to split 5 cent CPMs with Facebook. [sent-167, score-0.295]
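The traffic and advertising figures quoted in the sentences above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes one ad impression per page view, which the source does not state. Note that 30 million page views a day averages out to roughly 347 per second, so the quoted 500 - 600 pages per second would be consistent with a peak rate rather than a daily average, although the source does not say which it is.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_per_second(events_per_day: float) -> float:
    """Average events per second, assuming traffic is spread evenly over a day."""
    return events_per_day / SECONDS_PER_DAY


def daily_ad_revenue(page_views_per_day: float, cpm_dollars: float,
                     ads_per_page: int = 1) -> float:
    """Daily revenue in dollars; CPM is the price per 1,000 ad impressions.

    ads_per_page=1 is an assumption made for illustration, not a figure
    taken from the source.
    """
    impressions = page_views_per_day * ads_per_page
    return impressions / 1000 * cpm_dollars


if __name__ == "__main__":
    views = 30_000_000  # "30 million page views a day" from the source
    print(f"average page views per second: {average_rate_per_second(views):.1f}")
    for cpm in (0.05, 0.10):  # "5 - 10 cents" CPM from the source
        per_day = daily_ad_revenue(views, cpm)
        print(f"CPM ${cpm:.2f}: ${per_day:,.0f}/day, ${per_day * 365:,.0f}/year")
```

Under these assumptions a single ad slot at 5 - 10 cents CPM yields on the order of half a million to a million dollars a year, so the multi-million dollar revenue figures quoted above would imply more impressions per page or a higher effective CPM.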
wordName wordTfidf (topN-words)
[('pof', 0.318), ('markus', 0.315), ('million', 0.197), ('serveriron', 0.19), ('views', 0.161), ('dating', 0.149), ('images', 0.143), ('cpm', 0.143), ('nlb', 0.143), ('plentyoffish', 0.136), ('page', 0.134), ('site', 0.116), ('load', 0.099), ('australia', 0.095), ('session', 0.092), ('windows', 0.085), ('sell', 0.085), ('year', 0.084), ('bot', 0.08), ('per', 0.078), ('blocking', 0.076), ('clicks', 0.069), ('billion', 0.069), ('day', 0.068), ('facebook', 0.068), ('grow', 0.067), ('balancing', 0.067), ('ad', 0.067), ('canada', 0.067), ('user', 0.065), ('robin', 0.065), ('cents', 0.063), ('database', 0.062), ('people', 0.059), ('cdn', 0.059), ('servers', 0.058), ('used', 0.057), ('state', 0.057), ('ram', 0.055), ('dns', 0.055), ('simultaneous', 0.054), ('hire', 0.054), ('connections', 0.053), ('love', 0.053), ('balanced', 0.052), ('employees', 0.052), ('using', 0.052), ('top', 0.052), ('ads', 0.051), ('fails', 0.051)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000004 1361 high scalability-2012-11-22-Gone Fishin': PlentyOfFish Architecture
2 0.99461496 638 high scalability-2009-06-26-PlentyOfFish Architecture
Introduction: Update 5: PlentyOfFish Update - 6 Billion Pageviews And 32 Billion Images A Month. Update 4: Jeff Atwood costs out Markus' scale up approach against a scale out approach and finds scale up wanting. The discussion in the comments is as interesting as the article. My guess is Markus doesn't want to rewrite his software to work across a scale out cluster, so even if it's more expensive, scale up works better for his needs. Update 3: POF now has 200 million images and serves 10,000 images per second. They'll be moving to a 250,000 IOPS RamSan to handle the load. They also upgraded to a core database machine with 512 GB of RAM, 32 CPUs, SQLServer 2008 and Windows 2008. Update 2: This seems to be a POF Peer1 love fest infomercial. It's pretty content free, but the production values are high. Lots of quirky sounds and fish swimming on the screen. Update: by Facebook standards Read/WriteWeb says POF is worth a cool one billion dollars. It helps to talk like Dr. Evil whe
3 0.26363152 442 high scalability-2008-11-13-Plenty of Fish Says Scaling for Free Doesn't Pay
Introduction: Plenty of Fish CEO Markus Frind, famous nerd hero for making over $10 million a year from Google ads on a free dating site he made and ran all by himself, now sees a problem with the free model: The problem with free is that every time you double the size of your database the cost of maintaining the site grows 6 fold. I really underestimated how much resources it would take, I have one database table now that exceeds 3 billion records. The bigger you get as a free site the less money you make per visit and the more it costs to service a visit... There is really no money in being free and we have to start experimenting with other models now or we won’t be able to compete in 3 or 4 years. As one commenter succinctly put it: the “golden time” of AdSense is over. Time to look at costs. The POF architecture is to run scarily huge tables on single machines. They also buy and maintain their own SAN. So it seems scaling up is what is increasing costs and decreasing profits. I wo
4 0.2478753 1164 high scalability-2011-12-27-PlentyOfFish Update - 6 Billion Pageviews and 32 Billion Images a Month
Introduction: Markus has a short update on their PlentyOfFish Architecture. Impressive November statistics: 6 billion pageviews served; 32 billion images served; 6 million logins in one day; IM servers handle about 30 billion pageviews; 11 webservers (5 of which could be dropped); hired first DBA in July. They currently have a handful of employees. All hosting/cdn costs combined are under $70k/month. Lesson: small organization, simple architecture, on raw hardware is still plenty profitable for PlentyOfFish. Related Articles: On HackerNews; 32 Billion images a month by Markus Frind.
5 0.21745676 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App
Introduction: Update: Jake in Does Django really scale better than Rails? thinks apps like FFS shouldn't need so much hardware to scale. In a short three months Friends for Sale (think Hot-or-Not with a market economy) grew to become a top 10 Facebook application handling 200 gorgeous requests per second and a stunning 300 million page views a month. They did all this using Ruby on Rails, two part-time developers, a cluster of a dozen machines, and a fairly standard architecture. How did Friends for Sale scale to sell all those beautiful people? And how much do you think your friends are worth on the open market? Site: http://www.facebook.com/apps/application.php?id=7019261521 Information Sources: Siqi Chen and Alexander Le, co-creators of Friends for Sale, answering my standard questionnaire. Virality on Facebook. The Platform: Ruby on Rails; CentOS 5 (64 bit); Capistrano - update and restart application servers; Memcached; MySQL; Nginx; Starling - distrib
6 0.21219884 106 high scalability-2007-10-02-Secrets to Fotolog's Scaling Success
7 0.20277312 1123 high scalability-2011-09-23-The Real News is Not that Facebook Serves Up 1 Trillion Pages a Month…
8 0.20080183 511 high scalability-2009-02-12-MySpace Architecture
9 0.17364793 274 high scalability-2008-03-12-YouTube Architecture
10 0.16747051 1359 high scalability-2012-11-15-Gone Fishin': Justin.Tv's Live Video Broadcasting Architecture
11 0.16735955 796 high scalability-2010-03-16-Justin.tv's Live Video Broadcasting Architecture
12 0.16476557 821 high scalability-2010-05-03-MocoSpace Architecture - 3 Billion Mobile Page Views a Month
13 0.16383255 1501 high scalability-2013-08-13-In Memoriam: Lavabit Architecture - Creating a Scalable Email Service
14 0.16174394 67 high scalability-2007-08-17-What is the best hosting option?
15 0.16047792 554 high scalability-2009-04-04-Digg Architecture
16 0.15322828 513 high scalability-2009-02-16-Handle 1 Billion Events Per Day Using a Memory Grid
17 0.15113777 1000 high scalability-2011-03-08-Medialets Architecture - Defeating the Daunting Mobile Device Data Deluge
18 0.15003848 808 high scalability-2010-04-12-Poppen.de Architecture
20 0.14548577 750 high scalability-2009-12-16-Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud
topicId topicWeight
[(0, 0.294), (1, 0.146), (2, -0.048), (3, -0.23), (4, -0.003), (5, -0.119), (6, -0.074), (7, 0.055), (8, 0.082), (9, 0.059), (10, 0.021), (11, -0.074), (12, -0.022), (13, 0.046), (14, -0.0), (15, 0.062), (16, -0.01), (17, 0.024), (18, 0.018), (19, 0.066), (20, 0.062), (21, 0.061), (22, -0.036), (23, -0.164), (24, 0.064), (25, -0.16), (26, -0.1), (27, 0.117), (28, 0.048), (29, 0.007), (30, -0.008), (31, 0.048), (32, -0.071), (33, -0.076), (34, 0.006), (35, 0.002), (36, 0.002), (37, 0.067), (38, 0.077), (39, -0.017), (40, -0.121), (41, 0.065), (42, -0.05), (43, 0.172), (44, -0.014), (45, 0.019), (46, 0.121), (47, 0.035), (48, 0.063), (49, 0.081)]
simIndex simValue blogId blogTitle
1 0.96427459 638 high scalability-2009-06-26-PlentyOfFish Architecture
Introduction: Update 5: PlentyOfFish Update - 6 Billion Pageviews And 32 Billion Images A Month. Update 4: Jeff Atwood costs out Markus' scale up approach against a scale out approach and finds scale up wanting. The discussion in the comments is as interesting as the article. My guess is Markus doesn't want to rewrite his software to work across a scale out cluster, so even if it's more expensive, scale up works better for his needs. Update 3: POF now has 200 million images and serves 10,000 images per second. They'll be moving to a 250,000 IOPS RamSan to handle the load. They also upgraded to a core database machine with 512 GB of RAM, 32 CPUs, SQLServer 2008 and Windows 2008. Update 2: This seems to be a POF Peer1 love fest infomercial. It's pretty content free, but the production values are high. Lots of quirky sounds and fish swimming on the screen. Update: by Facebook standards Read/WriteWeb says POF is worth a cool one billion dollars. It helps to talk like Dr. Evil whe
same-blog 2 0.96307623 1361 high scalability-2012-11-22-Gone Fishin': PlentyOfFish Architecture
3 0.89621222 1164 high scalability-2011-12-27-PlentyOfFish Update - 6 Billion Pageviews and 32 Billion Images a Month
Introduction: Markus has a short update on their PlentyOfFish Architecture. Impressive November statistics: 6 billion pageviews served; 32 billion images served; 6 million logins in one day; IM servers handle about 30 billion pageviews; 11 webservers (5 of which could be dropped); hired first DBA in July. They currently have a handful of employees. All hosting/cdn costs combined are under $70k/month. Lesson: small organization, simple architecture, on raw hardware is still plenty profitable for PlentyOfFish. Related Articles: On HackerNews; 32 Billion images a month by Markus Frind.
4 0.77745831 52 high scalability-2007-08-01-Product: Memcached
Introduction: Memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load. Danga Interactive developed memcached to enhance the speed of LiveJournal.com, a site which was already doing 20 million+ dynamic page views per day for 1 million users with a bunch of webservers and a bunch of database servers. memcached dropped the database load to almost nothing, yielding faster page load times for users, better resource utilization, and faster access to the databases on a memcache miss.
5 0.77148628 106 high scalability-2007-10-02-Secrets to Fotolog's Scaling Success
Introduction: Fotolog, a social blogging site centered around photos, grew from about 300 thousand users in 2004 to over 11 million users in 2007. Though they initially experienced the inevitable pains of rapid growth, they overcame their problems and now manage over 300 million photos, with 800,000 new photos added each day. Generating all that fabulous content are 20 million unique monthly visitors and a volunteer army of 30,000 new users each day. They did so well a very impressed suitor bought them out for a cool $90 million. That's scale meets success by anyone's standards. How did they do it? Site: http://www.fotolog.com Information Sources: Scaling the World's Largest Photo Blogging Community; Congrats to Fotolog on $90mm sale to Hi-Media; Fotolog overtaking Flickr?; Fotolog Hits 11 Million Members and 300 Million Photos Posted; Site of the Week: Fotolog.com by PC Magazine; CEO John Borthwick's Blog; DBA Frank Mash's Blog; Fotolog, lessons learnt by John B
6 0.77123237 133 high scalability-2007-10-26-How Gravatar scales on WordPress.com hardware
7 0.70808727 511 high scalability-2009-02-12-MySpace Architecture
8 0.70618081 711 high scalability-2009-09-22-How Ravelry Scales to 10 Million Requests Using Rails
9 0.70080459 788 high scalability-2010-03-04-How MySpace Tested Their Live Site with 1 Million Concurrent Users
10 0.69991553 513 high scalability-2009-02-16-Handle 1 Billion Events Per Day Using a Memory Grid
11 0.68650609 356 high scalability-2008-07-22-Scaling Bumper Sticker: A 1 Billion Page Per Month Facebook RoR App
12 0.68319499 150 high scalability-2007-11-12-Slashdot Architecture - How the Old Man of the Internet Learned to Scale
13 0.68180388 965 high scalability-2010-12-29-Pinboard.in Architecture - Pay to Play to Keep a System Small
14 0.67306411 442 high scalability-2008-11-13-Plenty of Fish Says Scaling for Free Doesn't Pay
15 0.67211062 928 high scalability-2010-10-26-Scaling DISQUS to 75 Million Comments and 17,000 RPS
17 0.65853262 519 high scalability-2009-02-23-Database Sharding at Netlog, with MySQL and PHP
18 0.65839767 794 high scalability-2010-03-11-What would you like to ask Justin.tv?
19 0.65700698 304 high scalability-2008-04-19-How to build a real-time analytics system?
20 0.65571946 1108 high scalability-2011-08-31-Pud is the Anti-Stack - Windows, CFML, Dropbox, Xeround, JungleDisk, ELB
topicId topicWeight
[(1, 0.118), (2, 0.193), (5, 0.012), (10, 0.026), (30, 0.015), (33, 0.033), (40, 0.017), (61, 0.071), (77, 0.029), (79, 0.087), (85, 0.291), (94, 0.033)]
simIndex simValue blogId blogTitle
1 0.97748196 143 high scalability-2007-11-06-Product: ChironFS
Introduction: If you are trying to create highly available file systems, especially across data centers, then ChironFS is one potential solution. It's relatively new, so there aren't lots of experience reports, but it looks worth considering. What is ChironFS and how does it work? Adapted from the ChironFS website: The Chiron Filesystem is a Fuse based filesystem that frees you from single points of failure. Its main purpose is to guarantee filesystem availability using replication. But it isn't a RAID implementation. RAID replicates DEVICES, not FILESYSTEMS. Why not just use RAID over some network block device? Because it is a block device, and if one server mounts that device in RW mode, no other server will be able to mount it in RW mode. Any real network may have many servers and offer a variety of services. Keeping everything running can become a real nightmare!
2 0.97629118 1049 high scalability-2011-05-31-Awesome List of Advanced Distributed Systems Papers
Introduction: As part of Dr. Indranil Gupta 's CS 525 Spring 2011 Advanced Distributed Systems class, he has collected an incredible list of resources on distributed systems . His research group is also doing some interesting work. The various topics include: Before there Were Clouds, Cloud Computing, P2P Systems, Basic Distributed Computing Concepts, Sensor Networks, Overlays and DHTs, Cloud Programming, Cloud Scheduling, Key-Value Stores, Storage, Sensor Net Routing, Geo-Distribution, P2P Apps, In-network processing, Epidemics, Probabilistic Membership Protocols, Distributed Monitoring and Management, Publish-Subscribe/CDNs, Measurement Studies, Old Wine: Stale or Vintage?, In Byzantium, Cloud Pricing, Other Industrial Systems, Structure of Networks, Completing the Circle, Green Clouds, Distributed Debugging, Flash!, The Middle or the End?, Availability-Aware Systems, Design Methodologies, Handling Stress, Sources of unreliability in networks, Handling Stress, Selfish algorithms, Securi
3 0.96814173 1039 high scalability-2011-05-12-Paper: Mind the Gap: Reconnecting Architecture and OS Research
Introduction: Mind the Gap: Reconnecting Architecture and OS Research is a paper presented at HotOS XIII, the place where researchers talk about making potential futures happen. For a great overview of the conference take a look at this article by Matt Welsh: Conference report: HotOS 2011 in Napa. In the VM/cloud age I question the need for having an OS at all, since programs can compile directly against "raw" hardware, but the paper does a good job of trying to figure out the new role operating systems can play in the future. We've been in a long OS holding pattern, so long that we've seen the rise of PaaS vendors skipping the OS level abstraction completely, but there's room for a middle ground between legacy time sharing systems of the past and service level APIs that are but one possible future. Introduction: For too long, operating systems researchers and developers have pretty much taken whatever computer architects have dished out. With occasional exceptions (e.g., virtualization support)
4 0.96757084 102 high scalability-2007-09-27-Product: Sequoia Database Clustering Technology
Introduction: Sequoia is a transparent middleware solution offering clustering, load balancing and failover services for any database. Sequoia is the continuation of the C-JDBC project. The database is distributed and replicated among several nodes and Sequoia balances the queries among these nodes. Sequoia handles node and network failures with transparent failover. It also provides support for hot recovery, online maintenance operations and online upgrades. Features in a nutshell No modification of existing applications or databases. Operational with any database providing a JDBC driver. High availability provided by advanced RAIDb technology. Transparent failover and recovery capabilities. Performance scalability with unique load balancing and query result caching features. Integrated JMX-based administration and monitoring. 100% Java implementation allowing portability across platforms with a JRE 1.4 or greater. Open source licensed under Apache v2 license. Professi
5 0.96638012 820 high scalability-2010-05-03-100 Node Hazelcast cluster on Amazon EC2
Introduction: Deploying, running and monitoring an application on a big cluster is a challenging task. Recently the Hazelcast team deployed a demo application on the Amazon EC2 platform to show how a Hazelcast p2p cluster scales, and screen recorded the entire process from deployment to monitoring. Hazelcast is an open source (Apache License), transactional, distributed caching solution for Java. It is a little more than a cache, though, as it provides distributed implementations of map, multimap, queue, topic, lock and executor service. Details of running a 100 node Hazelcast cluster on Amazon EC2 can be found here. Make sure to watch the screencast!
6 0.95198405 1500 high scalability-2013-08-12-100 Curse Free Lessons from Gordon Ramsay on Building Great Software
7 0.94398397 1032 high scalability-2011-05-02-Stack Overflow Makes Slow Pages 100x Faster by Simple SQL Tuning
8 0.94380188 1239 high scalability-2012-05-04-Stuff The Internet Says On Scalability For May 4, 2012
9 0.94164157 191 high scalability-2007-12-23-Synchronizing Memcached application
10 0.94152898 1577 high scalability-2014-01-13-NYTimes Architecture: No Head, No Master, No Single Point of Failure
11 0.93671823 59 high scalability-2007-08-04-Try Squid as a Reverse Proxy
12 0.93044323 447 high scalability-2008-11-19-High Definition Video Delivery on the Web?
13 0.92792803 492 high scalability-2009-01-16-Database Sharding for startups
14 0.92018151 1024 high scalability-2011-04-15-Stuff The Internet Says On Scalability For April 15, 2011
15 0.91663384 1592 high scalability-2014-02-07-Stuff The Internet Says On Scalability For February 7th, 2014
16 0.91547608 638 high scalability-2009-06-26-PlentyOfFish Architecture
same-blog 17 0.91473383 1361 high scalability-2012-11-22-Gone Fishin': PlentyOfFish Architecture
18 0.90746683 118 high scalability-2007-10-09-High Load on production Webservers after Sourcecode sync
19 0.8940835 1327 high scalability-2012-09-21-Stuff The Internet Says On Scalability For September 21, 2012
20 0.89292717 1080 high scalability-2011-07-15-Stuff The Internet Says On Scalability For July 15, 2011