high_scalability high_scalability-2009 high_scalability-2009-638 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Update 5: PlentyOfFish Update - 6 Billion Pageviews And 32 Billion Images A Month. Update 4: Jeff Atwood costs out Markus' scale up approach against a scale out approach and finds scale up wanting. The discussion in the comments is as interesting as the article. My guess is Markus doesn't want to rewrite his software to work across a scale out cluster so even if it's more expensive scale up works better for his needs. Update 3: POF now has 200 million images and serves 10,000 images per second. They'll be moving to a 250,000 IOPS RamSan to handle the load. Also upgraded to a core database machine with 512 GB of RAM, 32 CPUs, SQLServer 2008 and Windows 2008. Update 2: This seems to be a POF Peer1 love fest infomercial. It's pretty content free, but the production values are high. Lots of quirky sounds and fish swimming on the screen. Update: by Facebook standards Read/WriteWeb says POF is worth a cool one billion dollars. It helps to talk like Dr. Evil whe
sentIndex sentText sentNum sentScore
1 Update 3: POF now has 200 million images and serves 10,000 images per second. [sent-4, score-0.568]
2 PlentyOfFish is a hugely popular on-line dating system slammed by over 45 million visitors a month and 30+ million hits a day (500 - 600 pages per second). [sent-13, score-0.696]
3 All this is handled by one person, using a handful of servers, working a few hours a day, while making $6 million a year from Google ads. [sent-15, score-0.337]
4 2 billion page views/month, and 500,000 average unique logins per day. [sent-23, score-0.283]
5 Makes up to $10 million a year on Google ads working only two hours a day. [sent-26, score-0.336]
6 1 billion page views and 45 million visitors a month. [sent-29, score-0.566]
7 Approaching 64,000 simultaneous connections and 2 million page views per hour. [sent-36, score-0.685]
8 1 TB/day serving 171 million images through Akamai. [sent-38, score-0.344]
9 With 30 million page views a day you can make good money on advertising, even at a 5 - 10 cent CPM. [sent-45, score-0.629]
10 Load balancing - IIS arbitrarily limits the total connections to 64,000 so a load balancer was added to handle the large number of simultaneous connections. [sent-61, score-0.277]
11 And using ServerIron allowed advanced functionality like bot blocking and load balancing based on passed-on cookies, session data, and IP data. [sent-63, score-0.472]
12 NLB has an affinity option so a user always maps to a certain server; thus no external storage is used for session state, and if the server fails the user loses their state and must log in again. [sent-69, score-0.45]
13 If this state includes a shopping cart or other important data, this solution may be poor, but for a dating site it seems reasonable. [sent-70, score-0.326]
14 If you are doing over a million page views a day just write out the code to spit it out to the screen. [sent-82, score-0.566]
15 Going from one million to 12 million users was a big jump. [sent-124, score-0.398]
16 It's easy to sell a few million page views at high CPMs. [sent-158, score-0.583]
17 It's a LOT harder to sell billions of page views at high CPMs, as shown by Myspace and Facebook. [sent-159, score-0.384]
18 To generate 100 million a year as a free site is virtually impossible as you need too big a market. [sent-162, score-0.402]
19 Growing page views via Facebook for a dating site won't work. [sent-163, score-0.566]
20 Most of Facebook's page views are outside the US and you have to split 5-cent CPMs with Facebook. [sent-165, score-0.298]
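Sentence 12 above describes server affinity: each user is pinned to one web server, session state lives only in that server's memory, and a server failure forces a new login. The sketch below illustrates that behavior in Python. It is not NLB's actual algorithm: the hash-of-client-IP rule, the `AffinityBalancer` class, and the server names are all hypothetical, chosen only to make the trade-off concrete.

```python
import hashlib


def pick_server(client_ip: str, servers: list) -> str:
    """Map a client to a server deterministically (assumed rule: hash of IP)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


class AffinityBalancer:
    """Toy balancer: session state is kept only in each server's memory."""

    def __init__(self, servers):
        self.servers = list(servers)
        # One in-memory session table per server; no shared external store.
        self.sessions = {name: {} for name in self.servers}

    def route(self, client_ip):
        return pick_server(client_ip, self.servers)

    def login(self, client_ip, user):
        # The session is stored on whichever server the client maps to.
        self.sessions[self.route(client_ip)][client_ip] = user

    def current_user(self, client_ip):
        # Returns None when the routed server has no session for this client.
        return self.sessions[self.route(client_ip)].get(client_ip)

    def fail(self, server):
        # A failed server takes its in-memory sessions with it.
        self.servers.remove(server)
        del self.sessions[server]


balancer = AffinityBalancer(["web1", "web2", "web3"])
balancer.login("203.0.113.7", "alice")
print(balancer.current_user("203.0.113.7"))  # session found on the pinned server

balancer.fail(balancer.route("203.0.113.7"))
print(balancer.current_user("203.0.113.7"))  # session gone: user must log in again
```

Note that this naive modulo rule also remaps some users whose server is still healthy when the pool shrinks, so they lose state too. Whether that is acceptable depends on what the session holds, which is the point sentence 13 makes about shopping carts versus a dating site.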
wordName wordTfidf (topN-words)
[('pof', 0.322), ('markus', 0.318), ('million', 0.199), ('serveriron', 0.193), ('views', 0.163), ('dating', 0.15), ('images', 0.145), ('cpm', 0.144), ('nlb', 0.144), ('page', 0.135), ('site', 0.118), ('load', 0.1), ('australia', 0.096), ('session', 0.093), ('plentyoffish', 0.092), ('windows', 0.086), ('sell', 0.086), ('year', 0.085), ('bot', 0.081), ('per', 0.079), ('blocking', 0.077), ('clicks', 0.07), ('billion', 0.069), ('day', 0.069), ('facebook', 0.069), ('grow', 0.068), ('balancing', 0.068), ('ad', 0.068), ('canada', 0.067), ('user', 0.066), ('robin', 0.065), ('cents', 0.063), ('database', 0.063), ('people', 0.06), ('cdn', 0.06), ('servers', 0.059), ('used', 0.058), ('state', 0.058), ('ram', 0.056), ('dns', 0.056), ('simultaneous', 0.055), ('hire', 0.055), ('connections', 0.054), ('love', 0.054), ('balanced', 0.053), ('employees', 0.053), ('using', 0.053), ('top', 0.052), ('ads', 0.052), ('fails', 0.051)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 638 high scalability-2009-06-26-PlentyOfFish Architecture
Introduction: Update 5: PlentyOfFish Update - 6 Billion Pageviews And 32 Billion Images A Month. Update 4: Jeff Atwood costs out Markus' scale up approach against a scale out approach and finds scale up wanting. The discussion in the comments is as interesting as the article. My guess is Markus doesn't want to rewrite his software to work across a scale out cluster so even if it's more expensive scale up works better for his needs. Update 3: POF now has 200 million images and serves 10,000 images per second. They'll be moving to a 250,000 IOPS RamSan to handle the load. Also upgraded to a core database machine with 512 GB of RAM, 32 CPUs, SQLServer 2008 and Windows 2008. Update 2: This seems to be a POF Peer1 love fest infomercial. It's pretty content free, but the production values are high. Lots of quirky sounds and fish swimming on the screen. Update: by Facebook standards Read/WriteWeb says POF is worth a cool one billion dollars. It helps to talk like Dr. Evil whe
2 0.99461496 1361 high scalability-2012-11-22-Gone Fishin': PlentyOfFish Architecture
Introduction: Other than StackOverflow, PlentyOfFish is perhaps the most spectacular example of scale-up architectures working for what your average sane person would consider a large system. It doesn't hurt that it's also a sexy story. Update 5: PlentyOfFish Update - 6 Billion Pageviews And 32 Billion Images A Month. Update 4: Jeff Atwood costs out Markus' scale up approach against a scale out approach and finds scale up wanting. The discussion in the comments is as interesting as the article. My guess is Markus doesn't want to rewrite his software to work across a scale out cluster so even if it's more expensive scale up works better for his needs. Update 3: POF now has 200 million images and serves 10,000 images per second. They'll be moving to a 250,000 IOPS RamSan to handle the load. Also upgraded to a core database machine with 512 GB of RAM, 32 CPUs, SQLServer 2008 and Windows 2008. Update 2: This seems to be a POF Peer1 love fest infomercial. It's pretty cont
3 0.2648747 442 high scalability-2008-11-13-Plenty of Fish Says Scaling for Free Doesn't Pay
Introduction: Plenty of Fish CEO Markus Frind, famous nerd hero for making over $10 million a year from Google ads on a free dating site he made and ran all by himself, now sees a problem with the free model : The problem with free is that every time you double the size of your database the cost of maintaining the site grows 6 fold. I really underestimated how much resources it would take, I have one database table now that exceeds 3 billion records. The bigger you get as a free site the less money you make per visit and the more it costs to service a visit...There is really no money in being free and we have to start experimenting with other models now or we won’t be able to compete in 3 or 4 years. As one commenter succinctly put it: the “golden time” of AdSense is over . Time to look at costs. The POF architecture is to run scarily huge tables on single machines. They also buy and maintain their own SAN. So it seems scaling up is what is increasing costs and decreasing profits. I wo
4 0.25169623 1164 high scalability-2011-12-27-PlentyOfFish Update - 6 Billion Pageviews and 32 Billion Images a Month
Introduction: Markus has a short update on their PlentyOfFish Architecture. Impressive November statistics: 6 billion pageviews served; 32 billion images served; 6 million logins in one day; IM servers handle about 30 billion pageviews; 11 webservers (5 of which could be dropped). Hired first DBA in July. They currently have a handful of employees. All hosting/cdn costs combined are under $70k/month. Lesson: small organization, simple architecture, on raw hardware is still plenty profitable for PlentyOfFish. Related Articles: On HackerNews; 32 Billion images a month by Markus Frind.
5 0.21953717 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App
Introduction: Update: Jake in Does Django really scale better than Rails? thinks apps like FFS shouldn't need so much hardware to scale. In a short three months Friends for Sale (think Hot-or-Not with a market economy) grew to become a top 10 Facebook application handling 200 gorgeous requests per second and a stunning 300 million page views a month. They did all this using Ruby on Rails, two part-time developers, a cluster of a dozen machines, and a fairly standard architecture. How did Friends for Sale scale to sell all those beautiful people? And how much do you think your friends are worth on the open market? Site: http://www.facebook.com/apps/application.php?id=7019261521 Information Sources: Siqi Chen and Alexander Le, co-creators of Friends for Sale, answering my standard questionnaire; Virality on Facebook. The Platform: Ruby on Rails; CentOS 5 (64 bit); Capistrano - update and restart application servers; Memcached; MySQL; Nginx; Starling - distrib
6 0.21413569 106 high scalability-2007-10-02-Secrets to Fotolog's Scaling Success
7 0.20311198 1123 high scalability-2011-09-23-The Real News is Not that Facebook Serves Up 1 Trillion Pages a Month…
8 0.20285845 511 high scalability-2009-02-12-MySpace Architecture
9 0.17527309 274 high scalability-2008-03-12-YouTube Architecture
10 0.16786051 1359 high scalability-2012-11-15-Gone Fishin': Justin.Tv's Live Video Broadcasting Architecture
11 0.16773766 796 high scalability-2010-03-16-Justin.tv's Live Video Broadcasting Architecture
12 0.16642877 821 high scalability-2010-05-03-MocoSpace Architecture - 3 Billion Mobile Page Views a Month
13 0.16354755 67 high scalability-2007-08-17-What is the best hosting option?
14 0.16294521 1501 high scalability-2013-08-13-In Memoriam: Lavabit Architecture - Creating a Scalable Email Service
15 0.16057068 554 high scalability-2009-04-04-Digg Architecture
16 0.15493692 513 high scalability-2009-02-16-Handle 1 Billion Events Per Day Using a Memory Grid
17 0.15096831 1000 high scalability-2011-03-08-Medialets Architecture - Defeating the Daunting Mobile Device Data Deluge
18 0.1509268 808 high scalability-2010-04-12-Poppen.de Architecture
19 0.14511435 671 high scalability-2009-08-05-Stack Overflow Architecture
topicId topicWeight
[(0, 0.294), (1, 0.145), (2, -0.049), (3, -0.233), (4, -0.003), (5, -0.118), (6, -0.074), (7, 0.055), (8, 0.083), (9, 0.059), (10, 0.022), (11, -0.075), (12, -0.022), (13, 0.045), (14, -0.003), (15, 0.063), (16, -0.011), (17, 0.023), (18, 0.02), (19, 0.065), (20, 0.062), (21, 0.061), (22, -0.036), (23, -0.164), (24, 0.064), (25, -0.16), (26, -0.1), (27, 0.119), (28, 0.047), (29, 0.006), (30, -0.009), (31, 0.048), (32, -0.071), (33, -0.075), (34, 0.007), (35, -0.001), (36, 0.001), (37, 0.068), (38, 0.077), (39, -0.015), (40, -0.123), (41, 0.064), (42, -0.051), (43, 0.174), (44, -0.014), (45, 0.019), (46, 0.121), (47, 0.036), (48, 0.062), (49, 0.082)]
simIndex simValue blogId blogTitle
same-blog 1 0.96370661 638 high scalability-2009-06-26-PlentyOfFish Architecture
Introduction: Update 5: PlentyOfFish Update - 6 Billion Pageviews And 32 Billion Images A Month. Update 4: Jeff Atwood costs out Markus' scale up approach against a scale out approach and finds scale up wanting. The discussion in the comments is as interesting as the article. My guess is Markus doesn't want to rewrite his software to work across a scale out cluster so even if it's more expensive scale up works better for his needs. Update 3: POF now has 200 million images and serves 10,000 images per second. They'll be moving to a 250,000 IOPS RamSan to handle the load. Also upgraded to a core database machine with 512 GB of RAM, 32 CPUs, SQLServer 2008 and Windows 2008. Update 2: This seems to be a POF Peer1 love fest infomercial. It's pretty content free, but the production values are high. Lots of quirky sounds and fish swimming on the screen. Update: by Facebook standards Read/WriteWeb says POF is worth a cool one billion dollars. It helps to talk like Dr. Evil whe
2 0.96238977 1361 high scalability-2012-11-22-Gone Fishin': PlentyOfFish Architecture
Introduction: Other than StackOverflow, PlentyOfFish is perhaps the most spectacular example of scale-up architectures working for what your average sane person would consider a large system. It doesn't hurt that it's also a sexy story. Update 5: PlentyOfFish Update - 6 Billion Pageviews And 32 Billion Images A Month. Update 4: Jeff Atwood costs out Markus' scale up approach against a scale out approach and finds scale up wanting. The discussion in the comments is as interesting as the article. My guess is Markus doesn't want to rewrite his software to work across a scale out cluster so even if it's more expensive scale up works better for his needs. Update 3: POF now has 200 million images and serves 10,000 images per second. They'll be moving to a 250,000 IOPS RamSan to handle the load. Also upgraded to a core database machine with 512 GB of RAM, 32 CPUs, SQLServer 2008 and Windows 2008. Update 2: This seems to be a POF Peer1 love fest infomercial. It's pretty cont
3 0.89651924 1164 high scalability-2011-12-27-PlentyOfFish Update - 6 Billion Pageviews and 32 Billion Images a Month
Introduction: Markus has a short update on their PlentyOfFish Architecture. Impressive November statistics: 6 billion pageviews served; 32 billion images served; 6 million logins in one day; IM servers handle about 30 billion pageviews; 11 webservers (5 of which could be dropped). Hired first DBA in July. They currently have a handful of employees. All hosting/cdn costs combined are under $70k/month. Lesson: small organization, simple architecture, on raw hardware is still plenty profitable for PlentyOfFish. Related Articles: On HackerNews; 32 Billion images a month by Markus Frind.
4 0.77865052 52 high scalability-2007-08-01-Product: Memcached
Introduction: Memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load. Danga Interactive developed memcached to enhance the speed of LiveJournal.com, a site which was already doing 20 million+ dynamic page views per day for 1 million users with a bunch of webservers and a bunch of database servers. memcached dropped the database load to almost nothing, yielding faster page load times for users, better resource utilization, and faster access to the databases on a memcache miss.
5 0.77217656 133 high scalability-2007-10-26-How Gravatar scales on WordPress.com hardware
Introduction: Automattic recently purchased Gravatar and has switched the server onto its hosting platform. WordPress.com hosts over 1.7 million blogs with well over 60,000 new posts submitted each day, generating 10 - 12 million page views per day. Barry on WordPress.com has a great post on the changes they've introduced to help Gravatar scale.
6 0.77155334 106 high scalability-2007-10-02-Secrets to Fotolog's Scaling Success
7 0.70816278 511 high scalability-2009-02-12-MySpace Architecture
8 0.70452785 711 high scalability-2009-09-22-How Ravelry Scales to 10 Million Requests Using Rails
9 0.69939381 513 high scalability-2009-02-16-Handle 1 Billion Events Per Day Using a Memory Grid
10 0.69919246 788 high scalability-2010-03-04-How MySpace Tested Their Live Site with 1 Million Concurrent Users
11 0.68627119 356 high scalability-2008-07-22-Scaling Bumper Sticker: A 1 Billion Page Per Month Facebook RoR App
12 0.6821658 150 high scalability-2007-11-12-Slashdot Architecture - How the Old Man of the Internet Learned to Scale
13 0.68000233 965 high scalability-2010-12-29-Pinboard.in Architecture - Pay to Play to Keep a System Small
14 0.67342776 928 high scalability-2010-10-26-Scaling DISQUS to 75 Million Comments and 17,000 RPS
16 0.67102444 442 high scalability-2008-11-13-Plenty of Fish Says Scaling for Free Doesn't Pay
17 0.65756404 519 high scalability-2009-02-23-Database Sharding at Netlog, with MySQL and PHP
18 0.65530258 794 high scalability-2010-03-11-What would you like to ask Justin.tv?
19 0.65518171 304 high scalability-2008-04-19-How to build a real-time analytics system?
20 0.65388042 1108 high scalability-2011-08-31-Pud is the Anti-Stack - Windows, CFML, Dropbox, Xeround, JungleDisk, ELB
topicId topicWeight
[(1, 0.118), (2, 0.196), (5, 0.012), (10, 0.027), (30, 0.015), (33, 0.031), (40, 0.017), (61, 0.071), (77, 0.03), (79, 0.082), (85, 0.294), (94, 0.032)]
simIndex simValue blogId blogTitle
1 0.97860914 1049 high scalability-2011-05-31-Awesome List of Advanced Distributed Systems Papers
Introduction: As part of Dr. Indranil Gupta's CS 525 Spring 2011 Advanced Distributed Systems class, he has collected an incredible list of resources on distributed systems. His research group is also doing some interesting work. The various topics include: Before there Were Clouds, Cloud Computing, P2P Systems, Basic Distributed Computing Concepts, Sensor Networks, Overlays and DHTs, Cloud Programming, Cloud Scheduling, Key-Value Stores, Storage, Sensor Net Routing, Geo-Distribution, P2P Apps, In-network processing, Epidemics, Probabilistic Membership Protocols, Distributed Monitoring and Management, Publish-Subscribe/CDNs, Measurement Studies, Old Wine: Stale or Vintage?, In Byzantium, Cloud Pricing, Other Industrial Systems, Structure of Networks, Completing the Circle, Green Clouds, Distributed Debugging, Flash!, The Middle or the End?, Availability-Aware Systems, Design Methodologies, Handling Stress, Sources of unreliability in networks, Handling Stress, Selfish algorithms, Securi
2 0.9768219 143 high scalability-2007-11-06-Product: ChironFS
Introduction: If you are trying to create highly available file systems, especially across data centers, then ChironFS is one potential solution. It's relatively new, so there aren't lots of experience reports, but it looks worth considering. What is ChironFS and how does it work? Adapted from the ChironFS website: The Chiron Filesystem is a Fuse based filesystem that frees you from single points of failure. Its main purpose is to guarantee filesystem availability using replication. But it isn't a RAID implementation. RAID replicates DEVICES not FILESYSTEMS. Why not just use RAID over some network block device? Because it is a block device and if one server mounts that device in RW mode, no other server will be able to mount it in RW mode. Any real network may have many servers and offer a variety of services. Keeping everything running can become a real nightmare!
3 0.96896279 1039 high scalability-2011-05-12-Paper: Mind the Gap: Reconnecting Architecture and OS Research
Introduction: Mind the Gap: Reconnecting Architecture and OS Research is a paper presented at HotOS XIII, the place where researchers talk about making potential futures happen. For a great overview of the conference take a look at this article by Matt Welsh: Conference report: HotOS 2011 in Napa. In the VM/cloud age I question the need of having an OS at all, programs can compile directly against "raw" hardware, but the paper does a good job of trying to figure out the new role operating systems can play in the future. We've been in a long OS holding pattern, so long that we've seen the rise of PaaS vendors skipping the OS level abstraction completely, but there's room for a middle ground between legacy time sharing systems of the past and service level APIs that are but one possible future. Introduction: For too long, operating systems researchers and developers have pretty much taken whatever computer architects have dished out. With occasional exceptions (e.g., virtualization support)
4 0.96765018 102 high scalability-2007-09-27-Product: Sequoia Database Clustering Technology
Introduction: Sequoia is a transparent middleware solution offering clustering, load balancing and failover services for any database. Sequoia is the continuation of the C-JDBC project. The database is distributed and replicated among several nodes and Sequoia balances the queries among these nodes. Sequoia handles node and network failures with transparent failover. It also provides support for hot recovery, online maintenance operations and online upgrades. Features in a nutshell No modification of existing applications or databases. Operational with any database providing a JDBC driver. High availability provided by advanced RAIDb technology. Transparent failover and recovery capabilities. Performance scalability with unique load balancing and query result caching features. Integrated JMX-based administration and monitoring. 100% Java implementation allowing portability across platforms with a JRE 1.4 or greater. Open source licensed under Apache v2 license. Professi
5 0.96646452 820 high scalability-2010-05-03-100 Node Hazelcast cluster on Amazon EC2
Introduction: Deploying, running and monitoring an application on a big cluster is a challenging task. Recently the Hazelcast team deployed a demo application on the Amazon EC2 platform to show how a Hazelcast p2p cluster scales, and screen recorded the entire process from deployment to monitoring. Hazelcast is an open source (Apache License), transactional, distributed caching solution for Java. It is a little more than a cache though, as it provides distributed implementations of map, multimap, queue, topic, lock and executor service. Details of running a 100 node Hazelcast cluster on Amazon EC2 can be found here. Make sure to watch the screencast!
6 0.95087296 1500 high scalability-2013-08-12-100 Curse Free Lessons from Gordon Ramsay on Building Great Software
7 0.94549358 191 high scalability-2007-12-23-Synchronizing Memcached application
8 0.94386661 1032 high scalability-2011-05-02-Stack Overflow Makes Slow Pages 100x Faster by Simple SQL Tuning
9 0.94355744 1239 high scalability-2012-05-04-Stuff The Internet Says On Scalability For May 4, 2012
10 0.94129401 1577 high scalability-2014-01-13-NYTimes Architecture: No Head, No Master, No Single Point of Failure
11 0.93985611 59 high scalability-2007-08-04-Try Squid as a Reverse Proxy
12 0.92787611 492 high scalability-2009-01-16-Database Sharding for startups
13 0.92646265 447 high scalability-2008-11-19-High Definition Video Delivery on the Web?
14 0.91813534 1024 high scalability-2011-04-15-Stuff The Internet Says On Scalability For April 15, 2011
15 0.91524857 1592 high scalability-2014-02-07-Stuff The Internet Says On Scalability For February 7th, 2014
same-blog 16 0.91444033 638 high scalability-2009-06-26-PlentyOfFish Architecture
17 0.91352654 1361 high scalability-2012-11-22-Gone Fishin': PlentyOfFish Architecture
18 0.90654588 118 high scalability-2007-10-09-High Load on production Webservers after Sourcecode sync
19 0.89268625 1327 high scalability-2012-09-21-Stuff The Internet Says On Scalability For September 21, 2012
20 0.8918041 1080 high scalability-2011-07-15-Stuff The Internet Says On Scalability For July 15, 2011