high_scalability high_scalability-2009 high_scalability-2009-511 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Update: Presentation: Behind the Scenes at MySpace.com. Dan Farino, Chief Systems Architect at MySpace, shares details of some of MySpace's cool internal operations tools. MySpace.com is one of the fastest growing sites on the Internet with 65 million subscribers and 260,000 new users registering each day. Often criticized for poor performance, MySpace has had to tackle scalability issues few other sites have faced. How did they do it? Site: http://myspace.com Information Sources Presentation: Behind the Scenes at MySpace.com Inside MySpace.com Platform ASP.NET 2.0 Windows IIS SQL Server What's Inside? 300 million users. Pushes 100 gigabits/second to the internet. 10Gb/sec is HTML content. 4,500+ web servers running Windows 2003/IIS 6.0/ASP.NET. 1,200+ cache servers running 64-bit Windows 2003. 16GB of objects cached in RAM. 500+ database servers running 64-bit Windows and SQL Server 2005. MySpace processes 1.5 Billion page views per day and
sentIndex sentText sentNum sentScore
1 MySpace.com is one of the fastest growing sites on the Internet with 65 million subscribers and 260,000 new users registering each day. [sent-5, score-0.512]
2 1,200+ cache servers running 64-bit Windows 2003. [sent-20, score-0.203]
3 500+ database servers running 64-bit Windows and SQL Server 2005. [sent-22, score-0.255]
4 3 million concurrent users during the day Membership Milestones: - 500,000 Users: A Simple Architecture Stumbles - 1 Million Users: Vertical Partitioning Solves Scalability Woes - 3 Million Users: Scale-Out Wins Over Scale-Up - 9 Million Users: Site Migrates to ASP.NET, Adds Virtual Storage [sent-25, score-0.369]
5 - 26 Million Users: MySpace Embraces 64-Bit Technology 500,000 accounts was too much load for two web servers and a single database. [sent-26, score-0.338]
6 At 1-2 Million Accounts - They used a database architecture built around the concept of vertical partitioning, with separate databases for parts of the website that served different functions such as the log-in screen, user profiles and blogs. [sent-27, score-0.477]
7 - The vertical partitioning scheme helped divide up the workload for database reads and writes alike, and when users demanded a new feature, MySpace would put a new database online to support it (see the routing sketch after this table). [sent-28, score-0.721]
8 - MySpace switched from using storage devices directly attached to its database servers to a storage area network (SAN), in which a pool of disk storage devices are tied together by a high-speed, specialized network, and the databases connect to the SAN. [sent-29, score-0.678]
9 At 3 Million Accounts - the vertical partitioning solution didn't last because they replicated some horizontal information like user accounts across all vertical slices. [sent-31, score-0.737]
10 With so many replications one would fail and slow down the system. [sent-32, score-0.162]
11 - Moved to a virtualized storage architecture where the entire SAN is treated as one big pool of storage capacity, without requiring that specific disks be dedicated to serving specific applications. [sent-39, score-0.295]
12 So you have Profile1, Profile2 all the way up to Profile300 as they have 300 million users (see the partition-routing sketch below). [sent-48, score-0.242]
13 Doesn't use ASP cache because they don't have a high enough hit rate on the front-end. [sent-49, score-0.188]
14 So if the database is slow only those threads will slow down and the traffic in the other threads will flow (a per-database thread-pool sketch follows this table). [sent-54, score-0.528]
15 Can right-click on a problem server and get stack dump of the . [sent-59, score-0.188]
16 Troubleshooting is easier because you can see 90 threads are blocked on a database so the database may be down (a thread-dump rollup sketch follows this table). [sent-64, score-0.465]
17 Developed their own asynchronous communication technology to get around Windows networking problems and treat servers as a group (see the fan-out sketch below). [sent-80, score-0.275]
18 Legitimate users and hackers will run into corner cases that weren't hit in testing, though QA will find most of the problems. [sent-96, score-0.23]
19 The example is they add a new database server for every million users. [sent-99, score-0.379]
20 It might be more efficient to change their approach to make better use of the database hardware, but it's easier just to add servers (a capacity rule-of-thumb sketch follows this table). [sent-100, score-0.212]
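A minimal sketch of the vertical-partitioning idea described in the table above: each site function (log-in, profiles, blogs) gets its own database, and adding a feature means registering another database rather than growing an existing schema. The feature names, connection strings, and helper here are assumptions for illustration; the post does not show MySpace's code.

```python
# Sketch of vertical partitioning: one database per site function.
# Feature names and connection strings are illustrative, not MySpace's.
FUNCTION_DATABASES = {
    "login":    "Server=db-login;Database=Login",
    "profiles": "Server=db-profiles;Database=Profiles",
    "blogs":    "Server=db-blogs;Database=Blogs",
}

def connection_for(feature: str) -> str:
    """Return the connection string for the database that owns this feature."""
    try:
        return FUNCTION_DATABASES[feature]
    except KeyError:
        raise ValueError(f"no database registered for feature: {feature}")

# "When users demanded a new feature, MySpace would put a new database online":
FUNCTION_DATABASES["mail"] = "Server=db-mail;Database=Mail"
```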
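A sketch of the user-keyed horizontal partitioning behind Profile1 through Profile300. Only the partition count and naming come from the text; the modulo routing is an assumed stand-in for whatever keying MySpace actually used.

```python
# Sketch: key the partition on the account so every user lands in exactly one
# of the ProfileN databases. The modulo mapping is an assumption for illustration.
PARTITION_COUNT = 300

def profile_database(user_id: int) -> str:
    """Map a user id onto one of Profile1..Profile300."""
    return f"Profile{(user_id % PARTITION_COUNT) + 1}"

print(profile_database(12_345_678))  # -> "Profile79"
```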
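The point that a slow database only slows its own threads reads like the bulkhead pattern: a bounded worker pool per backend. A minimal sketch under that assumption; the pool names and sizes are invented.

```python
# Sketch of per-database worker pools: a slow backend can only exhaust the
# workers dedicated to it, so requests for other databases keep flowing.
from concurrent.futures import ThreadPoolExecutor

POOLS = {
    "ProfileDB": ThreadPoolExecutor(max_workers=20),  # sizes are illustrative
    "BlogDB":    ThreadPoolExecutor(max_workers=20),
}

def run_query(database: str, query_fn, *args):
    """Submit query_fn to the pool owned by `database`; other pools are unaffected."""
    return POOLS[database].submit(query_fn, *args)
```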
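A small sketch of the kind of thread-dump rollup the troubleshooting point describes, grouping blocked threads by the resource they wait on so "90 threads blocked on one database" stands out at a glance. The dump format is invented for the example.

```python
# Sketch: summarize a captured thread dump by blocking resource.
from collections import Counter

def blocked_summary(threads):
    """threads: iterable of dicts like {'state': 'BLOCKED', 'resource': 'ProfileDB-7'}."""
    return Counter(t["resource"] for t in threads if t["state"] == "BLOCKED")

dump = ([{"state": "BLOCKED", "resource": "ProfileDB-7"}] * 90
        + [{"state": "RUNNING", "resource": None}] * 10)
print(blocked_summary(dump))  # Counter({'ProfileDB-7': 90}) -> that database is likely down
```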
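The asynchronous group-communication point, sketched as an asyncio fan-out. The post does not describe MySpace's actual protocol, so the port, hostnames, and message shape here are assumptions.

```python
# Sketch: fan a command out to a group of servers concurrently and gather the
# replies, so no single slow or dead node blocks the rest.
import asyncio

async def send_command(host: str, command: str) -> str:
    reader, writer = await asyncio.open_connection(host, 9000)  # port is invented
    writer.write(command.encode() + b"\n")
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    return reply.decode().strip()

async def broadcast(hosts, command):
    """Issue the command to every host at once; exceptions come back as results."""
    return await asyncio.gather(*(send_command(h, command) for h in hosts),
                                return_exceptions=True)

# asyncio.run(broadcast(["web-0001", "web-0002"], "collect-stats"))
```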
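Finally, the "new database server for every million users" rule of thumb as a trivial capacity calculation; only the ratio comes from the text.

```python
# Sketch: "one database server per million users" as a capacity rule of thumb.
import math

USERS_PER_DB_SERVER = 1_000_000  # ratio taken from the text

def db_servers_needed(total_users: int) -> int:
    return math.ceil(total_users / USERS_PER_DB_SERVER)

print(db_servers_needed(300_000_000))  # -> 300, matching Profile1..Profile300
```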
wordName wordTfidf (topN-words)
[('myspace', 0.385), ('million', 0.242), ('vertical', 0.201), ('dump', 0.188), ('san', 0.186), ('windows', 0.157), ('accounts', 0.147), ('database', 0.137), ('users', 0.127), ('partitioning', 0.119), ('servers', 0.118), ('threads', 0.116), ('scenes', 0.105), ('hit', 0.103), ('pushes', 0.095), ('ship', 0.092), ('slow', 0.092), ('moved', 0.091), ('qa', 0.086), ('cache', 0.085), ('virtualized', 0.083), ('objects', 0.08), ('site', 0.078), ('easier', 0.075), ('milestones', 0.075), ('contentions', 0.075), ('criticized', 0.075), ('storage', 0.074), ('web', 0.073), ('debugger', 0.07), ('reorganization', 0.07), ('replications', 0.07), ('databases', 0.07), ('user', 0.069), ('learnedyou', 0.067), ('embraces', 0.067), ('alike', 0.067), ('slowdown', 0.067), ('connect', 0.067), ('registering', 0.065), ('transitory', 0.065), ('sql', 0.064), ('pool', 0.064), ('partition', 0.063), ('asp', 0.063), ('vip', 0.063), ('keyed', 0.063), ('limits', 0.062), ('base', 0.062), ('migrates', 0.061)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 511 high scalability-2009-02-12-MySpace Architecture
2 0.35559762 1011 high scalability-2011-03-25-Did the Microsoft Stack Kill MySpace?
Introduction: Robert Scoble wrote a fascinating case study, MySpace’s death spiral: insiders say it’s due to bets on Los Angeles and Microsoft , where he reports MySpace insiders blame the Microsoft stack for why they lost the great social network race to Facebook. Does anyone know if this is true? What's the real story? I was wondering because it doesn't seem to track with the MySpace Architecture post that I did in 2009, where they seem happy with their choices and had stats to back up their improvements. Why this matters is it's a fascinating model for startups to learn from. What does it really take to succeed? Is it the people or the stack? Is it the organization or the technology? Is it the process or the competition? Is the quality of the site or the love of the users? So much to consider and learn from. Some conjectures from the article: Myspace didn't have programming talent capable of scaling the site to compete with Facebook. Choosing the Microsoft stack made it difficul
3 0.28619975 788 high scalability-2010-03-04-How MySpace Tested Their Live Site with 1 Million Concurrent Users
Introduction: This is a guest post by Dan Bartow, VP of SOASTA , talking about how they pelted MySpace with 1 million concurrent users using 800 EC2 instances. I thought this was an interesting story because: that's a lot of users, it takes big cajones to test your live site like that, and not everything worked out quite as expected. I'd like to thank Dan for taking the time to write and share this article. In December of 2009 MySpace launched a new wave of streaming music video offerings in New Zealand, building on the previous success of MySpace music. These new features included the ability to watch music videos, search for artist’s videos, create lists of favorites, and more. The anticipated load increase from a feature like this on a popular site like MySpace is huge, and they wanted to test these features before making them live. If you manage the infrastructure that sits behind a high traffic application you don’t want any surprises. You want to understand your breakin
4 0.23184331 1014 high scalability-2011-03-31-8 Lessons We Can Learn from the MySpace Incident - Balance, Vision, Fearlessness
Introduction: A surprising amount of heat and light was generated by the whole Microsoft vs MySpace discussion. Why people feel so passionate about this I'm not quite sure, but fortunately for us, in the best sense of the web, it generated an amazing number of insightful comments and observations. If we stand back and take a look at the whole incident, what can we take away that might help us in the future? All computer companies are technology companies first. A repeated theme was that you can't be an entertainment company first. You are a technology company providing entertainment using technology. The tech can inform the entertainment side, the entertainment side drives features, but they really can't be separated. An awesome stack that does nothing is useless. A great idea on a poor stack is just as useless. There's a difficult balance that must be achieved and both management and developers must be aware that there's something to balance. All pigs are equal. All business f
5 0.20285845 638 high scalability-2009-06-26-PlentyOfFish Architecture
Introduction: Update 5 : PlentyOfFish Update - 6 Billion Pageviews And 32 Billion Images A Month Update 4 : Jeff Atwood costs out Markus' scale up approach against a scale out approach and finds scale up wanting. The discussion in the comments is as interesting as the article. My guess is Markus doesn't want to rewrite his software to work across a scale out cluster so even if it's more expensive scale up works better for his needs. Update 3 : POF now has 200 million images and serves 10,000 images per second. They'll be moving to a 250,000 IOPS RamSan to handle the load. Also upgraded to a core database machine with 512 GB of RAM, 32 CPU’s, SQLServer 2008 and Windows 2008. Update 2 : This seems to be a POF Peer1 love fest infomercial. It's pretty content free, but the production values are high. Lots of quirky sounds and fish swimming on the screen. Update : by Facebook standards Read/WriteWeb says POF is worth a cool one billion dollars. It helps to talk like Dr. Evil whe
6 0.20080183 1361 high scalability-2012-11-22-Gone Fishin': PlentyOfFish Architecture
7 0.18860771 821 high scalability-2010-05-03-MocoSpace Architecture - 3 Billion Mobile Page Views a Month
9 0.16118284 18 high scalability-2007-07-16-Paper: MySQL Scale-Out by application partitioning
10 0.15914819 331 high scalability-2008-05-27-eBay Architecture
11 0.15907681 70 high scalability-2007-08-22-How many machines do you need to run your site?
12 0.15751454 5 high scalability-2007-07-10-mixi.jp Architecture
13 0.15659003 538 high scalability-2009-03-16-Are Cloud Based Memory Architectures the Next Big Thing?
14 0.15524779 1521 high scalability-2013-09-23-Salesforce Architecture - How they Handle 1.3 Billion Transactions a Day
15 0.15492852 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App
16 0.15385483 274 high scalability-2008-03-12-YouTube Architecture
17 0.15315601 106 high scalability-2007-10-02-Secrets to Fotolog's Scaling Success
19 0.15171607 1131 high scalability-2011-10-24-StackExchange Architecture Updates - Running Smoothly, Amazon 4x More Expensive
20 0.14850008 1501 high scalability-2013-08-13-In Memoriam: Lavabit Architecture - Creating a Scalable Email Service
topicId topicWeight
[(0, 0.299), (1, 0.114), (2, -0.018), (3, -0.194), (4, 0.036), (5, 0.0), (6, 0.001), (7, -0.022), (8, -0.041), (9, 0.021), (10, -0.015), (11, -0.029), (12, -0.013), (13, 0.085), (14, -0.029), (15, 0.048), (16, -0.003), (17, 0.019), (18, -0.001), (19, 0.07), (20, 0.019), (21, 0.019), (22, -0.011), (23, -0.026), (24, 0.045), (25, -0.079), (26, -0.028), (27, 0.003), (28, -0.004), (29, -0.014), (30, 0.035), (31, 0.016), (32, -0.059), (33, -0.075), (34, 0.032), (35, 0.056), (36, -0.064), (37, 0.061), (38, 0.102), (39, 0.127), (40, -0.02), (41, -0.078), (42, -0.04), (43, 0.03), (44, 0.01), (45, 0.012), (46, 0.061), (47, -0.027), (48, -0.06), (49, 0.14)]
simIndex simValue blogId blogTitle
same-blog 1 0.97179818 511 high scalability-2009-02-12-MySpace Architecture
2 0.84147418 513 high scalability-2009-02-16-Handle 1 Billion Events Per Day Using a Memory Grid
Introduction: Moshe Kaplan of RockeTier shows the life cycle of an affiliate marketing system that starts off as a cub handling one million events per day and ends up a lion handling 200 million to even one billion events per day. The resulting system uses ten commodity servers at a cost of $35,000. Mr. Kaplan's paper is especially interesting because it documents a system architecture evolution we may see a lot more of in the future: database centric --> cache centric --> memory grid . As scaling and performance requirements for complicated operations increase, leaving the entire system in memory starts to make a great deal of sense. Why use cache at all? Why shouldn't your system be all in memory from the start? General Approach to Evolving the System to Scale Analyze the system architecture and the main business processes. Detect the main hardware bottlenecks and the related business process causing them. Focus efforts on points of greatest return. Rate the bottlenecks by importance
3 0.80874145 821 high scalability-2010-05-03-MocoSpace Architecture - 3 Billion Mobile Page Views a Month
Introduction: This is a guest post by Jamie Hall, Co-founder & CTO of MocoSpace , describing the architecture for their mobile social network. This is a timely architecture to learn from as it combines several hot trends: it is very large, mobile, and social. What they think is especially cool about their system is: how it optimizes for device/browser fragmentation on the mobile Web; their multi-tiered, read/write, local/distributed caching system; selecting PostgreSQL over MySQL as a relational DB that can scale. MocoSpace is a mobile social network, with 12 million members and 3 billion page views a month, which makes it one of the most highly trafficked mobile Websites in the US. Members access the site mainly from their mobile phone Web browser, ranging from high end smartphones to lower end devices, as well as the Web. Activities on the site include customizing profiles, chat, instant messaging, music, sharing photos & videos, games, eCards and blogs. The monetization strategy is focused on
4 0.80340481 928 high scalability-2010-10-26-Scaling DISQUS to 75 Million Comments and 17,000 RPS
Introduction: This presentation and video by Jason Yan and David Cramer discuss how they scaled DISQUS, a comments-as-a-service platform for easily adding comments to your site and connecting communities. The presentation is very good, so here are just a few highlights: Traffic : 17,000 requests/second peak; 450,000 websites; 15 million profiles; 75 million comments; 250 million visitors; 40 million monthly users / developer. Forces : unpredictable traffic patterns because of celebrity gossip and events like disasters; discussions never expire which means they can't fit in memory; must always be up. Machines : 100 servers; 30% web servers (Apache + mod_wsgi); 10% databases (PostgreSQL); 25% cache servers (memcached); 20% load balancing / high availability (HAProxy + heartbeat); 15% Utility servers (Python scripts). Architecture : Requests are load balanced across an Apache cluster. Apache talks to memcached, HAProxy/pgbouncer to handle connection pooling to the database, and a ce
5 0.79586715 106 high scalability-2007-10-02-Secrets to Fotolog's Scaling Success
Introduction: Fotolog, a social blogging site centered around photos, grew from about 300 thousand users in 2004 to over 11 million users in 2007. Though they initially experienced the inevitable pains of rapid growth, they overcame their problems and now manage over 300 million photos, and 800,000 new photos are added each day. Generating all that fabulous content are 20 million unique monthly visitors and a volunteer army of 30,000 new users each day. They did so well a very impressed suitor bought them out for a cool $90 million. That's scale meets success by anyone's standards. How did they do it? Site: http://www.fotolog.com Information Sources Scaling the World's Largest Photo Blogging Community Congrats to Fotolog on $90mm sale to Hi-Media Fotolog overtaking Flickr? Fotolog Hits 11 Million Members and 300 Million Photos Posted Site of the Week: Fotolog.com by PC Magazine CEO John Borthwick's Blog. DBA Frank Mash's Blog Fotolog, lessons learnt by John B
6 0.78224546 998 high scalability-2011-03-03-Stack Overflow Architecture Update - Now at 95 Million Page Views a Month
7 0.78186512 52 high scalability-2007-08-01-Product: Memcached
8 0.78126609 638 high scalability-2009-06-26-PlentyOfFish Architecture
9 0.78056288 1361 high scalability-2012-11-22-Gone Fishin': PlentyOfFish Architecture
10 0.77777594 331 high scalability-2008-05-27-eBay Architecture
11 0.76193047 473 high scalability-2008-12-20-Second Life Architecture - The Grid
12 0.75972074 5 high scalability-2007-07-10-mixi.jp Architecture
13 0.75551647 1094 high scalability-2011-08-08-Tagged Architecture - Scaling to 100 Million Users, 1000 Servers, and 5 Billion Page Views
14 0.75458699 671 high scalability-2009-08-05-Stack Overflow Architecture
15 0.74689388 7 high scalability-2007-07-12-FeedBurner Architecture
16 0.74348342 72 high scalability-2007-08-22-Wikimedia architecture
18 0.73869801 903 high scalability-2010-09-17-Hot Scalability Links For Sep 17, 2010
19 0.73680753 1456 high scalability-2013-05-13-The Secret to 10 Million Concurrent Connections -The Kernel is the Problem, Not the Solution
20 0.73488134 150 high scalability-2007-11-12-Slashdot Architecture - How the Old Man of the Internet Learned to Scale
topicId topicWeight
[(1, 0.208), (2, 0.214), (10, 0.057), (18, 0.092), (30, 0.055), (47, 0.028), (61, 0.069), (79, 0.112), (85, 0.064), (94, 0.034)]
simIndex simValue blogId blogTitle
1 0.96239513 1344 high scalability-2012-10-19-Stuff The Internet Says On Scalability For October 19, 2012
Introduction: It's HighScalability Time: @davilagrau : Youtube, GitHub,..., Are cloud services facing a entropic limit to scalability? Async all the way down? The Tyranny of the Clock : The cost of logic and memory dominated Turing's thinking, but today, communication rather than logic should dominate our thinking. Clock-free design uses less than half, about 40%, as much energy per addition as its clocked counterpart. We can regain the efficiency of local decision making by revolting against the pervasive beat of an external clock. Why Google Compute Engine for OpenStack . Smart move. Having OpenStack work inside a super charged cloud, in private clouds, and as a bridge between the two ought to be quite attractive to developers looking for some sort of ally for independence. All it will take are a few victories to cement new alliances. 3 Lessons That Startups Can Learn From Facebook’s Failed Credits Experiment . I thought this was a great idea too. So what happened? FACEBOOK DID NOT
same-blog 2 0.95806116 511 high scalability-2009-02-12-MySpace Architecture
Introduction: This is a guest post by Gordon Worley , a Software Engineer at Korrelate , where they correlate (see what they did there) online purchases to offline purchases. Several weeks ago, we came into the office one morning to find every server alarm going off. Pixel log processing was behind by 8 hours and not making headway. Checking the logs, we discovered that a big client had come online during the night and was giving us 10 times more traffic than we were originally told to expect. I wouldn’t say we panicked, but the office was certainly more jittery than usual. Over the next several hours, though, thanks both to foresight and quick thinking, we were able to scale up to handle the added load and clear the backlog to return log processing to a steady state. At Korrelate, we deploy tracking pixels , also known beacons or web bugs, that our partners use to send us information about their users. These tiny web objects contain no visible content, but may include transparent 1 by 1 gif
4 0.95478994 1654 high scalability-2014-06-05-Cloud Architecture Revolution
Introduction: The introduction of cloud technologies is not a simple evolution of existing ones, but a real revolution. Like all revolutions, it changes the points of views and redefines all the meanings. Nothing is as before. This post wants to analyze some key words and concepts, usually used in traditional architectures, redefining them according the standpoint of the cloud. Understanding the meaning of new words is crucial to grasp the essence of a pure cloud architecture. << There is no greater impediment to the advancement of knowledge than the ambiguity of words. >> THOMAS REID, Essays on the Intellectual Powers of Man Nowadays, it is required to challenge the limits of traditional architectures that go beyond the normal concepts of scalability and support millions of users (What's Up 500 Million) billions of transactions per day (Salesforce 1.3 billion), five 9s of availability (99.999 AOL). I wish all of you the success of the examples cited above, but do not think that it is co
5 0.95175266 881 high scalability-2010-08-16-Scaling an AWS infrastructure - Tools and Patterns
Introduction: This is a guest post by Frédéric Faure (architect at Ysance ), you can follow him on twitter . How do you scale an AWS (Amazon Web Services) infrastructure? This article will give you a detailed reply in two parts: the tools you can use to make the most of Amazon’s dynamic approach, and the architectural model you should adopt for a scalable infrastructure. I base my report on my experience gained in several AWS production projects in casual gaming (Facebook), e-commerce infrastructures and within the mainstream GIS (Geographic Information System). It’s true that my experience in gaming ( IsCool, The Game ) is currently the most representative in terms of scalability, due to the number of users (over 800 thousand DAU – daily active users – at peak usage and over 20 million page views every day), however my experiences in e-commerce and GIS (currently underway) provide a different view of scalability, taking into account the various problems of availability and da
6 0.94941574 1557 high scalability-2013-12-02-Evolution of Bazaarvoice’s Architecture to 500M Unique Users Per Month
7 0.94843566 304 high scalability-2008-04-19-How to build a real-time analytics system?
9 0.94645333 1093 high scalability-2011-08-05-Stuff The Internet Says On Scalability For August 5, 2011
10 0.94643497 1553 high scalability-2013-11-25-How To Make an Infinitely Scalable Relational Database Management System (RDBMS)
12 0.94495785 1011 high scalability-2011-03-25-Did the Microsoft Stack Kill MySpace?
13 0.9444291 906 high scalability-2010-09-22-Applying Scalability Patterns to Infrastructure Architecture
14 0.94422185 857 high scalability-2010-07-13-DbShards Part Deux - The Internals
15 0.9434815 233 high scalability-2008-01-30-How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data
16 0.94293463 1197 high scalability-2012-02-21-Pixable Architecture - Crawling, Analyzing, and Ranking 20 Million Photos a Day
17 0.94278014 139 high scalability-2007-10-30-Paper: Dynamo: Amazon’s Highly Available Key-value Store
18 0.94226032 1090 high scalability-2011-08-01-Peecho Architecture - scalability on a shoestring
19 0.94110239 325 high scalability-2008-05-25-How do you explain cloud computing to your grandma?