high_scalability high_scalability-2008 high_scalability-2008-331 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Update 2: eBay's Randy Shoup spills the secrets of how to service hundreds of millions of users and over two billion page views a day in Scalability Best Practices: Lessons from eBay on InfoQ. The practices: Partition by Function, Split Horizontally, Avoid Distributed Transactions, Decouple Functions Asynchronously, Move Processing To Asynchronous Flows, Virtualize At All Levels, Cache Appropriately. Update: eBay Serves 5 Billion API Calls Each Month. Aren't we seeing more and more traffic driven by mashups composed on top of open APIs? APIs are no longer a bolt-on; they are your application. Architecturally, that argues for implementing your own application around the same APIs developers and users employ. Who hasn't wondered how eBay does their business? As one of the largest, most heavily loaded websites in the world, it can't be easy. And the subtitle of the presentation hints at how creating such a monster system requires true engineering: Striking a balance between site stabilit
sentIndex sentText sentNum sentScore
1 Update 2: eBay's Randy Shoup spills the secrets of how to service hundreds of millions of users and over two billion page views a day in Scalability Best Practices: Lessons from eBay on InfoQ. [sent-1, score-0.25]
2 And the subtitle of the presentation hints at how creating such a monster system requires true engineering: Striking a balance between site stability, feature velocity, performance, and cost. [sent-9, score-0.438]
3 com Information Sources The eBay Architecture - Striking a balance between site stability, feature velocity, performance, and cost. [sent-12, score-0.204]
4 Podcast: eBay’s Transactions on a Massive Scale Dan Pritchett on Architecture at eBay interview by InfoQ Platform Java Oracle WebSphere, servlets Horizontal Scaling Sharding Mix of Windows and Unix What's Inside? [sent-13, score-0.13]
5 This information was adapted from Johannes Ernst's Blog The Stats On an average day, it runs through 26 billion SQL queries and keeps tabs on 100 million items available for purchase. [sent-14, score-0.144]
6 212 million registered users, 1 billion photos, 1 billion page views a day, 105 million listings, 2 petabytes of data, 3 billion API calls a month. Something like a factor of 35 in page views, e-mails sent, and bandwidth from June 1999 to Q3/2006. [sent-15, score-0.538]
7 94% availability, measured as "all parts of site functional to everybody" vs. [sent-17, score-0.245]
8 at least one part of a site not functional to some users somewhere The database is virtualized and spans 600 production instances residing in more than 100 server clusters. [sent-18, score-0.332]
9 Architecture is strictly divided into layers: data tier, application tier, search, operations. Leverages the MSXML framework for the presentation layer (even in Java). Oracle databases, WebSphere Java (still 1. [sent-26, score-0.185]
10 1) Split databases by primary access path, modulo on a key. [sent-28, score-0.182]
11 Distributed over 8 data centers. Some database copies run 15 min behind, 4 hours behind. Databases are segmented by function: user, item account, feedback, transaction, over 70 in all. [sent-30, score-0.187]
12 CPU-intensive work is moved out of the database layer to the application layer: referential integrity, joins, and sorting are done in the application layer! [sent-33, score-0.271]
13 Average item on site changes its search data 5 times before it is sold (e. [sent-41, score-0.347]
14 Uses reliable multicast from primary database to search nodes, in-memory search index, horizontal segmentation, N slices, load-balances over M instances, cache queries. [sent-46, score-0.546]
15 Move work out of the database into the applications because the database is the bottleneck. [sent-55, score-0.174]
16 You need to be able to change, refine, and develop your new system while keeping your existing site running. [sent-69, score-0.201]
17 It's a mistake to worry too much about scalability from the start. [sent-71, score-0.181]
18 It's also a mistake not to worry about scalability at all. [sent-73, score-0.181]
19 Don't let people and organizations be why your site fails. [sent-78, score-0.129]
20 A good system is developed over time in response to real issues and concerns. [sent-81, score-0.155]
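Sentence 10 in the table above mentions splitting databases by primary access path using a modulo on a key. The following is a minimal sketch of that idea, not eBay's actual implementation: the function names `shard_for` and `group_by_shard`, the integer user IDs, and the shard count are all hypothetical and chosen only for illustration. Non-numeric keys would first need a stable hash, which is not shown here.

```python
from typing import Dict, Iterable, List


def shard_for(key: int, num_shards: int) -> int:
    """Return the shard index for an integer key using key modulo shard count.

    Hypothetical helper: assumes keys are non-negative integers such as user IDs.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    return key % num_shards


def group_by_shard(keys: Iterable[int], num_shards: int) -> Dict[int, List[int]]:
    """Group keys by the shard that would hold them, e.g. to batch queries per shard."""
    groups: Dict[int, List[int]] = {}
    for key in keys:
        groups.setdefault(shard_for(key, num_shards), []).append(key)
    return groups


if __name__ == "__main__":
    # Illustrative only: route five made-up user IDs across two shards.
    print(group_by_shard([1, 2, 3, 4, 5], num_shards=2))
```

One consequence of plain modulo routing is that changing the number of shards remaps most keys, so a scheme like this is usually paired with a planned way to move data when the shard count grows.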
wordName wordTfidf (topN-words)
[('ebay', 0.398), ('billion', 0.144), ('striking', 0.141), ('servlets', 0.13), ('site', 0.129), ('horizontal', 0.124), ('search', 0.118), ('functional', 0.116), ('layer', 0.112), ('apis', 0.107), ('views', 0.106), ('mistake', 0.104), ('item', 0.1), ('primary', 0.099), ('velocity', 0.099), ('evolve', 0.096), ('emulate', 0.089), ('feeder', 0.089), ('johannes', 0.089), ('subtitle', 0.089), ('voyager', 0.089), ('database', 0.087), ('overtime', 0.083), ('bolt', 0.083), ('modulo', 0.083), ('paralysis', 0.083), ('segmentation', 0.083), ('shoup', 0.083), ('joins', 0.083), ('stability', 0.08), ('compelled', 0.079), ('pritchett', 0.079), ('refine', 0.079), ('tier', 0.078), ('worry', 0.077), ('sourcesthe', 0.077), ('architecturally', 0.077), ('mashups', 0.077), ('virtualize', 0.077), ('split', 0.076), ('practices', 0.076), ('balance', 0.075), ('function', 0.075), ('presentation', 0.073), ('system', 0.072), ('referential', 0.072), ('randy', 0.072), ('websphere', 0.07), ('layering', 0.07), ('listings', 0.07)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000004 331 high scalability-2008-05-27-eBay Architecture
2 0.24436542 742 high scalability-2009-11-17-10 eBay Secrets for Planet Wide Scaling
Introduction: You don't even have to make a bid, Randy Shoup, an eBay Distinguished Architect, gives this presentation on how eBay scales, for free. Randy has done a fabulous job in this presentation and in other talks listed at the end of this post getting at the heart of the principles behind scalability. It's more about ideas of how things work and fit together than about a particular technology stack. Impressive Stats In case you weren't sure, eBay is big, with lots of: users, data, features, and change... Over 89 million active users worldwide 190 million items for sale in 50,000 categories Over 8 billion URL requests per day Hundreds of new features per quarter Roughly 10% of items are listed or ended every day In 39 countries and 10 languages 24x7x365 70 billion read / write operations / day Processes 50TB of new, incremental data per day Analyzes 50PB of data per day 10 Lessons The presentation does a good job explaining each lesson, but the list is.
3 0.20616128 425 high scalability-2008-10-22-Scalability Best Practices: Lessons from eBay
Introduction: At eBay, one of the primary architectural forces we contend with every day is scalability. It colors and drives every architectural and design decision we make. With hundreds of millions of users worldwide, over two billion page views a day, and petabytes of data in our systems, this is not a choice - it is a necessity. In a scalable architecture, resource usage should increase linearly (or better) with load, where load may be measured in user traffic, data volume, etc. Where performance is about the resource usage associated with a single unit of work, scalability is about how resource usage changes as units of work grow in number or size. Said another way, scalability is the shape of the price-performance curve, as opposed to its value at one point in that curve. There are many facets to scalability - transactional, operational, development effort. In this article, I will outline several of the key best practices we have learned over time to scale the transactional th
4 0.20013702 550 high scalability-2009-03-30-Ebay history and architecture
Introduction: eBay [1] started in 1995 under the name AuctionWeb (V1): - very simple architecture - based on Perl - no database; for data persistence they used plain files. Because of rapid growth they needed to improve their architecture, and so V2 (clever name) was born: - replaced Perl with C/C++ - started using a database in a master-slave configuration - C++ back-end - XSLT front-end. Any request leads to an XML file being created in C++, and the XSLT processor transforms that into HTML. *pretty sophisticated architecture for the 90s, XSLT was cutting-edge back then* That held up pretty well for a while, but in the late 90s eBay experienced exponential growth. They started having some trouble with outages and needed improvements, so V3 was developed: - based on Java - search engine still used C++ - proof that relational databases can scale (aggressive caching) - developed a messaging layer for making a lot of asynchronous calls, they a
5 0.15944204 65 high scalability-2007-08-16-Scaling Secret #2: Denormalizing Your Way to Speed and Profit
Introduction: Alan Watts once observed how after we accepted Descartes' separation of the mind and body we've been trying to smash them back together again ever since when really they were never separate to begin with. The database normalization-denormalization dualism has the same mobius shaped reverberations as Descartes' error. We separate data into a million jagged little pieces and then spend all our time stooping over, picking them up, and joining them back together again. Normalization has been standard practice now for decades. But times are changing. Many mega-website architects are concluding Watts was right: the data was never separate to begin with. And even more radical, we may even need to store multiple copies of data. Information Sources Normalization Is for Sissies by Pat Helland Data normalization, is it really that good? by Arnon Rotem-Gal-Oz When Not to Normalize your SQL Database by Dare Obasanjo MegaData by Joe Gregorio Audio
6 0.15914819 511 high scalability-2009-02-12-MySpace Architecture
7 0.15217504 1068 high scalability-2011-06-27-TripAdvisor Architecture - 40M Visitors, 200M Dynamic Page Views, 30TB Data
10 0.13873592 538 high scalability-2009-03-16-Are Cloud Based Memory Architectures the Next Big Thing?
11 0.13791978 187 high scalability-2007-12-14-The Current Pros and Cons List for SimpleDB
12 0.13485931 1158 high scalability-2011-12-16-Stuff The Internet Says On Scalability For December 16, 2011
13 0.13250238 638 high scalability-2009-06-26-PlentyOfFish Architecture
14 0.13152836 1361 high scalability-2012-11-22-Gone Fishin': PlentyOfFish Architecture
15 0.12899753 459 high scalability-2008-12-03-Java World Interview on Scalability and Other Java Scalability Secrets
16 0.12876558 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App
17 0.12745896 821 high scalability-2010-05-03-MocoSpace Architecture - 3 Billion Mobile Page Views a Month
18 0.12672217 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT
19 0.12604415 332 high scalability-2008-05-28-Job queue and search engine
20 0.1257059 1521 high scalability-2013-09-23-Salesforce Architecture - How they Handle 1.3 Billion Transactions a Day
topicId topicWeight
[(0, 0.264), (1, 0.092), (2, -0.023), (3, -0.081), (4, 0.025), (5, 0.004), (6, -0.015), (7, -0.024), (8, -0.009), (9, 0.055), (10, -0.03), (11, 0.041), (12, -0.03), (13, 0.05), (14, 0.007), (15, -0.012), (16, -0.01), (17, -0.03), (18, 0.07), (19, 0.048), (20, 0.036), (21, -0.008), (22, 0.032), (23, -0.07), (24, -0.066), (25, -0.069), (26, -0.09), (27, 0.008), (28, 0.061), (29, 0.057), (30, 0.002), (31, 0.057), (32, -0.033), (33, -0.062), (34, 0.037), (35, 0.004), (36, -0.073), (37, -0.023), (38, -0.019), (39, 0.02), (40, 0.044), (41, -0.068), (42, -0.066), (43, 0.066), (44, -0.037), (45, 0.014), (46, 0.061), (47, 0.038), (48, -0.11), (49, 0.062)]
simIndex simValue blogId blogTitle
same-blog 1 0.97531784 331 high scalability-2008-05-27-eBay Architecture
2 0.76990503 550 high scalability-2009-03-30-Ebay history and architecture
Introduction: eBay [1] started in 1995 under the name AuctionWeb (V1): - very simple architecture - based on Perl - no database; for data persistence they used plain files. Because of rapid growth they needed to improve their architecture, and so V2 (clever name) was born: - replaced Perl with C/C++ - started using a database in a master-slave configuration - C++ back-end - XSLT front-end. Any request leads to an XML file being created in C++, and the XSLT processor transforms that into HTML. *pretty sophisticated architecture for the 90s, XSLT was cutting-edge back then* That held up pretty well for a while, but in the late 90s eBay experienced exponential growth. They started having some trouble with outages and needed improvements, so V3 was developed: - based on Java - search engine still used C++ - proof that relational databases can scale (aggressive caching) - developed a messaging layer for making a lot of asynchronous calls, they a
3 0.751867 511 high scalability-2009-02-12-MySpace Architecture
Introduction: Update: Presentation: Behind the Scenes at MySpace.com . Dan Farino, Chief Systems Architect at MySpace shares details of some of MySpace's cool internal operations tools. MySpace.com is one of the fastest growing sites on the Internet with 65 million subscribers and 260,000 new users registering each day. Often criticized for poor performance, MySpace has had to tackle scalability issues few other sites have faced. How did they do it? Site: http://myspace.com Information Sources Presentation: Behind the Scenes at MySpace.com Inside MySpace.com Platform ASP.NET 2.0 Windows IIS SQL Server What's Inside? 300 million users. Pushes 100 gigabits/second to the internet. 10Gb/sec is HTML content. 4,500+ web servers running Windows 2003/IIS 6.0/ASP.NET. 1,200+ cache servers running 64-bit Windows 2003. 16GB of objects cached in RAM. 500+ database servers running 64-bit Windows and SQL Server 2005. MySpace processes 1.5 Billion page views per day and
4 0.74937946 106 high scalability-2007-10-02-Secrets to Fotolog's Scaling Success
Introduction: Fotolog, a social blogging site centered around photos, grew from about 300 thousand users in 2004 to over 11 million users in 2007. Though they initially experienced the inevitable pains of rapid growth, they overcame their problems and now manage over 300 million photos and 800,000 new photos are added each day. Generating all that fabulous content are 20 million unique monthly visitors and a volunteer army of 30,000 new users each day. They did so well a very impressed suitor bought them out for a cool $90 million. That's scale meets success by anyone's standards. How did they do it? Site: http://www.fotolog.com Information Sources Scaling the World's Largest Photo Blogging Community Congrats to Fotolog on $90mm sale to Hi-Media Fotolog overtaking Flickr? Fotolog Hits 11 Million Members and 300 Million Photos Posted Site of the Week: Fotolog.com by PC Magazine CEO John Borthwick's Blog . DBA Frank Mash's Blog Fotolog, lessons learnt by John B
5 0.74372238 821 high scalability-2010-05-03-MocoSpace Architecture - 3 Billion Mobile Page Views a Month
Introduction: This is a guest post by Jamie Hall, Co-founder & CTO of MocoSpace , describing the architecture for their mobile social network. This is a timely architecture to learn from as it combines several hot trends: it is very large, mobile, and social. What they think is especially cool about their system is: how it optimizes for device/browser fragmentation on the mobile Web; their multi-tiered, read/write, local/distributed caching system; selecting PostgreSQL over MySQL as a relational DB that can scale. MocoSpace is a mobile social network, with 12 million members and 3 billion page views a month, which makes it one of the most highly trafficked mobile Websites in the US. Members access the site mainly from their mobile phone Web browser, ranging from high end smartphones to lower end devices, as well as the Web. Activities on the site include customizing profiles, chat, instant messaging, music, sharing photos & videos, games, eCards and blogs. The monetization strategy is focused on
6 0.74205983 903 high scalability-2010-09-17-Hot Scalability Links For Sep 17, 2010
8 0.73736936 742 high scalability-2009-11-17-10 eBay Secrets for Planet Wide Scaling
9 0.72276646 425 high scalability-2008-10-22-Scalability Best Practices: Lessons from eBay
10 0.7179175 235 high scalability-2008-02-02-The case against ORM Frameworks in High Scalability Architectures
11 0.71406335 258 high scalability-2008-02-24-Yandex Architecture
12 0.71272027 344 high scalability-2008-06-09-FaceStat's Rousing Tale of Scaling Woe and Wisdom Won
13 0.7105003 679 high scalability-2009-08-11-13 Scalability Best Practices
14 0.70444161 513 high scalability-2009-02-16-Handle 1 Billion Events Per Day Using a Memory Grid
15 0.6983059 965 high scalability-2010-12-29-Pinboard.in Architecture - Pay to Play to Keep a System Small
16 0.69671041 1271 high scalability-2012-06-25-StubHub Architecture: The Surprising Complexity Behind the World’s Largest Ticket Marketplace
17 0.68782592 384 high scalability-2008-09-16-EE-Appserver Clustering OR Terracota OR Coherence OR something else?
18 0.68780786 1361 high scalability-2012-11-22-Gone Fishin': PlentyOfFish Architecture
19 0.68753022 638 high scalability-2009-06-26-PlentyOfFish Architecture
20 0.68677908 900 high scalability-2010-09-11-Google's Colossus Makes Search Real-time by Dumping MapReduce
topicId topicWeight
[(1, 0.143), (2, 0.216), (10, 0.059), (30, 0.014), (32, 0.083), (61, 0.156), (77, 0.066), (79, 0.093), (85, 0.055), (91, 0.014), (94, 0.034)]
simIndex simValue blogId blogTitle
same-blog 1 0.96589553 331 high scalability-2008-05-27-eBay Architecture
2 0.94471681 998 high scalability-2011-03-03-Stack Overflow Architecture Update - Now at 95 Million Page Views a Month
Introduction: A lot has happened since my first article on the Stack Overflow Architecture . Contrary to the theme of that last article, which lavished attention on Stack Overflow's dedication to a scale-up strategy, Stack Overflow has both grown up and out in the last few years. Stack Overflow has grown up by more than doubling in size to over 16 million users and multiplying its number of page views nearly 6 times to 95 million page views a month. Stack Overflow has grown out by expanding into the Stack Exchange Network , which includes Stack Overflow, Server Fault, and Super User for a grand total of 43 different sites. That's a lot of fruitful multiplying going on. What hasn't changed is Stack Overflow's openness about what they are doing. And that's what prompted this update. A recent series of posts talks a lot about how they've been handling their growth: Stack Exchange’s Architecture in Bullet Points , Stack Overflow’s New York Data Center , Designing For Scalability of Manageme
3 0.94418889 569 high scalability-2009-04-14-Scalability resources
Introduction: I found these resources: High Scalable Architecture: - YouTube Architecture - Facebook Chat Architecture - Amazon Architecture Blogs: - Scalability Guidelines for building scalable software system (part 1) - Scalability Guidelines for building scalable software system (part 2) - Scalability Guidelines for building scalable software system (part 3) - Scalability Worst Practices - how to minimize load time for fast user experiences - Scalability principles - Challenges for Developing Enterprise Application on the Cloud - high-performance web page real-world examples netflix case study - Intro to Caching,Caching algorithms and caching frameworks part 1 - Amdahl’s law - How I Learned to Stop Worrying and Love Using a Lot of Disk Space to Scale - Top 25 Most Dangerous Programming Mistakes
4 0.94411892 856 high scalability-2010-07-12-Creating Scalable Digital Libraries
Introduction: Like many other media content providers, libraries and museums are increasingly moving their content onto the Web. While the move itself is no easy process (with digitization, web development, and training costs), being able to successfully deliver content to a wide audience is an ongoing concern, particularly for large libraries. Much of the concern is financial, as most libraries do not have the internal budget or outside investors that for-profit businesses enjoy. Even large university libraries will face serious budget constraints that even other university departments, such as science and technology would not face. Creating a scalable infrastructure and also distributing a large digital collection that can handle multiple requests, requires planning that many librarians have not even imagined. They must stop thinking in terms of "one-item-per-customer" and start thinking in terms of numerous users accessing the same information simultaneously. Content Delivery Network
5 0.94281042 106 high scalability-2007-10-02-Secrets to Fotolog's Scaling Success
Introduction: Fotolog, a social blogging site centered around photos, grew from about 300 thousand users in 2004 to over 11 million users in 2007. Though they initially experienced the inevitable pains of rapid growth, they overcame their problems and now manage over 300 million photos and 800,000 new photos are added each day. Generating all that fabulous content are 20 million unique monthly visitors and a volunteer army of 30,000 new users each day. They did so well a very impressed suitor bought them out for a cool $90 million. That's scale meets success by anyone's standards. How did they do it? Site: http://www.fotolog.com Information Sources Scaling the World's Largest Photo Blogging Community Congrats to Fotolog on $90mm sale to Hi-Media Fotolog overtaking Flickr? Fotolog Hits 11 Million Members and 300 Million Photos Posted Site of the Week: Fotolog.com by PC Magazine CEO John Borthwick's Blog . DBA Frank Mash's Blog Fotolog, lessons learnt by John B
6 0.94223666 383 high scalability-2008-09-10-Shard servers -- go big or small?
7 0.94190359 1147 high scalability-2011-11-25-Stuff The Internet Says On Scalability For November 25, 2011
8 0.94148666 931 high scalability-2010-10-28-Notes from A NOSQL Evening in Palo Alto
9 0.94078273 1089 high scalability-2011-07-29-Stuff The Internet Says On Scalability For July 29, 2011
10 0.93985188 1302 high scalability-2012-08-10-Stuff The Internet Says On Scalability For August 10, 2012
11 0.93901938 1289 high scalability-2012-07-23-State of the CDN: More Traffic, Stable Prices, More Products, Profits - Not So Much
12 0.93805712 152 high scalability-2007-11-13-Flickr Architecture
13 0.93763226 1228 high scalability-2012-04-16-Instagram Architecture Update: What’s new with Instagram?
14 0.9354943 1109 high scalability-2011-09-02-Stuff The Internet Says On Scalability For September 2, 2011
15 0.93532449 1567 high scalability-2013-12-20-Stuff The Internet Says On Scalability For December 20th, 2013
16 0.93524617 961 high scalability-2010-12-21-SQL + NoSQL = Yes !
17 0.93462205 1538 high scalability-2013-10-28-Design Decisions for Scaling Your High Traffic Feeds
18 0.93437481 1642 high scalability-2014-05-02-Stuff The Internet Says On Scalability For May 2nd, 2014
19 0.93423808 1461 high scalability-2013-05-20-The Tumblr Architecture Yahoo Bought for a Cool Billion Dollars
20 0.93419743 1020 high scalability-2011-04-12-Caching and Processing 2TB Mozilla Crash Reports in memory with Hazelcast