high_scalability high_scalability-2007 high_scalability-2007-6 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Friendster is one of the largest social network sites on the web. It emphasizes genuine friendships and the discovery of new people through friends.
Site: http://www.friendster.com/
Information Sources: Friendster - Scaling for 1 Billion Queries per day
Platform: MySQL, Perl, PHP, Linux, Apache
What's Inside?
- Dual x86-64 AMD Opterons with 8 GB of RAM
- Faster disk (SAN)
- Optimized indexes
- Traditional 3-tier architecture with hardware load balancer in front of the databases
- Clusters based on types: ad, app, photo, monitoring, DNS, gallery search DB, profile DB, user info DB, IM status cache, message DB, testimonial DB, friend DB, graph servers, gallery search, object cache.
Lessons Learned
- No persistent database connections.
- Removed all sorts.
- Optimized indexes
- Don’t go after the biggest problems first
- Optimize without downtime
- Split load
- Moved sorting query types into the application and added LIMITS.
- Reduced ranges R
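The lesson "Moved sorting query types into the application and added LIMITS" can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `messages` table, its columns, and the function names are hypothetical, and Python's standard `sqlite3` module stands in for the MySQL setup the post describes. It is not Friendster's actual code.

```python
import sqlite3


def fetch_messages_sorted_in_app(conn, user_id, limit):
    """Fetch a bounded, unsorted set of rows, then sort them in the application."""
    # No ORDER BY here: the database returns at most `limit` rows unsorted,
    # so it never has to sort the result set itself.
    rows = conn.execute(
        "SELECT id, sent_at, body FROM messages WHERE user_id = ? LIMIT ?",
        (user_id, limit),
    ).fetchall()
    # The sort cost is paid in the application tier, on a bounded list.
    return sorted(rows, key=lambda row: row[1], reverse=True)


def make_demo_db():
    """Build a small in-memory database with a hypothetical messages table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, sent_at INTEGER, body TEXT)"
    )
    conn.execute("CREATE INDEX idx_messages_user ON messages (user_id)")
    conn.executemany(
        "INSERT INTO messages (user_id, sent_at, body) VALUES (?, ?, ?)",
        [(1, 30, "c"), (1, 10, "a"), (2, 99, "x"), (1, 20, "b")],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    for row in fetch_messages_sorted_in_app(conn, user_id=1, limit=10):
        print(row)
    conn.close()
```

One caveat on this design: a LIMIT without ORDER BY returns an arbitrary subset when more rows match than the limit allows, so the approach only gives a true "most recent N" when the WHERE clause already bounds the range tightly enough.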
sentIndex sentText sentNum sentScore
1 Friendster is one of the largest social network sites on the web. [sent-1, score-0.07]
2 It emphasizes genuine friendships and the discovery of new people through friends. [sent-2, score-0.416]
3 Lessons Learned: No persistent database connections. [sent-7, score-0.089]
4 Optimized indexes. Don’t go after the biggest problems first. Optimize without downtime. Split load. Moved sorting query types into the application and added LIMITS. [sent-9, score-0.346]
wordName wordTfidf (topN-words)
[('db', 0.464), ('gallery', 0.351), ('learnedno', 0.196), ('opterons', 0.196), ('friendster', 0.175), ('emphasizes', 0.169), ('amd', 0.143), ('genuine', 0.143), ('sorting', 0.135), ('toward', 0.135), ('types', 0.135), ('im', 0.129), ('philosophy', 0.118), ('thin', 0.117), ('photo', 0.116), ('stateless', 0.115), ('discovery', 0.104), ('roll', 0.104), ('search', 0.104), ('dual', 0.1), ('friend', 0.099), ('improvement', 0.099), ('clean', 0.098), ('fails', 0.098), ('boxes', 0.098), ('cycle', 0.097), ('define', 0.097), ('buying', 0.094), ('centralized', 0.094), ('balancer', 0.093), ('benchmark', 0.09), ('persistent', 0.089), ('optimized', 0.089), ('change', 0.087), ('status', 0.087), ('ad', 0.086), ('profile', 0.085), ('gb', 0.085), ('san', 0.081), ('maintaining', 0.077), ('biggest', 0.076), ('message', 0.074), ('cheap', 0.074), ('primary', 0.073), ('plan', 0.073), ('inside', 0.072), ('session', 0.071), ('buy', 0.071), ('dns', 0.071), ('largest', 0.07)]
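The page does not say how the `wordTfidf` values above were computed. As a rough sketch only, a plain TF-IDF score (term frequency times log inverse document frequency) can be calculated as below. The function name and toy documents are hypothetical, and the real tool may use a different weighting or normalization, so these numbers will not reproduce the listed ones.

```python
import math
from collections import Counter


def tfidf_top_words(docs, doc_index, top_n=5):
    """Return the top_n (word, score) pairs for one document in a small corpus."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each word appears at least once.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    counts = Counter(tokenized[doc_index])
    total = sum(counts.values())
    scores = {}
    for word, count in counts.items():
        tf = count / total                         # relative frequency in this doc
        idf = math.log(n_docs / doc_freq[word])    # 0.0 if the word is in every doc
        scores[word] = tf * idf

    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    corpus = [
        "gallery db profile db",
        "feed db server",
        "switch server lan",
    ]
    for word, score in tfidf_top_words(corpus, 0):
        print(word, round(score, 3))
```

Words that are frequent in one post but rare across the corpus score highest, which is why post-specific terms tend to dominate lists like the one above.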
simIndex simValue blogId blogTitle
same-blog 1 1.0 6 high scalability-2007-07-11-Friendster Architecture
2 0.20069815 704 high scalability-2009-09-13-How is Berkely DB fare against other Key-Value Database
Introduction: I want to know how Berkeley DB compares against other key-value solutions. I read on the Net that Google uses it for their Enterprise Sign-on feature. Does anyone have any experience using Berkeley DB? Backward compatibility is poor in Berkeley DB, but that is fine for me. How easy is it to scale using Berkeley DB?
3 0.16242373 78 high scalability-2007-09-01-2 tier switch selection for colocation
Introduction: Hi, I am interested in some experienced advice for choosing switches for a colocated 2-tier architecture. I have the hardware chosen for the webservers, app servers, and db servers, but need some advice on the network switch in between: colocation port -> firewall (load balancer) -> 2+ web servers (app servers) -> gigabit switch -> DB server (possibly cluster for future expansion). The question is that I am just starting out, and I wonder which rackmount gigabit switch to select for the private LAN between the app server -> DB servers. Do I need managed for that? Cisco switches are the best, but they are the most expensive... I am looking at possibly using Dell/Netgear gigabit switches. Thanks for any input
4 0.16031559 7 high scalability-2007-07-12-FeedBurner Architecture
Introduction: FeedBurner is a news feed management provider launched in 2004. FeedBurner provides custom RSS feeds and management tools to bloggers, podcasters, and other web-based content publishers. Services provided to publishers include traffic analysis and an optional advertising system. Site: http://www.feedburner.com Information Sources FeedBurner - Scalable Web Applications using MySQL and Java What the Web’s most popular sites are running on Platform Java MySQL Hibernate Spring Tomcat Cacti Load balancing: NetScaler Application Switches Routers, switches: HP, Cisco DNS: bind The Stats FeedBurner is growing faster than MySpace and Digg with 385% traffic growth. Total feeds: 808,707, Number of publishers: 471,686. 11 million subscribers in 190 countries Scaling History - July 2004: 300Kbps, 5,600 feeds, 3 app servers, 3 web servers 2 DB servers, Round Robin DNS - April 2005: 5Mbps, 47,700 feeds, 6 app servers, 6 web servers (same mac
5 0.14050511 153 high scalability-2007-11-13-Friendster Lost Lead Because of a Failure to Scale
Introduction: Hey, this scaling stuff might just be important. Jim Scheinman, former Bebo and Friendster exec, puts the blame squarely on Friendster's inability to scale as why they lost the social networking race: VB : Can you tell me a bit about what you learned in your time at Friendster? JS : For me, it basically came down to failed execution on the technology side — we had millions of Friendster members begging us to get the site working faster so they could log in and spend hours social networking with their friends. I remember coming in to the office for months reading thousands of customer service emails telling us that if we didn’t get our site working better soon, they’d be ‘forced to join’ a new social networking site that had just launched called MySpace…the rest is history. To be fair to Friendster’s technology team at the time, they were on the forefront of many new scaling and database issues that web sites simply hadn’t had to deal with prior to Friendster. As is often
6 0.14021333 637 high scalability-2009-06-24-Habits of Highly Scalable Web Applications
7 0.12343079 1196 high scalability-2012-02-20-Berkeley DB Architecture - NoSQL Before NoSQL was Cool
8 0.11772085 821 high scalability-2010-05-03-MocoSpace Architecture - 3 Billion Mobile Page Views a Month
9 0.11655692 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App
10 0.10667431 157 high scalability-2007-11-16-Product: lbpool - Load Balancing JDBC Pool
11 0.10111016 1228 high scalability-2012-04-16-Instagram Architecture Update: What’s new with Instagram?
12 0.099348724 684 high scalability-2009-08-18-Real World Web: Performance & Scalability
13 0.098657861 1440 high scalability-2013-04-15-Scaling Pinterest - From 0 to 10s of Billions of Page Views a Month in Two Years
14 0.098364845 682 high scalability-2009-08-16-ThePort Network Architecture
15 0.096686184 392 high scalability-2008-09-24-Building a Scalable Architecture for Web Apps
16 0.096242279 99 high scalability-2007-09-23-HA for switches
17 0.096020371 142 high scalability-2007-11-05-Strategy: Diagonal Scaling - Don't Forget to Scale Out AND Up
18 0.095173702 106 high scalability-2007-10-02-Secrets to Fotolog's Scaling Success
19 0.092457473 1303 high scalability-2012-08-13-Ask HighScalability: Facing scaling issues with news feeds on Redis. Any advice?
20 0.090636894 248 high scalability-2008-02-13-What's your scalability plan?
topicId topicWeight
[(0, 0.142), (1, 0.06), (2, -0.025), (3, -0.093), (4, 0.017), (5, -0.001), (6, -0.012), (7, -0.022), (8, 0.003), (9, 0.016), (10, 0.017), (11, -0.006), (12, -0.006), (13, 0.026), (14, -0.034), (15, 0.028), (16, -0.026), (17, 0.067), (18, 0.083), (19, 0.078), (20, -0.012), (21, 0.002), (22, -0.008), (23, -0.036), (24, -0.006), (25, -0.017), (26, -0.022), (27, -0.034), (28, -0.03), (29, 0.037), (30, 0.029), (31, -0.071), (32, -0.027), (33, -0.052), (34, 0.054), (35, -0.012), (36, 0.022), (37, -0.033), (38, 0.035), (39, 0.043), (40, 0.046), (41, 0.009), (42, -0.019), (43, 0.044), (44, 0.018), (45, 0.072), (46, -0.069), (47, 0.039), (48, -0.081), (49, -0.043)]
simIndex simValue blogId blogTitle
same-blog 1 0.96960783 6 high scalability-2007-07-11-Friendster Architecture
2 0.7209785 7 high scalability-2007-07-12-FeedBurner Architecture
Introduction: FeedBurner is a news feed management provider launched in 2004. FeedBurner provides custom RSS feeds and management tools to bloggers, podcasters, and other web-based content publishers. Services provided to publishers include traffic analysis and an optional advertising system. Site: http://www.feedburner.com Information Sources FeedBurner - Scalable Web Applications using MySQL and Java What the Web’s most popular sites are running on Platform Java MySQL Hibernate Spring Tomcat Cacti Load balancing: NetScaler Application Switches Routers, switches: HP, Cisco DNS: bind The Stats FeedBurner is growing faster than MySpace and Digg with 385% traffic growth. Total feeds: 808,707, Number of publishers: 471,686. 11 million subscribers in 190 countries Scaling History - July 2004: 300Kbps, 5,600 feeds, 3 app servers, 3 web servers 2 DB servers, Round Robin DNS - April 2005: 5Mbps, 47,700 feeds, 6 app servers, 6 web servers (same mac
3 0.65968168 1228 high scalability-2012-04-16-Instagram Architecture Update: What’s new with Instagram?
Introduction: The fascination over Instagram continues and fortunately we have several new streams of information to feed the insanity. So consider this article an update to The Instagram Architecture Facebook Bought For A Cool Billion Dollars, based primarily on Scaling Instagram, a slide deck for an AirBnB tech talk given by Instagram co-founder, Mike Krieger. Several other information sources, listed at the bottom of the article, were also used. Unfortunately we just have a slide deck, so the connective tissue of the talk is missing, but it’s still very interesting, in the same spirit of wisdom presentations we often see after developers come up for air after spending significant time in the trenches. If you expect to dive deep into the technological details and find a billion reasons why Instagram was acquired, you will be disappointed. That magic can be found in the emotional investment in the relationship between all of the users and the product, not in the bits about h
4 0.65204036 392 high scalability-2008-09-24-Building a Scalable Architecture for Web Apps
Introduction: By Bhavin Turakhia, CEO, Directi. Covers: * Why scalability is important. Viral marketing can result in instant success. With RSS/Ajax/SOA the number of requests grows exponentially with the user base. The goal is to build a web 2.0 app that can serve millions of users with zero downtime. * Introduction to the variables. Scalability, performance, responsiveness, availability, downtime impact, cost, maintenance effort. * Introduction to the factors. Platform selection, hardware, application design, database architecture, deployment architecture, storage architecture, abuse prevention, monitoring mechanisms, etc. * Building our own scalable architecture in incremental steps: vertical scaling, vertical partitioning, horizontal scaling, horizontal partitioning, etc. First buy bigger. Then deploy each service on a separate node. Then increase the number of nodes and load balance. Deal with session management. Remove single points of failure. Use a shared nothing cluster. Choice of master-slave m
5 0.62896401 821 high scalability-2010-05-03-MocoSpace Architecture - 3 Billion Mobile Page Views a Month
Introduction: This is a guest post by Jamie Hall, Co-founder & CTO of MocoSpace, describing the architecture for their mobile social network. This is a timely architecture to learn from as it combines several hot trends: it is very large, mobile, and social. What they think is especially cool about their system is: how it optimizes for device/browser fragmentation on the mobile Web; their multi-tiered, read/write, local/distributed caching system; selecting PostgreSQL over MySQL as a relational DB that can scale. MocoSpace is a mobile social network, with 12 million members and 3 billion page views a month, which makes it one of the most highly trafficked mobile Websites in the US. Members access the site mainly from their mobile phone Web browser, ranging from high end smartphones to lower end devices, as well as the Web. Activities on the site include customizing profiles, chat, instant messaging, music, sharing photos & videos, games, eCards and blogs. The monetization strategy is focused on
6 0.62623012 152 high scalability-2007-11-13-Flickr Architecture
7 0.62045527 68 high scalability-2007-08-20-TypePad Architecture
8 0.6145007 99 high scalability-2007-09-23-HA for switches
9 0.61427087 704 high scalability-2009-09-13-How is Berkely DB fare against other Key-Value Database
10 0.60811245 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App
11 0.60630935 1094 high scalability-2011-08-08-Tagged Architecture - Scaling to 100 Million Users, 1000 Servers, and 5 Billion Page Views
12 0.60506666 513 high scalability-2009-02-16-Handle 1 Billion Events Per Day Using a Memory Grid
13 0.60035825 637 high scalability-2009-06-24-Habits of Highly Scalable Web Applications
14 0.59557676 682 high scalability-2009-08-16-ThePort Network Architecture
15 0.59140038 106 high scalability-2007-10-02-Secrets to Fotolog's Scaling Success
16 0.5906288 5 high scalability-2007-07-10-mixi.jp Architecture
17 0.58959234 72 high scalability-2007-08-22-Wikimedia architecture
18 0.58596605 78 high scalability-2007-09-01-2 tier switch selection for colocation
19 0.57844049 511 high scalability-2009-02-12-MySpace Architecture
20 0.57820159 389 high scalability-2008-09-23-How to Scale with Ruby on Rails
topicId topicWeight
[(1, 0.114), (2, 0.216), (10, 0.071), (30, 0.037), (57, 0.221), (61, 0.162), (94, 0.073)]
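The page does not state how `simValue` is derived from these topic weights. One common way to compare two sparse topic vectors given as `(topicId, topicWeight)` pairs is cosine similarity, sketched below. The function name and the second vector are hypothetical, and nothing here should be read as the tool's actual scoring method.

```python
import math


def cosine_similarity(topics_a, topics_b):
    """Cosine similarity between two sparse vectors of (topicId, weight) pairs."""
    weights_a = dict(topics_a)
    weights_b = dict(topics_b)

    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * weights_b[t] for t, w in weights_a.items() if t in weights_b)
    norm_a = math.sqrt(sum(w * w for w in weights_a.values()))
    norm_b = math.sqrt(sum(w * w for w in weights_b.values()))

    # An empty or all-zero vector has no direction; treat similarity as 0.0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Topic weights listed above for this post.
    this_post = [(1, 0.114), (2, 0.216), (10, 0.071), (30, 0.037),
                 (57, 0.221), (61, 0.162), (94, 0.073)]
    # Hypothetical vector for another post, made up for illustration.
    other_post = [(2, 0.3), (57, 0.1), (80, 0.4)]
    print(cosine_similarity(this_post, other_post))
```

Because cosine similarity ignores vector length, two posts with the same topic mix score close to 1 even if their absolute weights differ.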
simIndex simValue blogId blogTitle
1 0.94743758 159 high scalability-2007-11-18-Reverse Proxy
Introduction: Hi, I saw a year ago that NetApp sold NetCache to Blue Coat. My site is a heavy NetCache user and we cached 83% of our site. We tested with Blue Coat and F5 WA and we are not getting the same performance as NetCache. Have any of you had the same issue? Or does somebody know another product that can handle this much traffic? Thanks, Rodrigo
2 0.9006058 968 high scalability-2011-01-04-Map-Reduce With Ruby Using Hadoop
Introduction: A demonstration, with repeatable steps, of how to quickly fire-up a Hadoop cluster on Amazon EC2, load data onto the HDFS (Hadoop Distributed File-System), write map-reduce scripts in Ruby and use them to run a map-reduce job on your Hadoop cluster. You will not need to ssh into the cluster, as all tasks are run from your local machine. Below I am using my MacBook Pro as my local machine, but the steps I have provided should be reproducible on other platforms running bash and Java. Fire-Up Your Hadoop Cluster I choose the Cloudera distribution of Hadoop which is still 100% Apache licensed, but has some additional benefits. One of these benefits is that it is released by Doug Cutting, who started Hadoop and drove its development at Yahoo! He also started Lucene, which is another of my favourite Apache Projects, so I have good faith that he knows what he is doing. Another benefit, as you will see, is that it is simple to fire-up a Hadoop cluster. I am going to use C
3 0.89742225 731 high scalability-2009-10-28-Need for change in your IT infrastructure
Introduction: Company earnings outstrip forecasts, consumer confidence is returning and city bonuses are back. What does this mean for business? Growth! After the recent years of cost cutting in IT budgets, there is sudden fear induced by increased demand. Pre-existing trouble points in IT infrastructures that have lain dormant will suddenly be exposed. Monthly reporting and real time analytics will suffer as data grows. IT departments across the land will be crying out “The engine canna take no more captain”. What can be done? What we need is a scalable system that grows with the business. A system that can handle sudden increases in data growth without falling over. There are two core principles to a scalable system: (1) Users experience constant QoS as demand grows. (2) System Architects can grow system capacity proportionally with the available resources. In other words, if demand increases twofold, it is “enough” to purchase twice the hardware. This is linear growth. Is it e
4 0.89655674 1144 high scalability-2011-11-17-Five Misconceptions on Cloud Portability
Introduction: The term "cloud portability" is often considered a synonym for "Cloud API portability," which implies a series of misconceptions. If we break away from dogma, we can find that what we are really looking for in cloud portability is application portability between clouds, which can be a vastly simpler requirement, as we can achieve application portability without settling on a common Cloud API. In this post I'll be covering five common misconceptions people have with respect to cloud portability. Cloud portability = Cloud API portability. API portability is easy; cloud API portability is not. The main incentive for Cloud Portability is avoiding vendor lock-in. Cloud portability is more about business agility than it is about vendor lock-in. Cloud portability isn’t for startups. Every startup that is expecting rapid growth should re-examine their deployments and plan for cloud portability rather than wait to be forced to make the switch when they are least prepared to do so.
5 0.89244252 1138 high scalability-2011-11-07-10 Core Architecture Pattern Variations for Achieving Scalability
Introduction: Srinath Perera has put together a strong list of architecture patterns based on three meta patterns: distribution, caching, and asynchronous processing. He contends these three are the primal patterns and the following patterns are but different combinations: LB (Load Balancers) + Shared nothing Units. Units that do not share anything with each other fronted with a load balancer that routes incoming messages to a unit based on some criteria. LB + Stateless Nodes + Scalable Storage. Several stateless nodes talking to a scalable storage, and a load balancer distributes load among the nodes. Peer to Peer Architectures (Distributed Hash Table (DHT) and Content Addressable Networks (CAN)). Algorithm for scaling up logarithmically. Distributed Queues. Queue implementation (FIFO delivery) implemented as a network service. Publish/Subscribe Paradigm. Network publish subscribe brokers that route messages to each other. Gossip and Nature-inspired Architectures. Each
same-blog 6 0.8743636 6 high scalability-2007-07-11-Friendster Architecture
7 0.87189484 433 high scalability-2008-10-29-CTL - Distributed Control Dispatching Framework
9 0.84506172 807 high scalability-2010-04-09-Vagrant - Build and Deploy Virtualized Development Environments Using Ruby
10 0.83825082 1211 high scalability-2012-03-19-LinkedIn: Creating a Low Latency Change Data Capture System with Databus
11 0.81153792 553 high scalability-2009-04-03-Collectl interface to Ganglia - any interest?
12 0.8084712 351 high scalability-2008-07-16-The Mother of All Database Normalization Debates on Coding Horror
13 0.80188137 232 high scalability-2008-01-29-When things aren't scalable
14 0.79773259 855 high scalability-2010-07-11-So, Why is Twitter Really Not Using Cassandra to Store Tweets?
15 0.78802675 1538 high scalability-2013-10-28-Design Decisions for Scaling Your High Traffic Feeds
16 0.78309417 703 high scalability-2009-09-12-How Google Taught Me to Cache and Cash-In
17 0.77965075 877 high scalability-2010-08-12-Designing Web Applications for Scalability
18 0.77933729 1296 high scalability-2012-08-02-Strategy: Use Spare Region Capacity to Survive Availability Zone Failures
19 0.77705717 775 high scalability-2010-02-10-ElasticSearch - Open Source, Distributed, RESTful Search Engine
20 0.77619839 1031 high scalability-2011-04-28-PaaS on OpenStack - Run Applications on Any Cloud, Any Time Using Any Thing