high_scalability high_scalability-2007 high_scalability-2007-176 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Our company offers a web service that is provided to users from several different hosting centers across the globe. The content and functionality at each of the servers is almost exactly the same, and we could have based them all in a single location. However, we chose to distribute the servers geographically to offer our users the best performance, regardless of where they might be. Up until now, the only content on the servers that has had to be synchronized is the server software itself. The features and functionality of our service are being updated regularly, so every week or two we push updates out to all the servers at basically the same time. We use a relatively manual approach to do the updating, but it works fine. Sometime soon, however, our synchronization needs are going to get a bit more complex. In particular, we'll soon start offering a feature at our site that will involve a database with content that will change on an a
sentIndex sentText sentNum sentScore
1 Our company offers a web service that is provided to users from several different hosting centers across the globe. [sent-1, score-0.284]
2 The content and functionality at each of the servers is almost exactly the same, and we could have based them all in a single location. [sent-2, score-0.759]
3 However, we chose to distribute the servers geographically to offer our users the best performance, regardless of where they might be. [sent-3, score-0.636]
4 Up until now, the only content on the servers that has had to be synchronized is the server software itself. [sent-4, score-0.826]
5 The features and functionality of our service are being updated regularly, so every week or two we push updates out to all the servers at basically the same time. [sent-5, score-0.639]
6 We use a relatively manual approach to do the updating, but it works fine. [sent-6, score-0.222]
7 Sometime soon, however, our synchronization needs are going to get a bit more complex. [sent-7, score-0.203]
8 In particular, we'll soon start offering a feature at our site that will involve a database with content that will change on an almost second-by-second basis, based on user input and activity. [sent-8, score-0.976]
9 For performance reasons, a complete instance of this database will have to be present locally at each of our server locations. [sent-9, score-0.371]
10 At the same time, the content of the database will have to be synchronized across all server locations, so that users get the same database content, regardless of the server they choose to visit. [sent-10, score-1.454]
11 We have not yet chosen the database that we'll use for this functionality, although we are leaning towards MySQL. [sent-11, score-0.608]
12 So, my question for the assembled experts is: What approach is the best one for us to use to synchronize the database instances across our servers? [sent-13, score-0.63]
13 Ideally, we'd like a solution that is resilient to a server location becoming unavailable, and we'd also prefer a solution that makes efficient use of bandwidth. [sent-14, score-0.688]
14 (1) Our servers run Apache and Tomcat on top of CentOS. [sent-19, score-0.165]
15 (2) I've found the following "how to" that suggests an approach involving MySQL that could address our needs: http://capttofu. [sent-20, score-0.381]
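One concern raised by the question above is what happens if every location accepts writes to its own local copy of the database: two sites could assign the same auto-generated primary key to different rows. The following is a minimal sketch of one common way to avoid that in MySQL, using the server variables `server_id`, `auto_increment_increment` and `auto_increment_offset`. It assumes a multi-master style setup, which the post does not confirm as the chosen design, and it is not necessarily what the linked how-to describes. The site names and helper functions are hypothetical.

```python
# Sketch: per-site MySQL settings so that several writable servers never
# hand out the same AUTO_INCREMENT value. Site names are hypothetical.

def site_settings(site_names):
    """Return {site: settings} keyed by MySQL server variable names."""
    count = len(site_names)
    return {
        name: {
            "server_id": index,                 # must be unique per server
            "auto_increment_increment": count,  # step between generated ids
            "auto_increment_offset": index,     # first id generated at this site
        }
        for index, name in enumerate(site_names, start=1)
    }


def simulated_ids(settings, how_many):
    """Ids a site would generate on an empty table: offset, offset+step, ..."""
    step = settings["auto_increment_increment"]
    first = settings["auto_increment_offset"]
    return [first + i * step for i in range(how_many)]


if __name__ == "__main__":
    sites = site_settings(["us-east", "europe", "asia"])
    for name, cfg in sites.items():
        print(name, cfg, simulated_ids(cfg, 4))
```

With three sites, each one generates ids from a different residue class modulo 3, so rows inserted at different locations cannot collide on the key. This only addresses key generation; conflicting updates to the same row, and recovery after a location becomes unavailable, would still need to be handled by the replication design itself.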
wordName wordTfidf (topN-words)
[('synchronized', 0.278), ('content', 0.255), ('functionality', 0.213), ('leaning', 0.181), ('regardless', 0.181), ('servers', 0.165), ('assembled', 0.161), ('soon', 0.16), ('database', 0.141), ('unavailable', 0.139), ('regularly', 0.137), ('approach', 0.136), ('ideally', 0.133), ('however', 0.133), ('involving', 0.131), ('synchronize', 0.131), ('server', 0.128), ('almost', 0.126), ('resilient', 0.118), ('tomcat', 0.117), ('suggests', 0.114), ('involve', 0.112), ('synchronization', 0.106), ('geographically', 0.105), ('chosen', 0.105), ('prefer', 0.105), ('updating', 0.105), ('across', 0.104), ('chose', 0.102), ('locally', 0.102), ('users', 0.098), ('needs', 0.097), ('locations', 0.097), ('input', 0.096), ('although', 0.094), ('considering', 0.091), ('distribute', 0.09), ('basis', 0.089), ('updated', 0.089), ('basically', 0.089), ('experts', 0.088), ('towards', 0.087), ('hand', 0.086), ('offering', 0.086), ('manual', 0.086), ('solution', 0.085), ('becoming', 0.084), ('week', 0.083), ('location', 0.083), ('provided', 0.082)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999988 176 high scalability-2007-12-07-Synchronizing databases in different geographic locations
Introduction: Our company offers a web service that is provided to users from several different hosting centers across the globe. The content and functionality at each of the servers is almost exactly the same, and we could have based them all in a single location. However, we chose to distribute the servers geographically to offer our users the best performance, regardless of where they might be. Up until now, the only content on the servers that has had to be synchronized is the server software itself. The features and functionality of our service are being updated regularly, so every week or two we push updates out to all the servers at basically the same time. We use a relatively manual approach to do the updating, but it works fine. Sometime soon, however, our synchronization needs are going to get a bit more complex. In particular, we'll soon start offering a feature at our site that will involve a database with content that will change on an a
2 0.16756578 70 high scalability-2007-08-22-How many machines do you need to run your site?
Introduction: Amazingly TechCrunch runs their website on one web server and one database server, according to the fascinating survey What the Web’s most popular sites are running on by Pingdom, a provider of uptime and response time monitoring. Earlier we learned PlentyOfFish catches and releases many millions of hits a day on just 1 web server and three database servers. Google runs a Dalek army full of servers. YouSendIt, a company making it easy to send and receive large files, has 24 web servers, 3 database servers, 170 storage servers, and a few miscellaneous servers. Vimeo, a video sharing company, has 100 servers for streaming video, 4 web servers, and 2 database servers. Meebo, an AJAX based instant messaging company, uses 40 servers to handle messaging, over 40 web servers, and 10 servers for forums, jabber, testing, and so on. FeedBurner, a news feed management company, has 70 web servers, 15 database servers, and 10 miscellaneous servers. Now
Introduction: This is a guest post by Dave Hagler, Systems Architect at AOL. The AOL homepages receive more than 8 million visitors per day. That’s more daily viewers than Good Morning America or the Today Show on television. Over a billion page views are served each month. AOL.com has been a major internet destination since 1996, and still has a strong following of loyal users. The architecture for AOL.com is in its 5th generation. It has essentially been rebuilt from scratch 5 times over two decades. The current architecture was designed 6 years ago. Pieces have been upgraded and new components have been added along the way, but the overall design remains largely intact. The code, tools, development and deployment processes are highly tuned over 6 years of continual improvement, making the AOL.com architecture battle tested and very stable. The engineering team is made up of developers, testers, and operations and totals around 25 people. The majority are in Dulles, Virginia
4 0.13276415 1501 high scalability-2013-08-13-In Memoriam: Lavabit Architecture - Creating a Scalable Email Service
Introduction: With Lavabit shutting down under murky circumstances, it seems fitting to repost an old (2009), yet still very good post by Ladar Levison on Lavabit's architecture. I don't know how much of this information is still current, but it should give you a general idea what Lavabit was all about. Getting to Know You What is the name of your system and where can we find out more about it? Note: these links are no longer valid... Lavabit http://lavabit.com http://lavabit.com/network.html http://lavabit.com/about.html What is your system for? Lavabit is a mid-sized email service provider. We currently have about 140,000 registered users with more than 260,000 email addresses. While most of our accounts belong to individual users, we also provide corporate email services to approximately 70 companies. Why did you decide to build this system? We built the system to compete against the other large free email providers, with an emphasis on serving the privacy c
5 0.13209426 382 high scalability-2008-09-09-Content Delivery Networks (CDN) – a comprehensive list of providers
Introduction: We build web applications…and there are plenty of them around. Now, if we hit the jackpot and our application becomes very popular, traffic goes up, and our servers are brought down by the hordes of people coming to our website. What do we do in that situation? Of course, I am not talking here about the kind of traffic Digg, Yahoo Buzz or other social media sites can bring to a website, which is temporary overnight traffic, or a website which uses cloud computing like Amazon EC2 service, MediaTemple Grid Service or Mosso Hosting Cloud service. I am talking about traffic that consistently increases over time as the service achieves success. Google.com, Yahoo.com, Myspace.com, Facebook.com, Plentyoffish.com, Linkedin.com, Youtube.com and others are examples of services which have constant high traffic. Knowing that users want speed from their applications, these services will always use a Content Delivery Network (CDN) to deliver that speed. What is a Content Delivery Ne
6 0.12638672 194 high scalability-2007-12-26-Golden rule of web caching
7 0.12233213 856 high scalability-2010-07-12-Creating Scalable Digital Libraries
9 0.11993384 1508 high scalability-2013-08-28-Sean Hull's 20 Biggest Bottlenecks that Reduce and Slow Down Scalability
11 0.1171432 576 high scalability-2009-04-21-What CDN would you recommend?
12 0.11687613 1331 high scalability-2012-10-02-An Epic TripAdvisor Update: Why Not Run on the Cloud? The Grand Experiment.
13 0.11583 841 high scalability-2010-06-14-How scalable could be a cPanel Hosting service?
14 0.11486482 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App
15 0.11457726 1068 high scalability-2011-06-27-TripAdvisor Architecture - 40M Visitors, 200M Dynamic Page Views, 30TB Data
16 0.11411618 38 high scalability-2007-07-30-Build an Infinitely Scalable Infrastructure for $100 Using Amazon Services
17 0.11127412 63 high scalability-2007-08-09-Lots of questions for high scalability - high availability
18 0.11105144 881 high scalability-2010-08-16-Scaling an AWS infrastructure - Tools and Patterns
19 0.1105341 906 high scalability-2010-09-22-Applying Scalability Patterns to Infrastructure Architecture
20 0.10997818 1521 high scalability-2013-09-23-Salesforce Architecture - How they Handle 1.3 Billion Transactions a Day
topicId topicWeight
[(0, 0.209), (1, 0.063), (2, -0.031), (3, -0.132), (4, -0.032), (5, -0.043), (6, 0.012), (7, -0.104), (8, -0.004), (9, 0.043), (10, -0.004), (11, -0.003), (12, -0.044), (13, -0.053), (14, 0.048), (15, 0.017), (16, 0.01), (17, 0.037), (18, -0.001), (19, -0.048), (20, -0.024), (21, 0.013), (22, -0.013), (23, 0.0), (24, 0.068), (25, 0.009), (26, 0.011), (27, -0.067), (28, 0.013), (29, 0.006), (30, -0.017), (31, 0.081), (32, -0.038), (33, 0.031), (34, -0.045), (35, -0.036), (36, 0.068), (37, -0.008), (38, 0.014), (39, -0.005), (40, 0.024), (41, -0.019), (42, -0.018), (43, -0.032), (44, -0.062), (45, -0.024), (46, -0.018), (47, -0.011), (48, 0.029), (49, 0.06)]
simIndex simValue blogId blogTitle
same-blog 1 0.96366793 176 high scalability-2007-12-07-Synchronizing databases in different geographic locations
Introduction: Our company offers a web service that is provided to users from several different hosting centers across the globe. The content and functionality at each of the servers is almost exactly the same, and we could have based them all in a single location. However, we chose to distribute the servers geographically to offer our users the best performance, regardless of where they might be. Up until now, the only content on the servers that has had to be synchronized is the server software itself. The features and functionality of our service are being updated regularly, so every week or two we push updates out to all the servers at basically the same time. We use a relatively manual approach to do the updating, but it works fine. Sometime soon, however, our synchronization needs are going to get a bit more complex. In particular, we'll soon start offering a feature at our site that will involve a database with content that will change on an a
Introduction: This is a guest post by Dave Hagler, Systems Architect at AOL. The AOL homepages receive more than 8 million visitors per day. That’s more daily viewers than Good Morning America or the Today Show on television. Over a billion page views are served each month. AOL.com has been a major internet destination since 1996, and still has a strong following of loyal users. The architecture for AOL.com is in its 5th generation. It has essentially been rebuilt from scratch 5 times over two decades. The current architecture was designed 6 years ago. Pieces have been upgraded and new components have been added along the way, but the overall design remains largely intact. The code, tools, development and deployment processes are highly tuned over 6 years of continual improvement, making the AOL.com architecture battle tested and very stable. The engineering team is made up of developers, testers, and operations and totals around 25 people. The majority are in Dulles, Virginia
3 0.77906448 70 high scalability-2007-08-22-How many machines do you need to run your site?
Introduction: Amazingly TechCrunch runs their website on one web server and one database server, according to the fascinating survey What the Web’s most popular sites are running on by Pingdom, a provider of uptime and response time monitoring. Earlier we learned PlentyOfFish catches and releases many millions of hits a day on just 1 web server and three database servers. Google runs a Dalek army full of servers. YouSendIt, a company making it easy to send and receive large files, has 24 web servers, 3 database servers, 170 storage servers, and a few miscellaneous servers. Vimeo, a video sharing company, has 100 servers for streaming video, 4 web servers, and 2 database servers. Meebo, an AJAX based instant messaging company, uses 40 servers to handle messaging, over 40 web servers, and 10 servers for forums, jabber, testing, and so on. FeedBurner, a news feed management company, has 70 web servers, 15 database servers, and 10 miscellaneous servers. Now
4 0.77617753 856 high scalability-2010-07-12-Creating Scalable Digital Libraries
Introduction: Like many other media content providers, libraries and museums are increasingly moving their content onto the Web. While the move itself is no easy process (with digitization, web development, and training costs), being able to successfully deliver content to a wide audience is an ongoing concern, particularly for large libraries. Much of the concern is financial, as most libraries do not have the internal budget or outside investors that for-profit businesses enjoy. Even large university libraries will face serious budget constraints that even other university departments, such as science and technology would not face. Creating a scalable infrastructure and also distributing a large digital collection that can handle multiple requests, requires planning that many librarians have not even imagined. They must stop thinking in terms of "one-item-per-customer" and start thinking in terms of numerous users accessing the same information simultaneously. Content Delivery Network
5 0.76595527 1268 high scalability-2012-06-20-Ask HighScalability: How do I organize millions of images?
Introduction: Does anyone have any advice or suggestions on how to store millions of images? Currently images are stored in a MS SQL database, which performance-wise isn't ideal. We'd like to migrate the images over to a file system structure, but I'd assume we don't just want to dump millions of images into a single directory. Besides having to contend with naming collisions, the Windows filesystem might not perform optimally with that many files. I'm assuming one approach may be to assign each user a unique CSLID, create a folder based on the CSLID and then place one user's files in that particular folder. Even so, this could result in hundreds of thousands of folders. What's the best organizational scheme/hierarchy for doing this?
6 0.75424927 1402 high scalability-2013-02-07-Ask HighScalability: Web asset server concept - 3rd party software available?
7 0.75195473 1521 high scalability-2013-09-23-Salesforce Architecture - How they Handle 1.3 Billion Transactions a Day
8 0.75124604 573 high scalability-2009-04-16-Serving 250M quotes-day at CNBC.com with aiCache
9 0.74396598 1501 high scalability-2013-08-13-In Memoriam: Lavabit Architecture - Creating a Scalable Email Service
10 0.7410143 1288 high scalability-2012-07-23-Ask HighScalability: How Do I Build My MegaUpload + Itunes + YouTube Startup?
11 0.74049228 150 high scalability-2007-11-12-Slashdot Architecture - How the Old Man of the Internet Learned to Scale
12 0.73964417 1579 high scalability-2014-01-14-SharePoint VPS solution
13 0.73672217 1593 high scalability-2014-02-10-13 Simple Tricks for Scaling Python and Django with Apache from HackerEarth
14 0.72463393 391 high scalability-2008-09-23-The 7 Stages of Scaling Web Apps
15 0.72256935 135 high scalability-2007-10-27-.Net2 and AJAX scalability?
16 0.71180058 59 high scalability-2007-08-04-Try Squid as a Reverse Proxy
17 0.71036232 881 high scalability-2010-08-16-Scaling an AWS infrastructure - Tools and Patterns
18 0.70798296 1542 high scalability-2013-11-04-ESPN's Architecture at Scale - Operating at 100,000 Duh Nuh Nuhs Per Second
19 0.70769715 1284 high scalability-2012-07-16-Cinchcast Architecture - Producing 1,500 Hours of Audio Every Day
20 0.70742583 319 high scalability-2008-05-14-Scaling an image upload service
topicId topicWeight
[(1, 0.147), (2, 0.222), (61, 0.106), (76, 0.142), (79, 0.11), (85, 0.072), (94, 0.11)]
simIndex simValue blogId blogTitle
1 0.92180717 1161 high scalability-2011-12-22-Architecting Massively-Scalable Near-Real-Time Risk Analysis Solutions
Introduction: Constructing a scalable risk analysis solution is a fascinating architectural challenge. If you come from Financial Services you are sure to appreciate that. But even architects from other domains are bound to find the challenges fascinating, and the architectural patterns of my suggested solution highly useful in other domains. Recently I held an interesting webinar around architecting solutions for scalable and near-real-time risk analysis solutions based on experience gathered with Financial Services customers. Seeing the vast interest in the webinar, I would like to share the highlights with you here. From an architectural point of view, risk analysis is a data-intensive and a compute-intensive process, which also has an elaborate orchestration logic. volumes in this domain are massive and ever-increasing, together with an ever-increasing demand to reduce response time. These trends are aggravated by global financial regulatory reforms set following the late-2000s
same-blog 2 0.92120981 176 high scalability-2007-12-07-Synchronizing databases in different geographic locations
Introduction: Our company offers a web service that is provided to users from several different hosting centers across the globe. The content and functionality at each of the servers is almost exactly the same, and we could have based them all in a single location. However, we chose to distribute the servers geographically to offer our users the best performance, regardless of where they might be. Up until now, the only content on the servers that has had to be synchronized is the server software itself. The features and functionality of our service are being updated regularly, so every week or two we push updates out to all the servers at basically the same time. We use a relatively manual approach to do the updating, but it works fine. Sometime soon, however, our synchronization needs are going to get a bit more complex. In particular, we'll soon start offering a feature at our site that will involve a database with content that will change on an a
3 0.90546674 976 high scalability-2011-01-20-75% Chance of Scale - Leveraging the New Scaleogenic Environment for Growth
Introduction: "I'll never need to scale so why bother? We aren't Twitter or Facebook or Google after all." This is the most common email I get, a question in the form of a thinly disguised rationalization for not having to worry about scaling. And in these days of giant transformer-like machines they are probably right. But what if there are Barry Bonds enhancing type forces at work that argue for the chances of your needing to scale being higher than you think? And if that happens, how will you cross the scalability chasm? Will you want to completely change your architecture or evolve it from a tool-chain that was meant to scale from the start? Architecturally, that's the question you have to answer. Today's tool-chains are making it possible to grow a system from small to large without needing to implement complete architectural phase changes at various scale inflection points, but that's a different topic. We're trying to think about why you may actually need to scale, that is the question.
4 0.90272534 889 high scalability-2010-08-30-Pomegranate - Storing Billions and Billions of Tiny Little Files
Introduction: Pomegranate is a novel distributed file system built over distributed tabular storage that acts an awful lot like a NoSQL system. It's targeted at increasing the performance of tiny object access in order to support applications like online photo and micro-blog services, which require high concurrency, high throughput, and low latency. Their tests seem to indicate it works: We have demonstrate that file system over tabular storage performs well for highly concurrent access. In our test cluster, we observed linearly increased more than 100,000 aggregate read and write requests served per second ( RPS ). Rather than sitting atop the file system like almost every other K-V store, Pomegranate is baked into file system. The idea is that the file system API is common to every platform so it wouldn't require a separate API to use. Every application could use it out of the box. The features of Pomegranate are: It handles billions of small files efficiently, even in on
5 0.90165001 1122 high scalability-2011-09-23-Stuff The Internet Says On Scalability For September 23, 2011
Introduction: I'd walk a mile for HighScalability: 1/12th the World Population on Facebook in One Day; 1.8 ZettaBytes of data in 2011; 1 Billion Foursquare Checkins; 2 million on Spotify; 1 Million on GitHub; $1,279-per-hour, 30,000-core cluster built on EC2; Patent trolls cost .5 trillion dollars; 235 terabytes of data collected by the U.S. Library of Congress in April. Potent quotables: @jstogdill: Corporations over protect low value info assets (which screws up collaboration) and under protects high value assets. #strataconf @sbtourist: I think BigMemory-like approaches based on large put-and-forget memory cans, are rarely a solution to performance/scalability problems. 1 Million TCP Connections. Remember when 10K was a real limit and you had to build out boxes just to handle the load? Amazing. We don't know how much processing can be attached to these connections, how much memory the apps use, or what the response latency is to
6 0.90142483 665 high scalability-2009-07-29-Strategy: Let Google and Yahoo Host Your Ajax Library - For Free
7 0.89749527 1102 high scalability-2011-08-22-Strategy: Run a Scalable, Available, and Cheap Static Site on S3 or GitHub
8 0.89549309 966 high scalability-2010-12-31-Facebook in 20 Minutes: 2.7M Photos, 10.2M Comments, 4.6M Messages
9 0.89383626 863 high scalability-2010-07-22-How can we spark the movement of research out of the Ivory Tower and into production?
10 0.89213181 1389 high scalability-2013-01-18-Stuff The Internet Says On Scalability For January 18, 2013
11 0.89156163 1402 high scalability-2013-02-07-Ask HighScalability: Web asset server concept - 3rd party software available?
12 0.89092749 65 high scalability-2007-08-16-Scaling Secret #2: Denormalizing Your Way to Speed and Profit
13 0.88877898 545 high scalability-2009-03-19-Product: Redis - Not Just Another Key-Value Store
14 0.88874859 645 high scalability-2009-06-30-Hot New Trend: Linking Clouds Through Cheap IP VPNs Instead of Private Lines
15 0.88872457 517 high scalability-2009-02-21-Google AppEngine - A Second Look
16 0.88846147 266 high scalability-2008-03-04-Manage Downtime Risk by Connecting Multiple Data Centers into a Secure Virtual LAN
17 0.8881166 1174 high scalability-2012-01-13-Stuff The Internet Says On Scalability For January 13, 2012
18 0.88804716 1564 high scalability-2013-12-13-Stuff The Internet Says On Scalability For December 13th, 2013
19 0.88729036 1516 high scalability-2013-09-13-Stuff The Internet Says On Scalability For September 13, 2013
20 0.88709551 1179 high scalability-2012-01-23-Facebook Timeline: Brought to You by the Power of Denormalization