
773 high scalability-2010-02-06-GEO-aware traffic load balancing and caching at CNBC.com


meta infos for this blog

Source: html

Introduction: CNBC, like many large web sites, relied on a CDN for content delivery. Recently, we started looking to see if we could improve this model. Our criteria were:
- improve response time
- have better control over traffic (real-time reporting, change management and alerting)
- better utilize internal datacenters and their infrastructure
- shield users from any troubles at the origin infrastructure
- reduce cost
After researching the market, we turned to two vendors: Dyn (Dynamic Network Services) and aiScaler. We have had about a year's worth of experience with aiScaler (search for "CNBC" to see my previous post), but Dyn was a new vendor for us. We started building our relationship at the Velocity conference in the summer of 2009. Dyn has recently started offering a geo-aware DNS load balancing solution, using Anycast and the distributed nature of their DNS presence to enable a key component of what we were trying to achieve: steer users to geographically closest origin point.
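The steering idea can be sketched in two steps: routing delivers the DNS query to the nearest anycast DNS cluster, and that cluster answers with an origin chosen by configured rules. The sketch below is a minimal illustration, not Dyn's implementation. The cluster names follow the four locations named in the post's summary sentences (US East, US West, EU, Asia) and the two origins follow the two US datacenters; the rule table, the path-cost numbers, the health-based fallback and the IP addresses (documentation ranges) are all assumptions made for the example. Real BGP path selection also weighs more attributes than a single cost.

```python
# Hypothetical sketch of geo-aware DNS steering. Every name, rule and
# address below is an illustrative assumption, not taken from Dyn's product.

# Two origin datacenters; addresses come from documentation-only IP ranges.
ORIGINS = {"us-east": "192.0.2.10", "us-west": "198.51.100.10"}

# Assumed rules: for each DNS cluster, origins in order of preference.
RULES = {
    "us-east": ["us-east", "us-west"],
    "us-west": ["us-west", "us-east"],
    "eu": ["us-east", "us-west"],
    "asia": ["us-west", "us-east"],
}


def nearest_cluster(path_cost):
    """Stand-in for anycast routing: pick the cluster with the lowest cost.

    path_cost maps cluster name -> routing cost as seen from the client's
    network. Ties are broken by name so the result is deterministic.
    """
    return min(path_cost, key=lambda cluster: (path_cost[cluster], cluster))


def resolve(cluster, healthy=frozenset(ORIGINS)):
    """Return the origin IP that the given DNS cluster would hand out.

    The cluster never sees the user's location directly; it assumes the
    user is near itself and applies its own rule list, skipping origins
    that are not in the healthy set (an assumed failover behaviour).
    """
    for origin in RULES[cluster]:
        if origin in healthy:
            return ORIGINS[origin]
    raise LookupError("no healthy origin available")


if __name__ == "__main__":
    # A client whose cheapest route leads to the EU cluster.
    cluster = nearest_cluster({"us-east": 5, "us-west": 7, "eu": 2, "asia": 9})
    print(cluster, resolve(cluster))
    # Same client while the East Coast origin is marked unhealthy.
    print(cluster, resolve(cluster, healthy={"us-west"}))
```

Keeping the rules as a plain per-cluster preference list is one way to make changes to load distribution a small data edit rather than a code change.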


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Recently, we started looking to see if we could improve this model. [sent-2, score-0.062]

2 We started building our relationship at the Velocity conference in the summer of 2009. [sent-5, score-0.062]

3 Dyn has recently started offering a geo-aware DNS load balancing solution, using Anycast and the distributed nature of their DNS presence to enable a key component of what we were trying to achieve: steer users to geographically closest origin point. [sent-6, score-0.704]

4 In principle, to direct a user to the geographically closest origin point, one needs some idea of the user's location. [sent-10, score-0.636]

5 A very different, albeit less granular, way to accomplish the same is to use Internet routing (BGP) to advertise routes to the same IP addresses from multiple points of presence. [sent-13, score-0.126]

6 Each cluster is positioned at a major peering point: US East Coast, US West Coast, one in the EU and one in Asia. [sent-21, score-0.078]

7 Through the magic of routing, users in Asia will have their DNS requests come to one's DNS servers in Asia, users in the EU to the servers in the EU, and so on. [sent-24, score-0.065]

8 It is easy to see how this implied knowledge of requestor's geo location can now be used to direct their traffic in a certain, location-specific way. [sent-25, score-0.533]

9 com , his/her browser requests DNS resolution for www. [sent-29, score-0.065]

10 The DNS request will naturally flow to the closest Dyn DNS cluster. [sent-32, score-0.125]

11 The DNS servers at that cluster have implied awareness of their own location. [sent-33, score-0.097]

12 Based on that, the DNS server infers that the requests are also coming from users in the same geo area; using that inference and the set of rules we configure, it directs the requesting user to the proper origin point for www. [sent-34, score-0.719]

13 For origin points, we've chosen our own datacenters, each with multiple gigabits of egress capacity, on the East and West coasts of the US. [sent-37, score-0.444]

14 Just 4 common 1RU blade servers, 2 at each location, are all we needed to deliver all of the traffic to our US user base. [sent-39, score-0.247]

15 The latest iteration of the aiScaler product, v6, has been tested at in excess of 250,000 RPS per common HP DL360 server. [sent-40, score-0.092]

16 com peak at over 3000 RPS, so we have a lot of excess capacity for any possible traffic spikes. [sent-45, score-0.28]

17 Here are our results so far: - we were able to shave about 1 sec (about 30%! [sent-46, score-0.115]

18 - our CDN traffic has seen about an 80% reduction as well, complete with an 80% reduction in CDN fees - we're now better utilizing our own datacenters' capacity - we now have the ability to instantaneously affect our caching rules or load distribution. [sent-49, score-0.428]

19 Summary: Dyn's dynamic, geo-aware DNS load balancing solution and aiScaler's proven caching software have enabled a top-tier financial news website to shave 30% off response time, save money, and set up better real-time monitoring, reporting and alerting. [sent-51, score-0.412]

20 And lastly: the above doesn't constitute, in any way, shape or form, an endorsement of the mentioned products, vendors and/or solutions, by CNBC, NBC, GE or any of its subsidiaries. [sent-53, score-0.124]
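The capacity figures quoted in sentences 14 to 16 can be combined into rough headroom arithmetic. This is a back-of-the-envelope sketch only: it assumes the 250,000 RPS benchmark figure applies to each deployed server (the post quotes it for an HP DL360, not necessarily for the servers in production) and it treats "over 3000 RPS" as exactly 3,000, so the results are optimistic upper bounds rather than measured capacity.

```python
# Figures quoted in the post; how they are combined here is an assumption.
TESTED_RPS_PER_SERVER = 250_000  # benchmark figure, quoted as "in excess of"
PEAK_RPS = 3_000                 # site peak, quoted as "over 3000 RPS"
SERVERS_PER_SITE = 2             # "2 at each location"
SITES = 2                        # East and West coast datacenters


def headroom_factor(servers: int) -> float:
    """How many times the quoted peak the given servers could absorb."""
    return servers * TESTED_RPS_PER_SERVER / PEAK_RPS


if __name__ == "__main__":
    scenarios = [
        ("all four servers", SERVERS_PER_SITE * SITES),
        ("one site lost", SERVERS_PER_SITE),
        ("a single server left", 1),
    ]
    for label, servers in scenarios:
        print(f"{label}: roughly {headroom_factor(servers):.0f}x the quoted peak")
```

Even under the pessimistic single-server scenario the factor stays far above 1, which is consistent with the post's remark about excess capacity for traffic spikes.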


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('dns', 0.324), ('origin', 0.323), ('eu', 0.23), ('dyn', 0.199), ('coast', 0.195), ('cnbc', 0.192), ('traffic', 0.188), ('rps', 0.156), ('asia', 0.152), ('withaiscaler', 0.141), ('cms', 0.136), ('alerting', 0.131), ('closest', 0.125), ('west', 0.117), ('shave', 0.115), ('east', 0.115), ('anycast', 0.111), ('geo', 0.107), ('reporting', 0.102), ('bgp', 0.102), ('cdn', 0.1), ('implied', 0.097), ('excess', 0.092), ('send', 0.087), ('rules', 0.087), ('location', 0.082), ('datacenters', 0.082), ('point', 0.078), ('reduction', 0.071), ('geographically', 0.07), ('requests', 0.065), ('vendors', 0.064), ('coasts', 0.064), ('requestor', 0.064), ('albeit', 0.064), ('ge', 0.064), ('balancing', 0.064), ('started', 0.062), ('addresses', 0.062), ('nbc', 0.06), ('uucp', 0.06), ('steer', 0.06), ('endorsement', 0.06), ('direct', 0.059), ('user', 0.059), ('russia', 0.057), ('busier', 0.057), ('egress', 0.057), ('valve', 0.057), ('distro', 0.055)]
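The sentScore values in the summary table appear consistent with a simple rule: split the raw scraped sentence on whitespace, keep each distinct token once, and add up the weights of the tokens that occur verbatim in the topN word list above. This is an inference from the numbers on this page, not the generator's published code, and the function name below is hypothetical. Under this rule, tokens such as "DNS", "Anycast" or "point." contribute nothing because matching is case-sensitive and punctuation is not stripped.

```python
# Subset of the topN weights listed on this page (enough for three sentences).
TFIDF = {
    "dns": 0.324, "origin": 0.323, "traffic": 0.188, "closest": 0.125,
    "anycast": 0.111, "geo": 0.107, "implied": 0.097, "location": 0.082,
    "point": 0.078, "geographically": 0.07, "balancing": 0.064,
    "requestor": 0.064, "started": 0.062, "steer": 0.06, "direct": 0.059,
}


def sentence_score(sentence, weights=TFIDF):
    """Sum the weights of distinct whitespace tokens found in the word list.

    Assumed reconstruction: no lowercasing and no punctuation stripping,
    so "DNS" does not match "dns" and "point." does not match "point".
    """
    tokens = set(sentence.split())
    return round(sum(weights.get(token, 0.0) for token in tokens), 3)


SENT_3 = ("Dyn has recently started offering a geo-aware DNS load balancing "
          "solution, using Anycast and the distributed nature of their DNS "
          "presence to enable a key component of what we were trying to "
          "achieve: steer users to geographically closest origin point.")
SENT_8 = ("It is easy to see how this implied knowledge of requestor's geo "
          "location can now be used to direct their traffic in a certain, "
          "location-specific way.")
SENT_10 = "The DNS request will naturally flow to the closest Dyn DNS cluster."

if __name__ == "__main__":
    for name, text in [("3", SENT_3), ("8", SENT_8), ("10", SENT_10)]:
        print(name, sentence_score(text))
```

For these three sentences the rule gives 0.704, 0.533 and 0.125, matching the scores listed in the summary table. A side effect of the rule is that heavily weighted words such as "dns" never count when they appear capitalized in the text.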

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 773 high scalability-2010-02-06-GEO-aware traffic load balancing and caching at CNBC.com


2 0.27482149 573 high scalability-2009-04-16-Serving 250M quotes-day at CNBC.com with aiCache

Introduction: As traffic to cnbc.com continued to grow, we found ourselves in an all-too-familiar situation where one feels that a BIG change in how things are done was in order, the status-quo was a road to nowhere. The spending on HW, amount of space and power required to host additional servers, less-than-stellar response times, having to resort to frequent "micro"-caching and similar tricks to try to improve code performance - all of these were surfacing in plain sight, hard to ignore. While code base could clearly be improved, the limited Dev resources and having to innovate to stay competitive always limits ability to go about refactoring. So how can one go about addressing performance and other needs without a full blown effort across the entire team ? For us, the answer was aiCache - a Web caching and application acceleration product (aicache.com). The idea behind caching is simple - handle the requests before they ever hit your regular Apache<->JK

3 0.25104761 1517 high scalability-2013-09-16-The Hidden DNS Tax - Cascading Timeouts and Errors

Introduction: This is a guest post by Nick Burling, VP of Product Management of BlueStripe. Readers of High Scalability are well versed in performance optimization techniques. Reverse proxies, Varnish, Redis: you hear about them daily. But what you may not realize is that one of the oldest technologies in your stack can be one of your biggest bottlenecks: DNS. People don't spend a lot of time thinking about DNS. It's not sexy. It's an infrastructure service, and it's just supposed to work. At BlueStripe, we work with many teams running applications that support millions of web requests a day. We keep seeing DNS delays and errors that the platform operations team never knows about. It's so common we've started calling it the Hidden DNS Tax. What is the Hidden DNS Tax? The Hidden DNS Tax is a hard-to-see performance hit your users take from DNS timeouts and errors in your back-end architecture. We've seen it bring down the main web application for a Fortune 10 company.

4 0.16163214 1597 high scalability-2014-02-17-How the AOL.com Architecture Evolved to 99.999% Availability, 8 Million Visitors Per Day, and 200,000 Requests Per Second

Introduction: This is a guest post by Dave Hagler, Systems Architect at AOL. The AOL homepages receive more than 8 million visitors per day. That’s more daily viewers than Good Morning America or the Today Show on television. Over a billion page views are served each month. AOL.com has been a major internet destination since 1996, and still has a strong following of loyal users. The architecture for AOL.com is in its 5th generation. It has essentially been rebuilt from scratch 5 times over two decades. The current architecture was designed 6 years ago. Pieces have been upgraded and new components have been added along the way, but the overall design remains largely intact. The code, tools, development and deployment processes are highly tuned over 6 years of continual improvement, making the AOL.com architecture battle tested and very stable. The engineering team is made up of developers, testers, and operations and totals around 25 people. The majority are in Dulles, Virginia

5 0.14634924 289 high scalability-2008-03-27-Amazon Announces Static IP Addresses and Multiple Datacenter Operation

Introduction: Amazon is fixing two of their major problems: no static IP addresses and single datacenter operation. By adding these two new features developers can finally build a no apology system on Amazon. Before you always had to throw in an apology or two. No, we don't have low failover times because of the silly DNS games and unexceptionable DNS update and propagation times and no, we don't operate in more than one datacenter. No more. Now Amazon is adding Elastic IP Addresses and Availability Zones . Elastic IP addresses are far better than normal IP addresses because they are both in tight with Jessica Alba and they are: Static IP addresses designed for dynamic cloud computing. An Elastic IP address is associated with your account, not a particular instance, and you control that address until you choose to explicitly release it. Unlike traditional static IP addresses, however, Elastic IP addresses allow you to mask instance or availability zone failures by programmatica

6 0.1436508 290 high scalability-2008-03-28-How to Get DNS Names of a Web Server

7 0.12481849 576 high scalability-2009-04-21-What CDN would you recommend?

8 0.12039539 796 high scalability-2010-03-16-Justin.tv's Live Video Broadcasting Architecture

9 0.1203413 1359 high scalability-2012-11-15-Gone Fishin': Justin.Tv's Live Video Broadcasting Architecture

10 0.11591496 1557 high scalability-2013-12-02-Evolution of Bazaarvoice’s Architecture to 500M Unique Users Per Month

11 0.1118378 1289 high scalability-2012-07-23-State of the CDN: More Traffic, Stable Prices, More Products, Profits - Not So Much

12 0.10836665 138 high scalability-2007-10-30-Feedblendr Architecture - Using EC2 to Scale

13 0.10335331 382 high scalability-2008-09-09-Content Delivery Networks (CDN) – a comprehensive list of providers

14 0.10329983 788 high scalability-2010-03-04-How MySpace Tested Their Live Site with 1 Million Concurrent Users

15 0.10319521 1335 high scalability-2012-10-08-How UltraDNS Handles Hundreds of Thousands of Zones and Tens of Millions of Records

16 0.09878131 1331 high scalability-2012-10-02-An Epic TripAdvisor Update: Why Not Run on the Cloud? The Grand Experiment.

17 0.098483346 1123 high scalability-2011-09-23-The Real News is Not that Facebook Serves Up 1 Trillion Pages a Month…

18 0.098190509 1257 high scalability-2012-06-05-Sponsored Post: Digital Ocean, NetDNA, Torbit, Velocity, Reality Check Network, Gigaspaces, AiCache, Logic Monitor, Attribution Modeling, AppDynamics, CloudSigma, ManageEnine, Site24x7

19 0.097510666 1272 high scalability-2012-06-26-Sponsored Post: New Relic, Digital Ocean, NetDNA, Torbit, Reality Check Network, Gigaspaces, AiCache, Logic Monitor, AppDynamics, CloudSigma, ManageEnine, Site24x7

20 0.096145809 220 high scalability-2008-01-22-The high scalability community


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.17), (1, 0.044), (2, -0.036), (3, -0.088), (4, -0.048), (5, -0.091), (6, 0.012), (7, -0.025), (8, -0.02), (9, 0.008), (10, -0.019), (11, 0.0), (12, -0.028), (13, -0.054), (14, 0.019), (15, 0.042), (16, 0.074), (17, 0.033), (18, -0.018), (19, -0.069), (20, 0.003), (21, 0.051), (22, 0.019), (23, -0.026), (24, 0.012), (25, 0.029), (26, -0.062), (27, 0.02), (28, -0.029), (29, -0.032), (30, -0.001), (31, 0.041), (32, -0.01), (33, 0.024), (34, 0.042), (35, 0.017), (36, -0.002), (37, -0.029), (38, -0.031), (39, 0.002), (40, 0.025), (41, 0.03), (42, 0.021), (43, -0.001), (44, 0.006), (45, 0.074), (46, -0.015), (47, 0.035), (48, 0.003), (49, 0.029)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97074753 773 high scalability-2010-02-06-GEO-aware traffic load balancing and caching at CNBC.com


2 0.77709275 270 high scalability-2008-03-08-DNS-Record TTL on worst case scenarios

Introduction: I haven't found a good solution for this problem yet: imagine you're responsible for a small CDN network (static images) with two different datacenters. The balancing between the two DCs is done with an anycast name service (a nameserver in every DC; the user gets the nearest location). One scenario is that one of the datacenters goes down completely. You can do monitoring on the nameserver and only route to the DC which is still alive, no problem. But what about the TTL of the DNS records? Tiny TTLs like 2 min. are often ignored by several ISPs (e.g. AOL), so the client doesn't get the IP of the other datacenter. What could be a solution in this scenario?

3 0.77569306 1517 high scalability-2013-09-16-The Hidden DNS Tax - Cascading Timeouts and Errors

Introduction: This is a guest post by Nick Burling, VP of Product Management of BlueStripe. Readers of High Scalability are well versed in performance optimization techniques. Reverse proxies, Varnish, Redis: you hear about them daily. But what you may not realize is that one of the oldest technologies in your stack can be one of your biggest bottlenecks: DNS. People don't spend a lot of time thinking about DNS. It's not sexy. It's an infrastructure service, and it's just supposed to work. At BlueStripe, we work with many teams running applications that support millions of web requests a day. We keep seeing DNS delays and errors that the platform operations team never knows about. It's so common we've started calling it the Hidden DNS Tax. What is the Hidden DNS Tax? The Hidden DNS Tax is a hard-to-see performance hit your users take from DNS timeouts and errors in your back-end architecture. We've seen it bring down the main web application for a Fortune 10 company.

4 0.76012099 573 high scalability-2009-04-16-Serving 250M quotes-day at CNBC.com with aiCache

Introduction: As traffic to cnbc.com continued to grow, we found ourselves in an all-too-familiar situation where one feels that a BIG change in how things are done was in order, the status-quo was a road to nowhere. The spending on HW, amount of space and power required to host additional servers, less-than-stellar response times, having to resort to frequent "micro"-caching and similar tricks to try to improve code performance - all of these were surfacing in plain sight, hard to ignore. While code base could clearly be improved, the limited Dev resources and having to innovate to stay competitive always limits ability to go about refactoring. So how can one go about addressing performance and other needs without a full blown effort across the entire team ? For us, the answer was aiCache - a Web caching and application acceleration product (aicache.com). The idea behind caching is simple - handle the requests before they ever hit your regular Apache<->JK

5 0.7310583 1329 high scalability-2012-09-26-WordPress.com Serves 70,000 req-sec and over 15 Gbit-sec of Traffic using NGINX

Introduction: This is a guest post by Barry Abrahamson, Chief Systems Wrangler at Automattic, and Nginx's co-founder Andrew Alexeev. WordPress.com serves more than 33 million sites attracting over 339 million people and 3.4 billion pages each month. Since April 2008, WordPress.com has experienced about 4.4 times growth in page views. WordPress.com VIP hosts many popular sites including CNN’s Political Ticker, NFL, Time Inc’s The Page, People Magazine’s Style Watch, corporate blogs for Flickr and KROQ, and many more. Automattic operates two thousand servers in twelve globally distributed data centers. WordPress.com customer data is instantly replicated between different locations to provide an extremely reliable and fast web experience for hundreds of millions of visitors. Problem WordPress.com, which began in 2005, started on shared hosting, much like all of the WordPress.org sites. It was soon moved to a single dedicated server and then to two servers. In late 2005, WordPress.com

6 0.72376639 1401 high scalability-2013-02-06-Super Bowl Advertisers Ready for the Traffic? Nope..It's Lights Out.

7 0.69951355 1597 high scalability-2014-02-17-How the AOL.com Architecture Evolved to 99.999% Availability, 8 Million Visitors Per Day, and 200,000 Requests Per Second

8 0.69875675 1335 high scalability-2012-10-08-How UltraDNS Handles Hundreds of Thousands of Zones and Tens of Millions of Records

9 0.68583131 800 high scalability-2010-03-26-Strategy: Caching 404s Saved the Onion 66% on Server Time

10 0.68441635 1267 high scalability-2012-06-18-The Clever Ways Chrome Hides Latency by Anticipating Your Every Need

11 0.68410009 228 high scalability-2008-01-28-Product: ISPMan Centralized ISP Management System

12 0.68360299 1359 high scalability-2012-11-15-Gone Fishin': Justin.Tv's Live Video Broadcasting Architecture

13 0.68335193 796 high scalability-2010-03-16-Justin.tv's Live Video Broadcasting Architecture

14 0.68290591 1587 high scalability-2014-01-29-10 Things Bitly Should Have Monitored

15 0.67999727 138 high scalability-2007-10-30-Feedblendr Architecture - Using EC2 to Scale

16 0.67875159 1392 high scalability-2013-01-23-Building Redundant Datacenter Networks is Not For Sissies - Use an Outside WAN Backbone

17 0.67259169 788 high scalability-2010-03-04-How MySpace Tested Their Live Site with 1 Million Concurrent Users

18 0.67181969 290 high scalability-2008-03-28-How to Get DNS Names of a Web Server

19 0.66850978 987 high scalability-2011-02-10-Dispelling the New SSL Myth

20 0.66536736 533 high scalability-2009-03-11-The Implications of Punctuated Scalabilium for Website Architecture


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.127), (2, 0.153), (10, 0.041), (30, 0.077), (32, 0.012), (47, 0.029), (56, 0.025), (61, 0.065), (77, 0.02), (79, 0.089), (81, 0.191), (85, 0.016), (93, 0.032), (94, 0.041)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.88973051 540 high scalability-2009-03-16-Cisco and Sun to Compete for Unified Computing?

Introduction: A recent InfoWorld article claims that "With Cisco expected to enter the blade market and Sun expected to offer networking equipment, things could get interesting awfully fast." How does this affect your infrastructure strategy and decisions? Would you consider building scalable web applications on the Cisco Unified Computing System? Or would you consider building a router out of a server with the use of OpenSolaris and Project Crossbow, as the article suggests? Will any of these initiatives change the way we build scalable web infrastructure, or are these just attempts to sell these systems? What do you think?

same-blog 2 0.8702246 773 high scalability-2010-02-06-GEO-aware traffic load balancing and caching at CNBC.com


3 0.831056 19 high scalability-2007-07-16-Paper: Replication Under Scalable Hashing

Introduction: Replication Under Scalable Hashing: A Family of Algorithms for Scalable Decentralized Data Distribution From the abstract: Typical algorithms for decentralized data distribution work best in a system that is fully built before it is first used; adding or removing components results in either extensive reorganization of data or load imbalance in the system. We have developed a family of decentralized algorithms, RUSH (Replication Under Scalable Hashing), that maps replicated objects to a scalable collection of storage servers or disks. RUSH algorithms distribute objects to servers according to user-specified server weighting. While all RUSH variants support addition of servers to the system, different variants have different characteristics with respect to lookup time in petabyte-scale systems, performance with mirroring (as opposed to redundancy codes), and storage server removal. All RUSH variants redistribute as few objects as possible when new servers are added or existing servers

4 0.80212069 100 high scalability-2007-09-26-Use a CDN to Instantly Improve Your Website's Performance by 20% or More

Introduction: If you have a lot of static content to store and you aren't looking forward to setting up and maintaining your own giganto SAN, maybe you can push off a lot of the heavy lifting to a CDN? Jesse Robbins at O'Reilly Radar posts that you have a lot more options now because the number of Content Distribution Networks has doubled since last year. In fact, Dan Rayburn says there are now 28 CDN providers in the market. Hopefully you can find reasonable pricing at one of them. Other than easing your burden, why might a CDN work for you? Because it makes your site faster and customers like that. How can a CDN so dramatically improve your site's performance? Steve Souders, author of High Performance Web Sites: Essential Knowledge for Front-End Engineers, has using a CDN as one of his "Thirteen Simple Rules for Speeding Up Your Web Site." About CDNs Steve says: Remember that 80-90% of the end-user response time is spent downloading all the components in

5 0.79734021 133 high scalability-2007-10-26-How Gravatar scales on WordPress.com hardware

Introduction: Automattic recently purchased Gravatar and has switched the service onto its hosting platform. WordPress.com hosts over 1.7 million blogs, with well over 60'000 new posts submitted each day generating 10 - 12 million page views per day. Barry on WordPress.com has a great post on the changes they've introduced to help Gravatar scale.

6 0.79163712 1375 high scalability-2012-12-21-Stuff The Internet Says On Scalability For December 21, 2012

7 0.77821553 985 high scalability-2011-02-08-Mollom Architecture - Killing Over 373 Million Spams at 100 Requests Per Second

8 0.77577257 1575 high scalability-2014-01-08-Under Snowden's Light Software Architecture Choices Become Murky

9 0.76772398 1618 high scalability-2014-03-24-Big, Small, Hot or Cold - Examples of Robust Data Pipelines from Stripe, Tapad, Etsy and Square

10 0.76715851 195 high scalability-2007-12-28-Amazon's EC2: Pay as You Grow Could Cut Your Costs in Half

11 0.75872606 1330 high scalability-2012-09-28-Stuff The Internet Says On Scalability For September 28, 2012

12 0.75772101 1450 high scalability-2013-05-01-Myth: Eric Brewer on Why Banks are BASE Not ACID - Availability Is Revenue

13 0.75678152 1109 high scalability-2011-09-02-Stuff The Internet Says On Scalability For September 2, 2011

14 0.75588894 576 high scalability-2009-04-21-What CDN would you recommend?

15 0.75503683 573 high scalability-2009-04-16-Serving 250M quotes-day at CNBC.com with aiCache

16 0.75415421 1011 high scalability-2011-03-25-Did the Microsoft Stack Kill MySpace?

17 0.75369471 881 high scalability-2010-08-16-Scaling an AWS infrastructure - Tools and Patterns

18 0.75367647 851 high scalability-2010-07-02-Hot Scalability Links for July 2, 2010

19 0.75364083 1476 high scalability-2013-06-14-Stuff The Internet Says On Scalability For June 14, 2013

20 0.75344503 857 high scalability-2010-07-13-DbShards Part Deux - The Internals