high_scalability high_scalability-2007 high_scalability-2007-21 knowledge-graph by maker-knowledge-mining

21 high scalability-2007-07-23-GoogleTalk Architecture


meta infos for this blog

Source: html

Introduction: Google Talk is Google's instant communications service. Interestingly, the IM messages aren't the major architectural challenge; handling user presence indications dominates the design. They also have the challenge of handling small, low-latency messages and integrating with many other systems. How do they do it? Site: http://www.google.com/talk Information Sources GoogleTalk Architecture Platform Linux Java Google Stack Shard What's Inside? The Stats Support presence and messages for millions of users. Handles billions of packets per day in under 100ms. IM is different from many other applications because the requests are small packets. Routing and application logic are applied per packet for sender and receiver. Messages must be delivered in order. Architecture extends to new clients and Google services. Lessons Learned Measure the right thing. - People ask how many IMs you deliver or how many active users you have. Turns out not
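
The introduction states that messages must be delivered in order while routing is applied per packet. One common way to get that property is a per-conversation reorder buffer keyed by sequence number. The sketch below is only an illustration of that general technique, not a description of how Google Talk does it; the class name `InOrderReceiver` and the assumption that each packet carries a sequence number starting at 0 are hypothetical.

```python
from typing import Dict, List


class InOrderReceiver:
    """Releases messages for one sender/receiver pair in sequence order.

    Hypothetical helper: assumes every packet carries a sequence number
    that starts at 0 and increases by 1 per message.
    """

    def __init__(self) -> None:
        self.next_seq = 0                   # next sequence number to release
        self.pending: Dict[int, str] = {}   # out-of-order messages held back

    def receive(self, seq: int, body: str) -> List[str]:
        """Accept one packet; return the messages that are now deliverable."""
        if seq < self.next_seq or seq in self.pending:
            return []                       # duplicate packet, ignore it
        self.pending[seq] = body
        ready: List[str] = []
        # Drain every consecutive message starting at next_seq.
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

A packet that arrives early is held until the gap before it is filled, so the receiver never sees message 1 before message 0. The cost is one small buffer per conversation.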


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Interestingly, the IM messages aren't the major architectural challenge; handling user presence indications dominates the design. [sent-2, score-0.787]

2 They also have the challenge of handling small, low-latency messages and integrating with many other systems. [sent-3, score-0.38]

3 The Stats Support presence and messages for millions of users. [sent-8, score-0.42]

4 Handles billions of packets per day in under 100ms. [sent-9, score-0.214]

5 Architecture extends to new clients and Google services. [sent-13, score-0.201]

6 - People ask how many IMs you deliver or how many active users you have. [sent-15, score-0.198]

7 - Hard part of IM is how to show correct presence to all connected users because growth is non-linear: ConnectedUsers * BuddyListSize * OnlineStateChanges - Linear user growth can mean very non-linear server growth, which requires serving many billions of presence packets per day. [sent-17, score-0.61]

8 - Have a large number of friends and presence explodes. [sent-18, score-0.379]

9 - Simulate presence requests and going on-line and off-line for weeks and months, even if real data is not returned. [sent-22, score-0.297]

10 - Google Talk backend servers handle traffic for a subset of users. [sent-25, score-0.189]

11 - Servers can bring down servers and backups take over. [sent-29, score-0.192]

12 Then you can bring up new servers, data is migrated automatically, and clients auto-detect and go to the new servers. [sent-30, score-0.396]

13 Add Abstractions to Hide System Complexity - Different systems should have little knowledge of each other, especially when separate groups are working together. [sent-31, score-0.187]

14 Can change at any time without cascading changes throughout the system. [sent-33, score-0.225]

15 This has architectural implications, as you can have separate systems failing independently. [sent-40, score-0.202]

16 - What happens when servers restart with an empty cache? [sent-45, score-0.269]

17 - Add the ability to profile live servers without impacting the server. [sent-56, score-0.19]

18 - Log end-to-end so you can reconstruct an entire operation from beginning to end across all machines. [sent-60, score-0.225]

19 Software Development Strategies - Make sure binaries are both backward and forward compatible so you can have old clients work with new code. [sent-61, score-0.31]

20 This is very different from many companies that have completely separate OP teams in their data centers. [sent-65, score-0.202]
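
Sentence 7 above gives the presence load as ConnectedUsers * BuddyListSize * OnlineStateChanges. A short worked calculation shows why linear user growth can produce non-linear server load. The input figures below are made-up illustrations, not numbers from the source, and the function name `presence_packets_per_day` is hypothetical.

```python
def presence_packets_per_day(connected_users: int,
                             buddy_list_size: int,
                             state_changes_per_day: int) -> int:
    """Presence load from the formula quoted in sentence 7:
    ConnectedUsers * BuddyListSize * OnlineStateChanges."""
    return connected_users * buddy_list_size * state_changes_per_day


# Illustrative inputs only (assumptions, not measured values).
small = presence_packets_per_day(1_000_000, 20, 4)
# Users double, and average buddy lists also double as the network fills in.
large = presence_packets_per_day(2_000_000, 40, 4)

print(small)           # 80000000
print(large)           # 320000000
print(large // small)  # 4
```

Because the factors multiply, doubling the user count while buddy lists also double quadruples the presence traffic, even though the number of online state changes per user stays the same.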


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('presence', 0.297), ('im', 0.221), ('ims', 0.193), ('abstracted', 0.17), ('retry', 0.153), ('cascading', 0.143), ('rpc', 0.143), ('clients', 0.126), ('operation', 0.125), ('messages', 0.123), ('packets', 0.115), ('servers', 0.112), ('indications', 0.112), ('infect', 0.112), ('separate', 0.103), ('epoll', 0.1), ('reconstruct', 0.1), ('billions', 0.099), ('many', 0.099), ('architectural', 0.099), ('binaries', 0.097), ('google', 0.095), ('gateways', 0.094), ('op', 0.094), ('spoke', 0.094), ('experimentation', 0.089), ('backward', 0.087), ('sender', 0.085), ('challenge', 0.084), ('knowledge', 0.084), ('shifts', 0.084), ('smooth', 0.084), ('number', 0.082), ('anytime', 0.082), ('dominate', 0.082), ('happens', 0.081), ('bring', 0.08), ('complexities', 0.08), ('connections', 0.079), ('impacting', 0.078), ('migrated', 0.078), ('logic', 0.078), ('lab', 0.077), ('backend', 0.077), ('continual', 0.076), ('empty', 0.076), ('everything', 0.076), ('extends', 0.075), ('interestingly', 0.074), ('handling', 0.074)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000004 21 high scalability-2007-07-23-GoogleTalk Architecture

2 0.157626 318 high scalability-2008-05-14-New Facebook Chat Feature Scales to 70 Million Users Using Erlang

Introduction: Update :  Erlang at Facebook  by Eugene Letuchy. How Facebook uses Erlang to implement Chat, AIM Presence, and Chat Jabber support.  I've done some XMPP development so when I read Facebook was making a Jabber chat client I was really curious how they would make it work. While core XMPP is straightforward, a number of protocol extensions like discovery, forms, chat states, pubsub, multi user chat, and privacy lists really up the implementation complexity. Some real engineering challenges were involved to make this puppy scale and perform. It's not clear what extensions they've implemented, but a blog entry by Facebook's Eugene Letuchy hits some of the architectural challenges they faced and how they overcame them. A web based Jabber client poses a few problems because XMPP, like most IM protocols, is an asynchronous event driven system that pretty much assumes you have a full time open connection. After logging in the server sends a client roster information and presence info

3 0.15525456 672 high scalability-2009-08-06-An Unorthodox Approach to Database Design : The Coming of the Shard

Introduction: Update 4: Why you don’t want to shard. by Morgon on the MySQL Performance Blog. Optimize everything else first, and then if performance still isn’t good enough, it’s time to take a very bitter medicine. Update 3: Building Scalable Databases: Pros and Cons of Various Database Sharding Schemes by Dare Obasanjo. Excellent discussion of why and when you would choose a sharding architecture, how to shard, and problems with sharding. Update 2: Mr. Moore gets to punt on sharding by Alan Rimm-Kaufman of 37signals. Insightful article on design tradeoffs and the evils of premature optimization. With more memory, more CPU, and new tech like SSD, problems can be avoided before more exotic architectures like sharding are needed. Add features not infrastructure. Jeremy Zawodny says he's wrong wrong wrong. we're running multi-core CPUs at slower clock speeds. Moore won't save you. Update: Dan Pritchett shares some excellent Sharding Lessons : Size Your Shards, Use Math on Shard C

4 0.1491465 1456 high scalability-2013-05-13-The Secret to 10 Million Concurrent Connections -The Kernel is the Problem, Not the Solution

Introduction: Now that we have the C10K concurrent connection problem licked, how do we level up and support 10 million concurrent connections? Impossible you say. Nope, systems right now are delivering 10 million concurrent connections using techniques that are as radical as they may be unfamiliar. To learn how it’s done we turn to Robert Graham , CEO of Errata Security, and his absolutely fantastic talk at Shmoocon 2013 called C10M Defending The Internet At Scale . Robert has a brilliant way of framing the problem that I’ve never heard of before. He starts with a little bit of history, relating how Unix wasn’t originally designed to be a general server OS, it was designed to be a control system for a telephone network. It was the telephone network that actually transported the data so there was a clean separation between the control plane and the data plane. The problem is we now use Unix servers as part of the data plane , which we shouldn’t do at all. If we were des

5 0.13968833 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT

Introduction: It remains that, from the same principles, I now demonstrate the frame of the System of the World. -- Isaac Newton The practice of IT reminds me a lot of the practice of science before Isaac Newton. Aristotelianism was dead, but there was nothing to replace it. Then Newton came along, created a scientific revolution with his System of the World . And everything changed. That was New System of the World number one. New System of the World number two was written about by the incomparable Neal Stephenson in his incredible  Baroque Cycle  series. It explores the singular creation of a new way of organizing society grounded in new modes of thought in business, religion, politics, and science. Our modern world emerged Enlightened as it could from this roiling cauldron of forces. In IT we may have had a Leonardo da Vinci or even a Galileo, but we’ve never had our Newton. Maybe we don't need a towering genius to make everything clear? For years startups, like the frenetically inventive

6 0.13667223 448 high scalability-2008-11-22-Google Architecture

7 0.1364273 1573 high scalability-2014-01-06-How HipChat Stores and Indexes Billions of Messages Using ElasticSearch and Redis

8 0.13600057 1042 high scalability-2011-05-17-Facebook: An Example Canonical Architecture for Scaling Billions of Messages

9 0.13527139 1622 high scalability-2014-03-31-How WhatsApp Grew to Nearly 500 Million Users, 11,000 cores, and 70 Million Messages a Second

10 0.13486369 1421 high scalability-2013-03-11-Low Level Scalability Solutions - The Conditioning Collection

11 0.13467631 1501 high scalability-2013-08-13-In Memoriam: Lavabit Architecture - Creating a Scalable Email Service

12 0.13337442 234 high scalability-2008-01-30-The AOL XMPP scalability challenge

13 0.13013718 920 high scalability-2010-10-15-Troubles with Sharding - What can we learn from the Foursquare Incident?

14 0.12983441 1359 high scalability-2012-11-15-Gone Fishin': Justin.Tv's Live Video Broadcasting Architecture

15 0.12967084 796 high scalability-2010-03-16-Justin.tv's Live Video Broadcasting Architecture

16 0.12925628 431 high scalability-2008-10-27-Notify.me Architecture - Synchronicity Kills

17 0.12799279 152 high scalability-2007-11-13-Flickr Architecture

18 0.12690686 906 high scalability-2010-09-22-Applying Scalability Patterns to Infrastructure Architecture

19 0.12639505 1413 high scalability-2013-02-27-42 Monster Problems that Attack as Loads Increase

20 0.12538187 1440 high scalability-2013-04-15-Scaling Pinterest - From 0 to 10s of Billions of Page Views a Month in Two Years


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.233), (1, 0.1), (2, 0.014), (3, -0.092), (4, 0.006), (5, -0.036), (6, 0.058), (7, 0.043), (8, -0.044), (9, -0.011), (10, 0.006), (11, 0.082), (12, -0.035), (13, -0.023), (14, 0.026), (15, 0.066), (16, 0.025), (17, 0.0), (18, 0.013), (19, 0.038), (20, 0.06), (21, -0.001), (22, -0.005), (23, -0.056), (24, 0.031), (25, 0.023), (26, 0.008), (27, 0.028), (28, -0.028), (29, 0.025), (30, 0.026), (31, -0.033), (32, -0.005), (33, -0.039), (34, 0.026), (35, 0.018), (36, 0.019), (37, 0.01), (38, 0.008), (39, 0.04), (40, 0.05), (41, 0.024), (42, 0.062), (43, -0.036), (44, -0.08), (45, -0.068), (46, -0.046), (47, -0.001), (48, -0.018), (49, -0.044)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95896065 21 high scalability-2007-07-23-GoogleTalk Architecture

2 0.80393398 431 high scalability-2008-10-27-Notify.me Architecture - Synchronicity Kills

Introduction: What's cool about starting a new project is you finally have a chance to do it right. You of course eventually mess everything up in your own way, but for that one moment the world has a perfect order, a rightness that feels satisfying and good. Arne Claassen, the CTO of notify.me, a brand new real time notification delivery service, is in this honeymoon period now. Arne has been gracious enough to share with us his philosophy of how to build a notification service. I think you'll find it fascinating because Arne goes into a lot of useful detail about how his system works. His main design philosophy is to minimize the bottlenecks that form around synchronous access, that is when some resource is requested and the requestor ties up more resources, waiting for a response. If the requested resource can’t be delivered in a timely manner, more and more requests pile up until the server can’t accept any new ones. Nobody gets what they want and you have an outage. Breaking synchronous op

3 0.80381989 1622 high scalability-2014-03-31-How WhatsApp Grew to Nearly 500 Million Users, 11,000 cores, and 70 Million Messages a Second

Introduction: When we last visited WhatsApp they’d just been acquired by Facebook for $19 billion. We learned about their early architecture, which centered around a maniacal focus on optimizing Erlang into handling 2 million connections a server, working on All The Phones, and making users happy through simplicity. Two years later traffic has grown 10x. How did WhatsApp make that jump to the next level of scalability? Rick Reed tells us in a talk he gave at the Erlang Factory: That's 'Billion' with a 'B': Scaling to the next level at WhatsApp ( slides ), which revealed some eye popping WhatsApp stats: What has hundreds of nodes, thousands of cores, hundreds of terabytes of RAM, and hopes to serve the billions of smartphones that will soon be a reality around the globe? The Erlang/FreeBSD-based server infrastructure at WhatsApp. We've faced many challenges in meeting the ever-growing demand for our messaging services, but as we continue to push the envelope on size (>800

4 0.78936291 1602 high scalability-2014-02-26-The WhatsApp Architecture Facebook Bought For $19 Billion

Introduction: Rick Reed in an upcoming talk in March titled That's 'Billion' with a 'B': Scaling to the next level at WhatsApp reveals some eye popping WhatsApp stats: What has hundreds of nodes, thousands of cores, hundreds of terabytes of RAM, and hopes to serve the billions of smartphones that will soon be a reality around the globe? The Erlang/FreeBSD-based server infrastructure at WhatsApp. We've faced many challenges in meeting the ever-growing demand for our messaging services, but as we continue to push the envelope on size (>8000 cores) and speed (>70M Erlang messages per second) of our serving system. But since we don't have that talk yet, let's take a look at a talk Rick Reed gave two years ago on WhatsApp: Scaling to Millions of Simultaneous Connections. Having built a high performance messaging bus in C++ while at Yahoo, Rick Reed is not new to the world of high scalability architectures. The founders are also ex-Yahoo guys with not a little experience scaling systems. So WhatsApp comes by thei

5 0.78897703 205 high scalability-2008-01-10-Letting Clients Know What's Changed: Push Me or Pull Me?

Introduction: I had a false belief I thought I came here to stay We're all just visiting All just breaking like waves The oceans made me, but who came up with me? Push me, pull me, push me, or pull me out. So true Perl Jam (Push me Pull me lyrics), so true. I too have wondered how web clients should be notified of model changes. Should servers push events to clients or should clients pull events from servers? A topic worthy of its own song if ever there was one. To pull events the client simply starts a timer and makes a request to the server. This is polling. You can either pull a complete set of fresh data or get a list of changes. The server "knows" if anything you are interested in has changed and makes those changes available to you. Knowing what has changed can be relatively simple with a publish-subscribe type backend or you can get very complex with fine grained bit maps of attributes and keeping per client state on what a client still needs to see. Polling is heavy man.

6 0.78088301 1501 high scalability-2013-08-13-In Memoriam: Lavabit Architecture - Creating a Scalable Email Service

7 0.76875651 318 high scalability-2008-05-14-New Facebook Chat Feature Scales to 70 Million Users Using Erlang

8 0.76212507 1573 high scalability-2014-01-06-How HipChat Stores and Indexes Billions of Messages Using ElasticSearch and Redis

9 0.76023036 1638 high scalability-2014-04-28-How Disqus Went Realtime with 165K Messages Per Second and Less than .2 Seconds Latency

10 0.75678241 1421 high scalability-2013-03-11-Low Level Scalability Solutions - The Conditioning Collection

11 0.75536901 985 high scalability-2011-02-08-Mollom Architecture - Killing Over 373 Million Spams at 100 Requests Per Second

12 0.74525726 406 high scalability-2008-10-08-Strategy: Flickr - Do the Essential Work Up-front and Queue the Rest

13 0.74322063 1312 high scalability-2012-08-27-Zoosk - The Engineering behind Real Time Communications

14 0.74186337 76 high scalability-2007-08-29-Skype Failed the Boot Scalability Test: Is P2P fundamentally flawed?

15 0.73290664 1372 high scalability-2012-12-14-Stuff The Internet Says On Scalability For December 14, 2012

16 0.73258221 1413 high scalability-2013-02-27-42 Monster Problems that Attack as Loads Increase

17 0.72476232 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App

18 0.7244184 728 high scalability-2009-10-26-Facebook's Memcached Multiget Hole: More machines != More Capacity

19 0.72341853 1271 high scalability-2012-06-25-StubHub Architecture: The Surprising Complexity Behind the World’s Largest Ticket Marketplace

20 0.72066057 661 high scalability-2009-07-25-Latency is Everywhere and it Costs You Sales - How to Crush it


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.113), (2, 0.23), (10, 0.091), (33, 0.164), (40, 0.011), (61, 0.107), (73, 0.02), (79, 0.147), (85, 0.018), (94, 0.026)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.93883187 1060 high scalability-2011-06-14-Shakespeare on Why Other People Like Such Stupid Stuff

Introduction: Jumping around the social mediasphere, it's not uncommon to feel the heat generated in praise of a favorite this or that over all the clearly inferior alternatives. Whilst human nature may never cool, I think Old Will had some insight worth considering the next time a flame threatens to flicker forth: My mistress' eyes are nothing like the sun (Sonnet 130) My mistress' eyes are nothing like the sun; Coral is far more red than her lips' red ; If snow be white, why then her breasts are dun; If hairs be wires, black wires grow on her head. I have seen roses damask, red and white, But no such roses see I in her cheeks; And in some perfumes is there more delight Than in the breath that from my mistress reeks. I love to hear her speak, yet well I know That music hath a far more pleasing sound; I grant I never saw a goddess go; My mistress, when she walks, treads on the ground: And yet, by heaven, I think my love as rare As any she belied with false compare. –Willia

2 0.92955101 1315 high scalability-2012-08-30-Stuff The Internet Says On Scalability For August 31, 2012

Introduction: It's HighScalability Time: LHC compute jobs: use 1.5 CPU millennia every 3 days ; Obama helps load test Reddit: 4.3 million page views Quotable Quotes: @secastro : Want to see nearly a terabyte of memory? @DZone : Apache Projects are the Justice League of Scalability Apple And Google Might Be Negotiating Patents . Remember when empires and nation states would have a nice little summer war and then negotiate boundary lines and terms of trade? Google Faculty Summit 2012: The Online Revolution - Education at Scale . Someday we may have direct knowledge downloads and augmented wisdom packs, but until then these primitive attempts at learning process improvement are a good start.  OnLive lost: how the paradise of streaming games was undone by one man's ego . The fascinating story behind a radical idea: applications hosted and rendered in the cloud while being displayed remotely on a device. You might have thought latency would be the killer

3 0.9295131 686 high scalability-2009-08-20-VMware to bridge a DMZ.

Introduction: Hey guys, There is a renewed push at my organization to deploy vmware...everywhere. I am rather excited as I know we have a lot of waste when it comes to resources. What has pricked my ears up however, is the notion of using this technology in our very busy public facing DMZ's. Today we get lots of spikes of traffic and we are coping very well. 40x HP blades, apache/php/perl/tomcat/ all in HA behind HA F5's and HA Checkpoint FW's. (20 servers in 2 datacentres). The idea is, we virtualise these machines, including the firewalls onto hosts vmware clusters that span the public interface to our internal networks. This is something that has gone against the #1 rule I have ever lived by while working on the inet. No airgaps from the unknown to the known! I am interested in feedback on this scenario. From a resource perspective, our resource requirements in the DMZ will be lowered over time due to business change and we still have a lot of head room in our capacity. Do you think t

4 0.92153221 1067 high scalability-2011-06-24-Stuff The Internet Says On Scalability For June 24, 2011

Introduction: Submitted for your scaling pleasure:  Achievements: Watson uses 10,000's of watts, the computer between the ears uses 20. With only 200 million pages and 2TB of data, Watson is BigInsights, not BigData. That Google is pretty big: 1 billion unique monthly visitors tweetimages : We peaked at 22m avatars yesterday. Bandwidth peaked at 9GB of @twitter avatars in a single hour. Foursquare Surpasses 10 Million Users  Reddit Hits 1.2B Monthly Pageviews, More Than Doubles Its Engineering Staff Twitter : 185 million tweets are posted daily;  1.6 billion search queries daily; indexing latency is less than 10 seconds.  Quotable quotes: skr : OH: "people wait their whole lives for a situation where they can use bloom filters" joeweinman : @Werner at #structureconf : as of Nov 10, 2010, all Amazon.com traffic was served from AWS. <-- The child surpasses the parent. bbatsov : A compiled language does not scalability mak

same-blog 5 0.92044646 21 high scalability-2007-07-23-GoogleTalk Architecture

6 0.91232502 619 high scalability-2009-06-05-HotPads Shows the True Cost of Hosting on Amazon

7 0.90740943 1490 high scalability-2013-07-12-Stuff The Internet Says On Scalability For July 12, 2013

8 0.89375412 1520 high scalability-2013-09-20-Stuff The Internet Says On Scalability For September 20, 2013

9 0.8891319 1470 high scalability-2013-06-05-A Simple 6 Step Transition Guide for Moving Away from X to AWS

10 0.88422251 724 high scalability-2009-10-19-Drupal's Scalability Makeover - You give up some control and you get back scalability

11 0.88176394 780 high scalability-2010-02-19-Twitter’s Plan to Analyze 100 Billion Tweets

12 0.87951446 1112 high scalability-2011-09-07-What Google App Engine Price Changes Say About the Future of Web Architecture

13 0.87929463 498 high scalability-2009-01-20-Product: Amazon's SimpleDB

14 0.87761676 1159 high scalability-2011-12-19-How Twitter Stores 250 Million Tweets a Day Using MySQL

15 0.87684059 1186 high scalability-2012-02-02-The Data-Scope Project - 6PB storage, 500GBytes-sec sequential IO, 20M IOPS, 130TFlops

16 0.87646204 736 high scalability-2009-11-04-Damn, Which Database do I Use Now?

17 0.87626481 709 high scalability-2009-09-19-Space Based Programming in .NET

18 0.87575358 1649 high scalability-2014-05-16-Stuff The Internet Says On Scalability For May 16th, 2014

19 0.87465513 1041 high scalability-2011-05-15-Building a Database remote availability site

20 0.87390476 1626 high scalability-2014-04-04-Stuff The Internet Says On Scalability For April 4th, 2014