high_scalability high_scalability-2008 high_scalability-2008-234 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Large-scale distributed instant messaging and presence-based protocols are a real challenge. With big players adopting the standard, the XMPP (eXtensible Messaging and Presence Protocol) community is facing the need to validate the protocol and its implementations at even larger scale.
sentIndex sentText sentNum sentScore
1 Large-scale distributed instant messaging and presence-based protocols are a real challenge. [sent-1, score-1.339]
2 With big players adopting the standard, the XMPP (eXtensible Messaging and Presence Protocol) community is facing the need to validate the protocol and its implementations at even larger scale. [sent-2, score-2.033]
wordName wordTfidf (topN-words)
[('protocol', 0.469), ('presence', 0.396), ('messaging', 0.305), ('validate', 0.297), ('xmpp', 0.297), ('adopting', 0.258), ('extensible', 0.238), ('players', 0.202), ('implementations', 0.199), ('facing', 0.185), ('instant', 0.174), ('community', 0.13), ('standard', 0.128), ('larger', 0.124), ('scale', 0.105), ('real', 0.076), ('big', 0.065), ('based', 0.062), ('even', 0.06), ('large', 0.058), ('distributed', 0.057), ('need', 0.044)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 234 high scalability-2008-01-30-The AOL XMPP scalability challenge
Introduction: Large-scale distributed instant messaging and presence-based protocols are a real challenge. With big players adopting the standard, the XMPP (eXtensible Messaging and Presence Protocol) community is facing the need to validate the protocol and its implementations at even larger scale.
2 0.18835765 318 high scalability-2008-05-14-New Facebook Chat Feature Scales to 70 Million Users Using Erlang
Introduction: Update: Erlang at Facebook by Eugene Letuchy. How Facebook uses Erlang to implement Chat, AIM Presence, and Chat Jabber support. I've done some XMPP development, so when I read Facebook was making a Jabber chat client I was really curious how they would make it work. While core XMPP is straightforward, a number of protocol extensions like discovery, forms, chat states, pubsub, multi-user chat, and privacy lists really up the implementation complexity. Some real engineering challenges were involved to make this puppy scale and perform. It's not clear what extensions they've implemented, but a blog entry by Facebook's Eugene Letuchy hits some of the architectural challenges they faced and how they overcame them. A web-based Jabber client poses a few problems because XMPP, like most IM protocols, is an asynchronous event-driven system that pretty much assumes you have a full-time open connection. After logging in the server sends a client roster information and presence info
3 0.17069232 780 high scalability-2010-02-19-Twitter’s Plan to Analyze 100 Billion Tweets
Introduction: If Twitter is the “nervous system of the web” as some people think, then what is the brain that makes sense of all those signals (tweets) from the nervous system? That brain is the Twitter Analytics System and Kevin Weil, as Analytics Lead at Twitter, is the homunculus within, in charge of figuring out what those over 100 billion tweets (approximately the number of neurons in the human brain) mean. Twitter has only 10% of the expected 100 billion tweets now, but a good brain always plans ahead. Kevin gave a talk, Hadoop and Protocol Buffers at Twitter, at the Hadoop Meetup, explaining how Twitter plans to use all that data to answer key business questions. What type of questions is Twitter interested in answering? Questions that help them better understand Twitter. Questions like: How many requests do we serve in a day? What is the average latency? How many searches happen in a day? How many unique queries, how many unique users, what is their geographic dist
4 0.14404686 431 high scalability-2008-10-27-Notify.me Architecture - Synchronicity Kills
Introduction: What's cool about starting a new project is you finally have a chance to do it right. You of course eventually mess everything up in your own way, but for that one moment the world has a perfect order, a rightness that feels satisfying and good. Arne Claassen, the CTO of notify.me, a brand new real-time notification delivery service, is in this honeymoon period now. Arne has been gracious enough to share with us his philosophy of how to build a notification service. I think you'll find it fascinating because Arne goes into a lot of useful detail about how his system works. His main design philosophy is to minimize the bottlenecks that form around synchronous access, that is, when some resource is requested and the requestor ties up more resources while waiting for a response. If the requested resource can’t be delivered in a timely manner, more and more requests pile up until the server can’t accept any new ones. Nobody gets what they want and you have an outage. Breaking synchronous op
5 0.14172739 1243 high scalability-2012-05-10-Paper: Paxos Made Moderately Complex
Introduction: If you are a normal human being and find the Paxos protocol confusing, then this paper, Paxos Made Moderately Complex, is a great find. Robbert van Renesse from Cornell University has written a clear, well-written paper with excellent explanations. The Abstract: For anybody who has ever tried to implement it, Paxos is by no means a simple protocol, even though it is based on relatively simple invariants. This paper provides imperative pseudo-code for the full Paxos (or Multi-Paxos) protocol without shying away from discussing various implementation details. The initial description avoids optimizations that complicate comprehension. Next we discuss liveness, and list various optimizations that make the protocol practical. Related Articles: Paxos on HighScalability.com
6 0.13337442 21 high scalability-2007-07-23-GoogleTalk Architecture
7 0.11493814 1312 high scalability-2012-08-27-Zoosk - The Engineering behind Real Time Communications
8 0.11080667 1573 high scalability-2014-01-06-How HipChat Stores and Indexes Billions of Messages Using ElasticSearch and Redis
9 0.10604934 1459 high scalability-2013-05-16-Paper: Warp: Multi-Key Transactions for Key-Value Stores
10 0.093662381 1142 high scalability-2011-11-14-Using Gossip Protocols for Failure Detection, Monitoring, Messaging and Other Good Things
11 0.088192157 960 high scalability-2010-12-20-Netflix: Use Less Chatty Protocols in the Cloud - Plus 26 Fixes
12 0.084244587 469 high scalability-2008-12-17-Scalability Strategies Primer: Database Sharding
13 0.081585243 122 high scalability-2007-10-14-Product: The Spread Toolkit
15 0.078641161 1446 high scalability-2013-04-25-Paper: Making reliable distributed systems in the presence of software errors
17 0.07357914 1184 high scalability-2012-01-31-Performance in the Cloud: Business Jitter is Bad
19 0.069077782 70 high scalability-2007-08-22-How many machines do you need to run your site?
20 0.066823989 212 high scalability-2008-01-14-OpenSpaces.org community site launched - framework for building scale-out applications
topicId topicWeight
[(0, 0.058), (1, 0.019), (2, 0.003), (3, 0.028), (4, 0.029), (5, -0.006), (6, 0.017), (7, 0.013), (8, -0.027), (9, 0.011), (10, 0.033), (11, 0.049), (12, -0.014), (13, -0.0), (14, 0.017), (15, -0.0), (16, 0.03), (17, 0.028), (18, 0.015), (19, -0.033), (20, 0.013), (21, 0.024), (22, 0.002), (23, 0.015), (24, 0.006), (25, -0.011), (26, 0.064), (27, -0.021), (28, -0.005), (29, -0.031), (30, 0.03), (31, -0.013), (32, -0.082), (33, -0.026), (34, 0.015), (35, -0.013), (36, 0.007), (37, -0.009), (38, 0.002), (39, -0.024), (40, 0.03), (41, 0.022), (42, 0.034), (43, 0.004), (44, -0.031), (45, -0.047), (46, -0.036), (47, 0.025), (48, -0.067), (49, -0.067)]
simIndex simValue blogId blogTitle
same-blog 1 0.94073915 234 high scalability-2008-01-30-The AOL XMPP scalability challenge
Introduction: Large-scale distributed instant messaging and presence-based protocols are a real challenge. With big players adopting the standard, the XMPP (eXtensible Messaging and Presence Protocol) community is facing the need to validate the protocol and its implementations at even larger scale.
Introduction: So how do you knit multiple datacenters and many thousands of phones and other clients into a single cooperating system? Usually you don't. It's too hard. We see nascent attempts in services like Firebase and Parse. SwiftCloud, as described in SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine, goes two steps further by leveraging Conflict-free Replicated Data Types (CRDTs), which means "data can be replicated at multiple sites and be updated independently with the guarantee that all replicas converge to the same value. In a cloud environment, this allows a user to access the data center closer to the user, thus optimizing the latency for all users." While we don't see these kinds of systems just yet, they are a strong candidate for how things will work in the future, efficiently using resources at every level while supporting huge numbers of cooperating users. Abstract: Client-side logic and storage are increasingly used in web a
3 0.49669921 1153 high scalability-2011-12-08-Update on Scalable Causal Consistency For Wide-Area Storage With COPS
Introduction: Here are a few updates on the article Paper: Don’t Settle For Eventual: Scalable Causal Consistency For Wide-Area Storage With COPS from Mike Freedman and Wyatt Lloyd. Q: How could software architectures change in response to causal+ consistency? A: I don't really think they would much. Somebody would still run a two-tier architecture in their datacenter: a front tier of webservers running both (say) PHP and our client library, and a back tier of storage nodes running COPS. (I'm not sure if it was obvious given the discussion of our "thick" client -- you should think of the COPS client dropping in where a memcache client library does...albeit ours has per-session state.) Q: Why not just use vector clocks? A: The problem with vector clocks and scalability has always been that the size of vector clocks is O(N), where N is the number of nodes. So if we want to scale to a datacenter with 10K nodes, each piece of metadata must have size O(10K). And in fact, vector
4 0.49608028 1243 high scalability-2012-05-10-Paper: Paxos Made Moderately Complex
Introduction: If you are a normal human being and find the Paxos protocol confusing, then this paper, Paxos Made Moderately Complex, is a great find. Robbert van Renesse from Cornell University has written a clear, well-written paper with excellent explanations. The Abstract: For anybody who has ever tried to implement it, Paxos is by no means a simple protocol, even though it is based on relatively simple invariants. This paper provides imperative pseudo-code for the full Paxos (or Multi-Paxos) protocol without shying away from discussing various implementation details. The initial description avoids optimizations that complicate comprehension. Next we discuss liveness, and list various optimizations that make the protocol practical. Related Articles: Paxos on HighScalability.com
5 0.48995042 1374 high scalability-2012-12-18-Georeplication: When Bad Things Happen to Good Systems
Introduction: Georeplication is one of the standard techniques for dealing with bad things--failure and latency--when they happen to good systems. The problem is always: how do you do that? Murat Demirbas, Associate Professor at SUNY Buffalo, has a couple of really good posts that can help: MDCC: Multi-Data Center Consistency and Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary. In MDCC: Multi-Data Center Consistency Murat discusses a paper that says synchronous wide-area replication can be feasible. There's a quick and clear explanation of Paxos and various optimizations that is worth the price of admission. We find that strong consistency doesn't have to be lost across a WAN: The good thing about using Paxos over the WAN is you /almost/ get the full CAP (all three properties: consistency, availability, and partition-freedom). As we discussed earlier (Paxos taught), Paxos is CP, that is, in the presence of a partition, Paxos keeps consistency over availability. But, P
6 0.4898369 529 high scalability-2009-03-10-Paper: Consensus Protocols: Paxos
7 0.48981529 318 high scalability-2008-05-14-New Facebook Chat Feature Scales to 70 Million Users Using Erlang
8 0.48014551 1459 high scalability-2013-05-16-Paper: Warp: Multi-Key Transactions for Key-Value Stores
9 0.47244319 431 high scalability-2008-10-27-Notify.me Architecture - Synchronicity Kills
10 0.46464881 792 high scalability-2010-03-10-How FarmVille Scales - The Follow-up
11 0.46272928 1273 high scalability-2012-06-27-Paper: Logic and Lattices for Distributed Programming
12 0.46153101 1142 high scalability-2011-11-14-Using Gossip Protocols for Failure Detection, Monitoring, Messaging and Other Good Things
13 0.45253664 1146 high scalability-2011-11-23-Paper: Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS
14 0.4506647 1450 high scalability-2013-05-01-Myth: Eric Brewer on Why Banks are BASE Not ACID - Availability Is Revenue
15 0.44797266 478 high scalability-2008-12-29-Paper: Spamalytics: An Empirical Analysis of Spam Marketing Conversion
17 0.43980259 205 high scalability-2008-01-10-Letting Clients Know What's Changed: Push Me or Pull Me?
18 0.43842411 343 high scalability-2008-06-09-Apple's iPhone to Use a Centralized Push Based Notification Architecture
19 0.43656132 510 high scalability-2009-02-09-Paper: Consensus Protocols: Two-Phase Commit
20 0.42452782 879 high scalability-2010-08-12-Think of Latency as a Pseudo-permanent Network Partition
topicId topicWeight
[(2, 0.247), (65, 0.555)]
simIndex simValue blogId blogTitle
same-blog 1 0.85638297 234 high scalability-2008-01-30-The AOL XMPP scalability challenge
Introduction: Large-scale distributed instant messaging and presence-based protocols are a real challenge. With big players adopting the standard, the XMPP (eXtensible Messaging and Presence Protocol) community is facing the need to validate the protocol and its implementations at even larger scale.
2 0.68842727 463 high scalability-2008-12-09-Rules of Thumb in Data Engineering
Introduction: This is an interesting and still relevant research paper by Jim Gray and Prashant Shenoy at Microsoft Research that examines the rules of thumb for the design of data storage systems. It looks at storage, processing, and networking costs, ratios, and trends with a particular focus on performance and price/performance. Jim Gray has an updated presentation on this interesting topic: Long Term Storage Trends and You. Robin Harris has a great post that reflects on the Rules of Thumb whitepaper on his StorageMojo blog: Architecting the Internet Data Center - Parts I-IV.
3 0.64025891 48 high scalability-2007-07-30-What is Mashery?
Introduction: In the Amazon Services architecture article the podcast mentions Mashery. I went to their site at http://www.mashery.com/, but I can't quite figure out what it is. They want to: Unleash and manage channels for your API responsibly with Mashery’s combination of security, usage, access management, tracking, metrics, commerce, performance optimization and developer community tools. An example would help, because I am not getting it.
4 0.61230129 158 high scalability-2007-11-17-Can How Bees Solve their Load Balancing Problems Help Build More Scalable Websites?
Introduction: Bees have a similar problem to website servers: how to do a lot of work with limited resources in an ever-changing environment. Usually lessons from biology are hard to apply to computer problems. Nature throws hardware at problems. Billions and billions of cells cooperate at different levels of organization to find food, fight lions, and make sure your DNA is passed on. Nature's software is "simple," but her hardware rocks. We do the opposite. For us hardware is in short supply so we use limited hardware and leverage "smart" software to work around our inability to throw hardware at problems. But we might be able to borrow some load balancing techniques from bees. What do bees do that we can learn from? Bees do a dance to indicate the quality and location of a nectar source. When a bee finds a better source they do a better dance and resources shift to the new location. This approach may seem inefficient, but it turns out to be "optimal for the unpredictable nectar world." Crai
5 0.56137311 489 high scalability-2009-01-11-17 Distributed Systems and Web Scalability Resources
Introduction: Here's a short list of some great resources that I've found very inspirational and thought provoking. I've broken these resources up into two lists: Blogs and Presentations.
6 0.52218992 800 high scalability-2010-03-26-Strategy: Caching 404s Saved the Onion 66% on Server Time
7 0.49592739 1296 high scalability-2012-08-02-Strategy: Use Spare Region Capacity to Survive Availability Zone Failures
8 0.46939814 1581 high scalability-2014-01-17-Stuff The Internet Says On Scalability For January 17th, 2014
9 0.46637568 1373 high scalability-2012-12-17-11 Uses For the Humble Presents Queue, er, Message Queue
10 0.45175129 1365 high scalability-2012-11-30-Stuff The Internet Says On Scalability For November 30, 2012
11 0.40635836 56 high scalability-2007-08-03-Running Hadoop MapReduce on Amazon EC2 and Amazon S3
12 0.40635836 565 high scalability-2009-04-13-Benchmark for keeping data in browser in AJAX projects
13 0.40615603 436 high scalability-2008-11-02-Strategy: How to Manage Sessions Using Memcached
14 0.40589744 223 high scalability-2008-01-25-Google: Introduction to Distributed System Design
15 0.40574688 728 high scalability-2009-10-26-Facebook's Memcached Multiget Hole: More machines != More Capacity
16 0.40453631 836 high scalability-2010-06-04-Strategy: Cache Larger Chunks - Cache Hit Rate is a Bad Indicator
17 0.40444577 878 high scalability-2010-08-12-Strategy: Terminate SSL Connections in Hardware and Reduce Server Count by 40%
18 0.40405652 911 high scalability-2010-09-30-More Troubles with Caching
19 0.40130678 594 high scalability-2009-05-08-Eight Best Practices for Building Scalable Systems
20 0.40092656 455 high scalability-2008-12-01-MySQL Database Scale-out and Replication for High Growth Businesses