high_scalability high_scalability-2008 high_scalability-2008-265 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: One of the most important architectural decisions that must be made early in a scalable web site project is splitting the data flow into two streams: one that is user-specific and one that is generic. If this is done properly, the system will be able to grow easily. On the other hand, if the data streams are not separated from the start, the growth options will be severely limited. Trying to make such a web site scale will be just painting the corpse, and the change will cost a whole lot more when you need to introduce it later (and it is "when" in this case, not "if").
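To make the split concrete, here is a minimal sketch of how a request could be assigned to one of the two streams. The post does not say how the separation should be implemented, so everything here is an assumption for illustration: the path prefixes, the function names `classify_path` and `cache_headers`, and the idea that the generic stream is the one that can be cached and shared between visitors.

```python
# Hypothetical sketch: the prefixes, names, and header values below are
# illustrative assumptions, not taken from the original post.

# Paths assumed to return the same response for every visitor.
GENERIC_PREFIXES = ("/static/", "/articles/")


def classify_path(path: str) -> str:
    """Return 'generic' or 'user' for a request path."""
    if path.startswith(GENERIC_PREFIXES):
        return "generic"
    # Anything not known to be generic is treated as user-specific,
    # so per-user data is never shared by mistake.
    return "user"


def cache_headers(path: str) -> dict:
    """Pick Cache-Control headers for the stream the path belongs to."""
    if classify_path(path) == "generic":
        # Generic stream: safe for shared caches to store and reuse.
        return {"Cache-Control": "public, max-age=3600"}
    # User-specific stream: must not be stored by shared caches.
    return {"Cache-Control": "private, no-store"}
```

Defaulting unknown paths to the user-specific stream is a deliberate choice in this sketch: a generic page served without caching only costs performance, while a user-specific page served from a shared cache leaks data.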
sentIndex sentText sentNum sentScore
1 One of the most important architectural decisions that must be done early on in a scalable web site project is splitting the data flow into two streams : one that is user specific and one that is generic. [sent-1, score-2.281]
2 If this is done properly, the system will be able to grow easily. [sent-2, score-0.437]
3 On the other hand, if the data streams are not separated from the start, then the growth options will be severely limited. [sent-3, score-1.253]
4 Trying to make such a web site scale will be just painting the corpse, and this change will cost a whole lot more when you need to introduce it later (and it is "when" in this case, not "if"). [sent-4, score-1.376]
wordName wordTfidf (topN-words)
[('streams', 0.363), ('painting', 0.352), ('corpse', 0.324), ('severely', 0.324), ('separated', 0.244), ('properly', 0.204), ('done', 0.193), ('introduce', 0.182), ('hand', 0.168), ('architectural', 0.166), ('decisions', 0.166), ('site', 0.157), ('flow', 0.145), ('options', 0.143), ('grow', 0.125), ('specific', 0.124), ('growth', 0.123), ('early', 0.121), ('project', 0.118), ('later', 0.117), ('trying', 0.117), ('whole', 0.112), ('one', 0.101), ('start', 0.095), ('case', 0.094), ('web', 0.092), ('important', 0.09), ('must', 0.088), ('change', 0.084), ('cost', 0.081), ('able', 0.081), ('user', 0.069), ('scalable', 0.068), ('two', 0.063), ('lot', 0.056), ('data', 0.056), ('scale', 0.053), ('make', 0.046), ('need', 0.044), ('system', 0.038)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 265 high scalability-2008-03-03-Two data streams for a happy website
Introduction: One of the most important architectural decisions that must be made early in a scalable web site project is splitting the data flow into two streams: one that is user-specific and one that is generic. If this is done properly, the system will be able to grow easily. On the other hand, if the data streams are not separated from the start, the growth options will be severely limited. Trying to make such a web site scale will be just painting the corpse, and the change will cost a whole lot more when you need to introduce it later (and it is "when" in this case, not "if").
2 0.12969144 1359 high scalability-2012-11-15-Gone Fishin': Justin.Tv's Live Video Broadcasting Architecture
Introduction: This is one of my favorite posts for a couple of reasons. I think it gives a lot of useful information in an interesting space. And Kyle Vogt was just a real pleasure to talk to. He was very helpful and forthcoming, which makes the whole experience better for everyone. The future is live. The future is real-time. The future is now. That's the hype anyway. And as it has a habit of doing, the hype is slowly becoming reality. We are seeing live searches, live tweets, live location, live reality augmentation, live crab (fresh and local), and live event publishing. One of the most challenging of all live technologies is that of live video broadcasting. Imagine a world in which everyone becomes a broadcaster and a consumer of video streams, all in real-time (< 250 msec latency), all so you can talk and interact directly without feeling like you are in the middle of a time shift war. The resources and the engineering needed to make this happen must be substantial. How do you do tha
3 0.12909324 796 high scalability-2010-03-16-Justin.tv's Live Video Broadcasting Architecture
Introduction: The future is live. The future is real-time. The future is now. That's the hype anyway. And as it has a habit of doing, the hype is slowly becoming reality. We are seeing live searches, live tweets, live location, live reality augmentation, live crab (fresh and local), and live event publishing. One of the most challenging of all live technologies is that of live video broadcasting. Imagine a world in which everyone becomes a broadcaster and a consumer of video streams, all in real-time (< 250 msec latency), all so you can talk and interact directly without feeling like you are in the middle of a time shift war. The resources and the engineering needed to make this happen must be substantial. How do you do that? To find out I talked to Kyle Vogt, Justin.tv Founder and VP of Engineering. Justin.tv certainly has the numbers. Their 30 million unique monthly visitors even outshine YouTube in the video upload game, reportedly uploading nearly 30 hours per minute of video compared to Y
4 0.095356382 1175 high scalability-2012-01-17-Paper: Feeding Frenzy: Selectively Materializing Users’ Event Feeds
Introduction: How do you scale an inbox that has multiple highly volatile feeds? That's a problem faced by social networks like Tumblr, Facebook, and Twitter. Follow a few hundred event sources and it's hard to scalably order an inbox so that you see a correct view as event sources continually publish new events. This can be considered like a view materialization problem in a database. In a database a view is a virtual table defined by a query that can be accessed like a table. Materialization refers to when the data behind the view is created. If a view is a join on several tables and that join is performed when the view is accessed, then performance will be slow. If the view is precomputed access to the view will be fast, but more resources are used, especially considering that the view may never be accessed. Your wall/inbox/stream is a view on all the people/things you follow. If you never look at your inbox then materializing the view in your inbox is a waste of resources, yet you'll be ma
5 0.091372102 1 high scalability-2007-07-06-Start Here
Introduction: This page is here to help you get started using High Scalability. Here are a few useful topics to get you going... Why does the High Scalability site exist? Good things to read. Participate by adding your own links to interesting sites and articles. Participate by signing up for the RSS feed. Consider the many benefits of registering as a user. How do I get notification of content and comment changes? Contact High Scalability. About. Why does the High Scalability site exist? To help you build successful scalable websites. This site tries to bring together all the lore, art, science, practice, and experience of building scalable websites into one place so you can learn how to build your website with confidence. When it becomes clear you must grow your website or die, most people have no idea where to start. It's not a skill you learn in school or pick up from a magazine article on a plane flight home. No, building scalable systems is a body o
7 0.079271838 632 high scalability-2009-06-15-starting small with growth in mind
8 0.077736363 1630 high scalability-2014-04-11-Stuff The Internet Says On Scalability For April 11th, 2014
9 0.074879169 194 high scalability-2007-12-26-Golden rule of web caching
10 0.073715799 570 high scalability-2009-04-15-Implementing large scale web analytics
11 0.073638663 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
12 0.072869815 96 high scalability-2007-09-18-Amazon Architecture
13 0.07079006 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT
14 0.068603635 834 high scalability-2010-06-01-Web Speed Can Push You Off of Google Search Rankings! What Can You Do?
15 0.068432413 1413 high scalability-2013-02-27-42 Monster Problems that Attack as Loads Increase
16 0.06838055 218 high scalability-2008-01-17-Moving old to new. Do not be afraid of the re-write -- but take some help
17 0.067924403 1062 high scalability-2011-06-15-101 Questions to Ask When Considering a NoSQL Database
18 0.067840971 1440 high scalability-2013-04-15-Scaling Pinterest - From 0 to 10s of Billions of Page Views a Month in Two Years
19 0.066376545 572 high scalability-2009-04-16-Paper: The End of an Architectural Era (It’s Time for a Complete Rewrite)
20 0.066154689 1331 high scalability-2012-10-02-An Epic TripAdvisor Update: Why Not Run on the Cloud? The Grand Experiment.
topicId topicWeight
[(0, 0.113), (1, 0.052), (2, 0.008), (3, -0.045), (4, 0.011), (5, -0.042), (6, -0.024), (7, 0.004), (8, 0.017), (9, 0.015), (10, -0.021), (11, 0.022), (12, -0.034), (13, 0.013), (14, 0.074), (15, 0.005), (16, 0.037), (17, -0.018), (18, 0.03), (19, 0.032), (20, -0.003), (21, -0.01), (22, 0.026), (23, 0.022), (24, -0.016), (25, -0.053), (26, -0.042), (27, 0.002), (28, 0.042), (29, -0.016), (30, 0.039), (31, 0.066), (32, 0.0), (33, -0.013), (34, -0.027), (35, 0.009), (36, -0.039), (37, -0.039), (38, -0.038), (39, 0.012), (40, -0.022), (41, 0.028), (42, -0.006), (43, 0.003), (44, 0.021), (45, -0.003), (46, -0.046), (47, -0.032), (48, 0.026), (49, 0.027)]
simIndex simValue blogId blogTitle
same-blog 1 0.9585253 265 high scalability-2008-03-03-Two data streams for a happy website
Introduction: One of the most important architectural decisions that must be made early in a scalable web site project is splitting the data flow into two streams: one that is user-specific and one that is generic. If this is done properly, the system will be able to grow easily. On the other hand, if the data streams are not separated from the start, the growth options will be severely limited. Trying to make such a web site scale will be just painting the corpse, and the change will cost a whole lot more when you need to introduce it later (and it is "when" in this case, not "if").
2 0.76619381 632 high scalability-2009-06-15-starting small with growth in mind
Introduction: Hello all, I'm working on a web site that might totally flop or it might explode to be the next facebook/flickr/digg/etc. Since I really don't know how popular the site will be I don't want to spend a ton of money on the hardware/hosting right away but I want to be able to scale it easily if it does grow rapidly. With this in mind, what would be the best approach to launch the site? Thanks, Dan
3 0.7367835 1 high scalability-2007-07-06-Start Here
Introduction: This page is here to help you get started using High Scalability. Here are a few useful topics to get you going... Why does the High Scalability site exist? Good things to read. Participate by adding your own links to interesting sites and articles. Participate by signing up for the RSS feed. Consider the many benefits of registering as a user. How do I get notification of content and comment changes? Contact High Scalability. About. Why does the High Scalability site exist? To help you build successful scalable websites. This site tries to bring together all the lore, art, science, practice, and experience of building scalable websites into one place so you can learn how to build your website with confidence. When it becomes clear you must grow your website or die, most people have no idea where to start. It's not a skill you learn in school or pick up from a magazine article on a plane flight home. No, building scalable systems is a body o
4 0.73675072 232 high scalability-2008-01-29-When things aren't scalable
Introduction: OK, I know this site is for scalable web site design. But as there aren't any sites I can find for graceful failure under "slashdotted"-like pressure, I'll ask here. Does anyone have a sensible way, once you have a "web application" that either won't or can't scale, to give some users a good, consistent experience and bounce other users to a busy-site page? I have seen sites do this to varying degrees, some of which work better than others, but no explanations beyond simply bouncing requests to a "we're busy page server" when you have more than a given number of connections. This is obviously useless, as a web page likely requires multiple connections (ignoring keep-alive, pipelining, etc.) to render completely. The normal problem is users getting a page and not the "furniture" for that page, like images or CSS. Other problems are having to wait ages to get the busy page or the site being slow even if you do "get in". And some site let
5 0.70354646 965 high scalability-2010-12-29-Pinboard.in Architecture - Pay to Play to Keep a System Small
Introduction: How do you keep a system small enough, while still being successful, that a simple scale-up strategy becomes the preferred architecture? StackOverflow, for example, could stick with a tool chain they were comfortable with because they had a natural brake on how fast they could grow: there are only so many programmers in the world. If this doesn't work for you, here's another natural braking strategy to consider: charge for your service. Paul Houle summarized this nicely as: avoid scaling problems by building a service that's profitable at a small scale. This interesting point, one I hadn't properly considered before, was brought up by Maciej Ceglowski, co-founder of Pinboard.in, in an interview with Leo Laporte and Amber MacArthur on their net@night show. Pinboard is a lean, mean, pay for bookmarking machine, a timely replacement for the nearly departed Delicious. And as a self-professed anti-social bookmarking site, it emphasizes speed over socializing. Maciej
6 0.69170761 10 high scalability-2007-07-15-Book: Building Scalable Web Sites
7 0.69154257 493 high scalability-2009-01-16-Just-In-Time Scalability: Agile Methods to Support Massive Growth (IMVU case study)
8 0.68440139 1356 high scalability-2012-11-07-Gone Fishin': 10 Ways to Take your Site from One to One Million Users by Kevin Rose
9 0.6790126 715 high scalability-2009-10-06-10 Ways to Take your Site from One to One Million Users by Kevin Rose
10 0.67788583 71 high scalability-2007-08-22-Profiling WEB applications
11 0.67366344 731 high scalability-2009-10-28-Need for change in your IT infrastructure
12 0.67295301 659 high scalability-2009-07-20-A Scalability Lament
13 0.67170644 344 high scalability-2008-06-09-FaceStat's Rousing Tale of Scaling Woe and Wisdom Won
14 0.67104185 614 high scalability-2009-06-01-Guess How Many Users it Takes to Kill Your Site?
15 0.66678113 206 high scalability-2008-01-10-MONO ASP.NET. Will it make the web???
16 0.65470123 714 high scalability-2009-10-02-HighScalability has Moved to Squarespace.com!
17 0.65469003 711 high scalability-2009-09-22-How Ravelry Scales to 10 Million Requests Using Rails
18 0.65323651 330 high scalability-2008-05-27-Should Twitter be an All-You-Can-Eat Buffet or a Vending Machine?
19 0.65311396 158 high scalability-2007-11-17-Can How Bees Solve their Load Balancing Problems Help Build More Scalable Websites?
20 0.65304458 159 high scalability-2007-11-18-Reverse Proxy
topicId topicWeight
[(1, 0.045), (2, 0.109), (10, 0.124), (27, 0.224), (61, 0.221), (79, 0.129)]
simIndex simValue blogId blogTitle
same-blog 1 0.88342589 265 high scalability-2008-03-03-Two data streams for a happy website
Introduction: One of the most important architectural decisions that must be made early in a scalable web site project is splitting the data flow into two streams: one that is user-specific and one that is generic. If this is done properly, the system will be able to grow easily. On the other hand, if the data streams are not separated from the start, the growth options will be severely limited. Trying to make such a web site scale will be just painting the corpse, and the change will cost a whole lot more when you need to introduce it later (and it is "when" in this case, not "if").
Introduction: Abstract: When designing distributed web services, there are three properties that are commonly desired: consistency, availability, and partition tolerance. It is impossible to achieve all three. In this note, we prove this conjecture in the asynchronous network model, and then discuss solutions to this dilemma in the partially synchronous model.
3 0.79767257 1483 high scalability-2013-06-27-Paper: XORing Elephants: Novel Erasure Codes for Big Data
Introduction: Erasure codes are one of those seemingly magical mathematical creations that, with the developments described in the paper XORing Elephants: Novel Erasure Codes for Big Data, are set to replace triple replication as the data storage protection mechanism of choice. The result, says Robin Harris (StorageMojo) in an excellent article, Facebook’s advanced erasure codes: "WebCos will be able to store massive amounts of data more efficiently than ever before. Bad news: so will anyone else." Robin says with cheap disks triple replication made sense and was economical. With ever bigger BigData the overhead has become costly. But erasure codes have always suffered from unacceptably long repair times. This paper describes new Locally Repairable Codes (LRCs) that are efficiently repairable in disk I/O and bandwidth requirements: These systems are now designed to survive the loss of up to four storage elements – disks, servers, nodes or even entire data centers – without losing
4 0.76223052 1097 high scalability-2011-08-12-Stuff The Internet Says On Scalability For August 12, 2011
Introduction: Submitted for your scaling pleasure, you may not scale often, but when you scale, please drink us: Quotably quotable quotes: @mardix : There is no single point of truth in #NoSQL . #Consistency is no longer global, it's relative to the one accessing it. #Scalability @kekline : RT @CurtMonash: "...from industry figures, Basho/Riak is our third-biggest competitor." How often do you encounter them? "Never have" #nosql @dave_jacobs : Love being in a city where I can overhear a convo about Heroku scalability while doing deadlifts. #ahsanfrancisco @satheeshilu : Doctor at #hospital in india says #ge #healthcare software is slow to handle 100K X-rays an year.Scalability is critical 4 Indian #software @sufw : How can it be possible that Tagged has 80m users and I have *never* heard of it!?! @EventCloudPro : One of my vacation realizations? Whole #bigdata thing has turned into a lotta #bighype - many distinct issues & nothing to do w/ #bigdata No
5 0.72454458 347 high scalability-2008-07-07-Five Ways to Stop Framework Fixation from Crashing Your Scaling Strategy
Introduction: If you've wondered why I haven't been posting lately it's because I've been on an amazing Beach's motorcycle tour of the Alps. My wife (Linda) and I rode two-up on a BMW 1200 GS through the Alps in Germany, Austria, Switzerland, Italy, Slovenia, and Liechtenstein. The trip was more beautiful than I ever imagined. We rode challenging mountain pass after mountain pass, froze in the rain, baked in the heat, woke up on excellent Italian coffee, ate slice after slice of tasty apple strudel, drank dazzling local wines, smelled the fresh cut grass as the Swiss en masse cut hay for the winter feeding of their dairy cows, rode the amazing Munich train system, listened as cow bells tinkled like wind chimes throughout small valleys, drank water from a pure alpine spring on a blisteringly hot hike, watched local German folk dancers represent their regions, and had fun in the company of fellow riders. Magical. They say you'll ride more
6 0.72372758 28 high scalability-2007-07-25-Product: NetApp MetroCluster Software
7 0.72087449 555 high scalability-2009-04-04-Performance Anti-Pattern
9 0.714504 1287 high scalability-2012-07-20-Stuff The Internet Says On Scalability For July 20, 2012
10 0.71041942 142 high scalability-2007-11-05-Strategy: Diagonal Scaling - Don't Forget to Scale Out AND Up
11 0.70801824 1142 high scalability-2011-11-14-Using Gossip Protocols for Failure Detection, Monitoring, Messaging and Other Good Things
12 0.70699567 1411 high scalability-2013-02-22-Stuff The Internet Says On Scalability For February 22, 2013
13 0.70306176 475 high scalability-2008-12-22-SLAs in the SaaS space
14 0.70303023 739 high scalability-2009-11-09-10 NoSQL Systems Reviewed
15 0.6976552 1242 high scalability-2012-05-09-Cell Architectures
16 0.69751519 930 high scalability-2010-10-28-NoSQL Took Away the Relational Model and Gave Nothing Back
17 0.69634885 150 high scalability-2007-11-12-Slashdot Architecture - How the Old Man of the Internet Learned to Scale
18 0.69489473 746 high scalability-2009-11-26-Kngine Snippet Search New Indexing Technology
19 0.69034076 332 high scalability-2008-05-28-Job queue and search engine
20 0.68767166 1337 high scalability-2012-10-10-Antirez: You Need to Think in Terms of Organizing Your Data for Fetching