high_scalability high_scalability-2009 high_scalability-2009-659 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: In Scalability issues for dummies Alex Barrera talks movingly about the challenges he faces trying to scale his startup inkzee as the lone developer. Inkzee is an online news reader that automatically groups similar topics. This is a cool problem and is one you know right away is going to have some killer scalability problems as the number of feeds and the number of users increase. And these problems lead to the point of the post, to explain here what are scalability problems and how deep the repercussions are for a small company, which Alex does admirably. Some takeaways: Sites are composed of a frontend and backend. The backend isn't visible to users, but it does all the work and this is where the scalability problems show up. As more and more users use a site it becomes slow because more users reveal bottlenecks in the system that weren't visible before. There can be many, many reasons for these bottlenecks. They are often very hard to find because the backend sy
sentIndex sentText sentNum sentScore
1 In Scalability issues for dummies Alex Barrera talks movingly about the challenges he faces trying to scale his startup inkzee as the lone developer. [sent-1, score-0.299]
2 This is a cool problem and is one you know right away is going to have some killer scalability problems as the number of feeds and the number of users increase. [sent-3, score-0.449]
3 And these problems lead to the point of the post, to explain here what are scalability problems and how deep the repercussions are for a small company, which Alex does admirably. [sent-4, score-0.482]
4 The backend isn't visible to users, but it does all the work and this is where the scalability problems show up. [sent-6, score-0.429]
5 As more and more users use a site it becomes slow because more users reveal bottlenecks in the system that weren't visible before. [sent-7, score-0.696]
6 They are often very hard to find because the backend systems are very complex and have a lot of complex interactions. [sent-9, score-0.295]
7 It takes a while to find the bottlenecks and create fixes for them. [sent-10, score-0.513]
8 You are never quite sure if the fixes will really work. [sent-11, score-0.236]
9 Many of these problems are unique to the problem space so pre-canned solutions aren't always available. [sent-12, score-0.269]
10 And because you don't want to destroy your production servers it takes a while to put fixes into the system. [sent-13, score-0.431]
11 This means your release cycle is slow which means progress on your site is slow. [sent-14, score-0.31]
12 The process of redesign sucks up all your resources so progress on the site stalls almost completely, especially for a small development group. [sent-15, score-0.441]
13 Scalability problems aren’t something you can discard as being ONLY technical, its roots might be technical but its effects will shake the whole company. [sent-20, score-0.505]
14 I found Alex's commentary quite touching and familiar. [sent-21, score-0.204]
15 It's the modern equivalent of an explorer following a dream. [sent-23, score-0.118]
16 Going alone into uncharted territory where the Dragons live and trying to survive when everything seems against you. [sent-24, score-0.334]
17 For every great returning hero there are 10 who do not make it back. [sent-25, score-0.188]
18 But it also doesn't hurt to use our Venus brain for a moment and simply recognize the toll this process can take. [sent-28, score-0.218]
19 The continual stream of problems and lack of positive feedback can wear you down after a while. [sent-30, score-0.367]
20 To stick with it takes a bit of craziness in the heart. [sent-31, score-0.097]
wordName wordTfidf (topN-words)
[('fixes', 0.236), ('alex', 0.219), ('problems', 0.175), ('visible', 0.163), ('progress', 0.146), ('alone', 0.145), ('dragons', 0.132), ('jerry', 0.132), ('repercussions', 0.132), ('marton', 0.124), ('toll', 0.124), ('hard', 0.119), ('discard', 0.118), ('explorer', 0.118), ('tharakan', 0.11), ('lone', 0.11), ('mars', 0.11), ('shake', 0.107), ('territory', 0.107), ('faces', 0.107), ('stalls', 0.107), ('redesign', 0.105), ('roots', 0.105), ('hero', 0.102), ('wear', 0.102), ('commentary', 0.102), ('touching', 0.102), ('royans', 0.098), ('destroy', 0.098), ('takes', 0.097), ('generator', 0.095), ('takeaways', 0.095), ('along', 0.095), ('bottlenecks', 0.095), ('opinions', 0.094), ('hurt', 0.094), ('reveal', 0.094), ('problem', 0.094), ('stops', 0.092), ('backend', 0.091), ('going', 0.09), ('continual', 0.09), ('users', 0.09), ('returning', 0.086), ('martin', 0.085), ('find', 0.085), ('partners', 0.083), ('site', 0.083), ('trying', 0.082), ('slow', 0.081)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 659 high scalability-2009-07-20-A Scalability Lament
Introduction: "But it is not complicated. [There's] just a lot of it." -- Richard Feynman on how the immense variety of the world arises from simple rules. Contents: Have We Reached the End of Scaling?; Applications Become Black Boxes Using Markets to Scale and Control Costs; Let's Welcome our Neo-Feudal Overlords; The Economic Argument for the Ambient Cloud; What Will Kill the Cloud?; The Amazing Collective Compute Power of the Ambient Cloud; Using the Ambient Cloud as an Application Runtime; Applications as Virtual States; Conclusion. We have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications. Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but consider we have no planet wide applications yet. None. Tomorrow the numbers foreshadow a new Cambrian explosion of connectivity that will look as
Introduction: All in all this is still my favorite post and I still think it's an accurate vision of a future. Not everyone agrees, but I guess we'll see... "But it is not complicated. [There's] just a lot of it." -- Richard Feynman on how the immense variety of the world arises from simple rules. Contents: Have We Reached the End of Scaling?; Applications Become Black Boxes Using Markets to Scale and Control Costs; Let's Welcome our Neo-Feudal Overlords; The Economic Argument for the Ambient Cloud; What Will Kill the Cloud?; The Amazing Collective Compute Power of the Ambient Cloud; Using the Ambient Cloud as an Application Runtime; Applications as Virtual States; Conclusion. We have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications. Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but c
4 0.11769405 970 high scalability-2011-01-06-BankSimple Mini-Architecture - Using a Next Generation Toolchain
Introduction: I know people are always interested in what others are using to build their systems. Alex Payne , CTO of the new startup BankSimple , gives us a quick hit on their toolchain choices in this Quora thread . BankSimple positions itself as a customer-focused alternative to online banking. You may remember Alex from the early days of Twitter . Alex was always helpful to me on Twitter's programmer support list, so I really wish them well. Alex is also a bit of an outside the box thinker, which is reflected in some of their choices: The JVM acts as a convergence platform for these languages: Scala - ideal for writing performance-sensitive components that need the safety and expressiveness of the language's advanced type system. Clojure - rapidly prototype in a more dynamic language while still offering the benefits of functional programming. JRuby - makes available a bunch of great libraries and frameworks for doing frontend web development, like Rails and Pa
5 0.10731975 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT
Introduction: It remains that, from the same principles, I now demonstrate the frame of the System of the World. -- Isaac Newton The practice of IT reminds me a lot of the practice of science before Isaac Newton. Aristotelianism was dead, but there was nothing to replace it. Then Newton came along, created a scientific revolution with his System of the World . And everything changed. That was New System of the World number one. New System of the World number two was written about by the incomparable Neal Stephenson in his incredible Baroque Cycle series. It explores the singular creation of a new way of organizing society grounded in new modes of thought in business, religion, politics, and science. Our modern world emerged Enlightened as it could from this roiling cauldron of forces. In IT we may have had a Leonardo da Vinci or even a Galileo, but we’ve never had our Newton. Maybe we don't need a towering genius to make everything clear? For years startups, like the frenetically inventive
6 0.10659887 715 high scalability-2009-10-06-10 Ways to Take your Site from One to One Million Users by Kevin Rose
7 0.10616921 1359 high scalability-2012-11-15-Gone Fishin': Justin.Tv's Live Video Broadcasting Architecture
8 0.10613066 796 high scalability-2010-03-16-Justin.tv's Live Video Broadcasting Architecture
9 0.10586753 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
10 0.1049494 1356 high scalability-2012-11-07-Gone Fishin': 10 Ways to Take your Site from One to One Million Users by Kevin Rose
11 0.10399979 1508 high scalability-2013-08-28-Sean Hull's 20 Biggest Bottlenecks that Reduce and Slow Down Scalability
12 0.10255733 1535 high scalability-2013-10-21-Google's Sanjay Ghemawat on What Made Google Google and Great Big Data Career Advice
13 0.10034257 691 high scalability-2009-08-31-Squarespace Architecture - A Grid Handles Hundreds of Millions of Requests a Month
14 0.09951596 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App
15 0.098642595 1366 high scalability-2012-12-03-Resiliency is the New Normal - A Deep Look at What It Means and How to Build It
16 0.097716779 70 high scalability-2007-08-22-How many machines do you need to run your site?
17 0.095965713 96 high scalability-2007-09-18-Amazon Architecture
18 0.095877282 1568 high scalability-2013-12-23-What Happens While Your Brain Sleeps is Surprisingly Like How Computers Stay Sane
19 0.095368139 639 high scalability-2009-06-27-Scaling Twitter: Making Twitter 10000 Percent Faster
20 0.09458039 721 high scalability-2009-10-13-Why are Facebook, Digg, and Twitter so hard to scale?
topicId topicWeight
[(0, 0.196), (1, 0.079), (2, -0.009), (3, 0.007), (4, 0.027), (5, -0.088), (6, -0.049), (7, 0.042), (8, 0.016), (9, -0.036), (10, -0.025), (11, 0.039), (12, -0.016), (13, 0.025), (14, 0.036), (15, -0.065), (16, 0.072), (17, -0.011), (18, -0.016), (19, 0.053), (20, 0.015), (21, -0.033), (22, 0.012), (23, 0.013), (24, 0.008), (25, -0.002), (26, 0.006), (27, 0.026), (28, -0.009), (29, -0.001), (30, 0.044), (31, 0.017), (32, -0.027), (33, -0.009), (34, -0.018), (35, 0.064), (36, -0.066), (37, 0.002), (38, -0.02), (39, 0.004), (40, -0.032), (41, -0.032), (42, 0.003), (43, -0.003), (44, -0.037), (45, -0.01), (46, -0.003), (47, -0.004), (48, 0.027), (49, 0.033)]
simIndex simValue blogId blogTitle
same-blog 1 0.98397416 659 high scalability-2009-07-20-A Scalability Lament
Introduction: Update 2: Velocity 09: John Allspaw, 10+ Deploys Per Day: Dev and Ops Cooperation at Flickr . Insightful talk. Some highlights: Change is good if you can build tools and culture to lower the risk of change. Operations and developers need to become of one mind and respect each other. An automated infrastructure is the one tool you need most. Common source control. One step build. One step deploy. Don't be a pussy, deploy. Always ship trunk. Feature flags - don't branch code, make features runtime configurable in code. Dark launch - release data paths early without UI component. Shared metrics. Adaptive feedback to prioritize important features. IRC for communication for human context. Best solutions occur when dev and op work together and trust each other. Trust is earned by helping each other solve their problems. Look at what new features imply for operations, what can go wrong, and how to recover. Provide knobs and levers to help operations. Devs should have access to production
3 0.81432074 614 high scalability-2009-06-01-Guess How Many Users it Takes to Kill Your Site?
Introduction: Update: Here's the first result. Good response time until 400 users. At 1,340 users the response time was 6 seconds. And at 2000 users the site was effectively dead. An interesting point was that errors that could harm a site's reputation started at 1000 users. Cheers to the company that had the guts to give this a try. That which doesn't kill your site makes it stronger. Or at least that's the capacity planning strategy John Allspaw recommends (not really, but I'm trying to make a point here) in The Art of Capacity Planning: Using production traffic to define your resources ceilings in a controlled setting allows you to see firsthand what would happen when you run out of capacity in a particular resource. Of course I'm not suggesting that you run your site into the ground, but better to know what your real (not simulated) loads are while you're watching, than find out the hard way. In addition, a lot of unexpected systemic things can happen when load increases in a particular
4 0.81023306 344 high scalability-2008-06-09-FaceStat's Rousing Tale of Scaling Woe and Wisdom Won
Introduction: Lukas Biewald shares a fascinating slam by slam recount of how his FaceStat (upload your picture and be judged by the masses) site was battered by a link on Yahoo's main page that caused an almost instantaneous 650,000 page view jump on their site. Yahoo spends considerable effort making sure its own properties can handle the truly massive flow from the main page. Turning the Great Eye of the Internet towards an unsuspecting newborn site must be quite the diaper ready experience. Theo Schlossnagle eerily prophesized about such events in The Implications of Punctuated Scalabilium for Website Architecture : massive, unexpected and sudden traffic spikes will become more common as a fickle internet seeks ever for new entertainments (my summary). Exactly FaceStat's situation. This is also one of our first exposures to an application written on Merb, a popular Ruby on Rails competitor. For those who think Ruby is the problem, their architecture now serves 100 times the original load
5 0.80287474 330 high scalability-2008-05-27-Should Twitter be an All-You-Can-Eat Buffet or a Vending Machine?
Introduction: Om proposes one solution to the Twitter Problem is to limit followers to three square meals a day. The reasonable idea being that lower limits should mean fewer scaling problems. And as a kicker raising those limits is a good way to raise much needed revenue. Scoble thinks users should consume without limit and will drive to another buffet if all-you-can-eat privileges are revoked. The reasonable idea being that if an internet service can't solve internet scale problems then there's not much use for it. Dave says comp power users a top floor suite and shower them with free passes to the buffet. Let the good times roll! The reasonable idea being that power users help create popular restaurants, er, services in the first place and limiting them starves users and starved users won't come back. So, should web services like Twitter be a buffet, a fixed eight course fine dining experience, a small plate restaurant, a family style joint, or a vending machine? Or
6 0.78792632 347 high scalability-2008-07-07-Five Ways to Stop Framework Fixation from Crashing Your Scaling Strategy
7 0.78454119 1011 high scalability-2011-03-25-Did the Microsoft Stack Kill MySpace?
8 0.78173399 1492 high scalability-2013-07-17-How do you create a 100th Monkey software development culture?
9 0.7725423 714 high scalability-2009-10-02-HighScalability has Moved to Squarespace.com!
10 0.77237743 533 high scalability-2009-03-11-The Implications of Punctuated Scalabilium for Website Architecture
11 0.76950902 1500 high scalability-2013-08-12-100 Curse Free Lessons from Gordon Ramsay on Building Great Software
12 0.76784557 116 high scalability-2007-10-08-Lessons from Pownce - The Early Years
13 0.76599824 1108 high scalability-2011-08-31-Pud is the Anti-Stack - Windows, CFML, Dropbox, Xeround, JungleDisk, ELB
14 0.76507962 1503 high scalability-2013-08-19-What can the Amazing Race to the South Pole Teach us About Startups?
15 0.76360357 1432 high scalability-2013-04-01-Khan Academy Checkbook Scaling to 6 Million Users a Month on GAE
16 0.75824225 1356 high scalability-2012-11-07-Gone Fishin': 10 Ways to Take your Site from One to One Million Users by Kevin Rose
17 0.75591254 1026 high scalability-2011-04-18-6 Ways Not to Scale that Will Make You Hip, Popular and Loved By VCs
18 0.75351262 1477 high scalability-2013-06-18-Scaling Mailbox - From 0 to One Million Users in 6 Weeks and 100 Million Messages Per Day
19 0.75330746 1515 high scalability-2013-09-11-Ten Lessons from GitHub’s First Year in 2008
topicId topicWeight
[(1, 0.144), (2, 0.164), (10, 0.045), (56, 0.271), (61, 0.13), (79, 0.083), (85, 0.045), (94, 0.04)]
simIndex simValue blogId blogTitle
1 0.93478394 779 high scalability-2010-02-16-Seven Signs You May Need a NoSQL Database
Introduction: While exploring deep into some dusty old library stacks, I dug up Nostradamus' long lost NoSQL codex. What are the chances? Strangely, it also gave the plot to the next Dan Brown novel, but I left that out for reasons of sanity. About NoSQL, here is what Nosty (his friends call him Nosty) predicted are the signs you may need a NoSQL database... You noticed a lot of your database fields are really serialized complex objects in disguise . Why bother with a RDBMS at all then? Storing serialized objects in a relational database is like being on the pill while trying to get pregnant, a bit counter productive. Just use a schemaless database from the start. Using a standard query language has become too confining . You just want to be free. SQL is so easy, so convenient, and so standard, it's really not a challenge anymore. You need to be different. Then NoSQL is for you. Each has their own completely different query mechanism . Your toolbox only contains a hammer . Hammers wh
2 0.92443454 732 high scalability-2009-10-29-Digg - Looking to the Future with Cassandra
Introduction: Digg has been researching ways to scale our database infrastructure for some time now. We’ve adopted a traditional vertically partitioned master-slave configuration with MySQL, and also investigated sharding MySQL with IDDB . Ultimately, these solutions left us wanting. In the case of the traditional architecture, the lack of redundancy on the write masters is painful, and both approaches have significant management overhead to keep running. Since it was already necessary to abandon data normalization and consistency to make these approaches work, we felt comfortable looking at more exotic, non-relational data stores. After considering HBase, Hypertable, Cassandra, Tokyo Cabinet/Tyrant, Voldemort, and Dynomite, we settled on Cassandra . Each system has its own strengths and weaknesses, but Cassandra has a good blend of everything. It offers column-oriented data storage, so you have a bit more structure than plain key/value stores. It operates in a distributed, highly available,
3 0.89865768 941 high scalability-2010-11-15-How Google's Instant Previews Reduces HTTP Requests
Introduction: In a strange case of synchronicity, Google just published Instant Previews: Under the hood, a very well written blog post by Matías Pelenur of the Instant Previews team, giving some fascinating inside details on how Google implemented Instant Previews. It's synchronicity because I had just posted Strategy: Biggest Performance Impact Is To Reduce The Number Of HTTP Requests and one of the major ideas behind the design of Instant Previews is to reduce the number of HTTP requests through a few well chosen tricks. Cosmic! Some of what Google does to reduce HTTP requests: Data URIs, which are base64 encodings of image data, are used instead of static images that are served from the server. This means the whole preview can be pieced together from image slices in one request as both the data and the image are returned in the same request. Google found that even though base64 encoding adds about 33% to the size of the image, tests showed that gzip-compressed data URIs are compara
4 0.89493263 854 high scalability-2010-07-09-Hot Scalability Links for July 9, 2010
Introduction: Facebook serves 3 billion Like buttons a day says VentureBeat. CloudScaling reports: Rumor Mill: Google EC2 Competitor Coming in 2010? It looks like GAE for PaaS and an EC2 clone for IaaS. Tweets of gold: alandipert : scalability is a drug seldo : Scalability lesson #23: if any part of your system involves a list that gets bigger over time, eventually that list will become too big. obfuscurity : Her: "Go look at the pictures on the database." Me: "You mean our fileserver?" Her: "Whatever." luiscab : Ouch, I just read on an Info Mgmt rag that Hadoop could easily be an acronym for "Heck, Another Darn Obscure Open-source Project." sanity : Depressed about how much time I've had to spend searching for the right database solution for a new project. Each has it's flaws ioshints : You cannot take a car, grow it 10 times and expect to get a mining truck. A contentious thread on Hacker News: Mong
5 0.89411181 1022 high scalability-2011-04-13-Paper: NoSQL Databases - NoSQL Introduction and Overview
Introduction: Christof Strauch, from Stuttgart Media University, has written an incredible 120+ page paper titled NoSQL Databases as an introduction and overview to NoSQL databases . The paper was written between 2010-06 and 2011-02, so it may be a bit out of date, but if you are looking to take in the NoSQL world in one big gulp, this is your chance. I asked Christof to give us a short taste of what he was trying to accomplish in his paper: The paper aims at giving a systematic and thorough introduction and overview of the NoSQL field by assembling information dispersed among blogs, wikis and scientific papers. It firstly discusses reasons, rationales and motives for the development and usage of nonrelational database systems. These can be summarized by the need for high scalability, the processing of large amounts of data, the ability to distribute data among many (often commodity) servers, consequently a distribution-aware design of DBMSs. The paper then introduces fundamental concepts,
6 0.89389729 45 high scalability-2007-07-30-Product: SmarterStats
7 0.8795045 446 high scalability-2008-11-18-Scalability Perspectives #2: Van Jacobson – Content-Centric Networking
8 0.87089026 1394 high scalability-2013-01-25-Stuff The Internet Says On Scalability For January 25, 2013
9 0.85734564 67 high scalability-2007-08-17-What is the best hosting option?
same-blog 10 0.85233265 659 high scalability-2009-07-20-A Scalability Lament
11 0.82542753 1322 high scalability-2012-09-14-Stuff The Internet Says On Scalability For September 14, 2012
12 0.81275743 479 high scalability-2008-12-29-Platform virtualization - top 25 providers (software, hardware, combined)
13 0.807643 759 high scalability-2010-01-11-Strategy: Don't Use Polling for Real-time Feeds
14 0.79716891 815 high scalability-2010-04-27-Paper: Dapper, Google's Large-Scale Distributed Systems Tracing Infrastructure
15 0.75889653 1565 high scalability-2013-12-16-22 Recommendations for Building Effective High Traffic Web Software
16 0.75515729 245 high scalability-2008-02-12-Product: rPath - Creating and Managing Virtual Appliances
17 0.74638098 1180 high scalability-2012-01-24-The State of NoSQL in 2012
18 0.74126953 1189 high scalability-2012-02-07-Hypertable Routs HBase in Performance Test -- HBase Overwhelmed by Garbage Collection
19 0.73200184 711 high scalability-2009-09-22-How Ravelry Scales to 10 Million Requests Using Rails
20 0.73148876 1236 high scalability-2012-04-30-Masstree - Much Faster than MongoDB, VoltDB, Redis, and Competitive with Memcached