high_scalability high_scalability-2007 high_scalability-2007-120 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Colin Charles has a cool picture showing Flickr's message telling him they'll need about 15 minutes to move his 11,500 images to another shard. One, that's a lot of pictures! Two, it just goes to show you don't have to make this stuff complicated. Sure, it might be nice if their infrastructure could auto-balance shards with no downtime and no loss of performance, but do you really need all that extra complexity? The manual system works, and though Colin would probably have liked his service to stay up, I am sure his day will still be a pleasant one.
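Since the post only shows the user-facing message, here is a minimal sketch of the general pattern it implies: flag the account, copy its rows to the new shard, flip the shard mapping, then clean up. This is not Flickr's code; the table names, the user_shard lookup store, and the use of sqlite3 in-memory databases as stand-in shards are all assumptions made purely for illustration.

```python
import sqlite3

# Global lookup store: which shard each user lives on, plus a "moving" flag
# (the flag is what would drive the "we're moving your photos" message).
lookup = sqlite3.connect(":memory:")
lookup.execute("CREATE TABLE user_shard (user INTEGER PRIMARY KEY, shard TEXT, moving INTEGER)")
lookup.execute("INSERT INTO user_shard VALUES (42, 'shard1', 0)")

# Two in-memory databases stand in for the source and destination shards.
shards = {"shard1": sqlite3.connect(":memory:"), "shard2": sqlite3.connect(":memory:")}
for db in shards.values():
    db.execute("CREATE TABLE photos (id INTEGER PRIMARY KEY, owner INTEGER, title TEXT)")
shards["shard1"].executemany("INSERT INTO photos VALUES (?, ?, ?)",
                             [(i, 42, "photo-%d" % i) for i in range(11500)])

def move_user(user, dst):
    # 1. Flag the account so the app can show the "about 15 minutes" notice
    #    and hold writes while the copy runs.
    lookup.execute("UPDATE user_shard SET moving = 1 WHERE user = ?", (user,))
    src = lookup.execute("SELECT shard FROM user_shard WHERE user = ?", (user,)).fetchone()[0]

    # 2. Copy the user's rows to the destination shard -- the slow, size-dependent part.
    rows = shards[src].execute("SELECT id, owner, title FROM photos WHERE owner = ?", (user,)).fetchall()
    shards[dst].executemany("INSERT INTO photos VALUES (?, ?, ?)", rows)

    # 3. Flip the mapping so all new reads and writes hit the destination shard,
    #    then clear the flag to bring the account back online.
    lookup.execute("UPDATE user_shard SET shard = ?, moving = 0 WHERE user = ?", (dst, user))

    # 4. Remove the stale copy from the old shard.
    shards[src].execute("DELETE FROM photos WHERE owner = ?", (user,))

move_user(42, "shard2")
print(lookup.execute("SELECT shard FROM user_shard WHERE user = 42").fetchone())   # ('shard2',)
print(shards["shard2"].execute("SELECT COUNT(*) FROM photos").fetchone())          # (11500,)
```

The trade-off the post is pointing at lives in step 2: an automated rebalancer would have to do that copy while still accepting writes, which is where most of the extra complexity comes from.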
sentIndex sentText sentNum sentScore
1 Colin Charles has a cool picture showing Flickr's message telling him they'll need about 15 minutes to move his 11,500 images to another shard. [sent-1, score-0.974]
2 Two, it just goes to show you don't have to make this stuff complicated. [sent-3, score-0.397]
3 Sure, it might be nice if their infrastructure could auto-balance shards with no downtime and no loss of performance, but do you really need all that extra complexity? [sent-4, score-1.123]
4 The manual system works, and though Colin would probably have liked his service to stay up, I am sure his day will still be a pleasant one. [sent-5, score-1.336]
wordName wordTfidf (topN-words)
[('colin', 0.629), ('pleasant', 0.306), ('pictures', 0.217), ('sure', 0.216), ('shards', 0.187), ('loss', 0.184), ('picture', 0.182), ('manual', 0.167), ('extra', 0.16), ('images', 0.152), ('message', 0.142), ('complexity', 0.133), ('show', 0.122), ('stuff', 0.12), ('nice', 0.12), ('minutes', 0.12), ('probably', 0.114), ('though', 0.111), ('goes', 0.109), ('cool', 0.108), ('move', 0.101), ('need', 0.087), ('works', 0.086), ('day', 0.085), ('another', 0.082), ('might', 0.081), ('infrastructure', 0.077), ('still', 0.069), ('one', 0.067), ('go', 0.067), ('service', 0.064), ('two', 0.063), ('really', 0.062), ('could', 0.06), ('lot', 0.056), ('would', 0.048), ('make', 0.046), ('performance', 0.043), ('system', 0.038), ('time', 0.038), ('like', 0.032)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 120 high scalability-2007-10-11-How Flickr Handles Moving You to Another Shard
Introduction: Colin Charles has a cool picture showing Flickr's message telling him they'll need about 15 minutes to move his 11,500 images to another shard. One, that's a lot of pictures! Two, it just goes to show you don't have to make this stuff complicated. Sure, it might be nice if their infrastructure could auto-balance shards with no downtime and no loss of performance, but do you really need all that extra complexity? The manual system works, and though Colin would probably have liked his service to stay up, I am sure his day will still be a pleasant one.
2 0.11220222 1387 high scalability-2013-01-15-More Numbers Every Awesome Programmer Must Know
Introduction: Colin Scott , a Berkeley researcher, updated Jeff Dean’s famous Numbers Everyone Should Know with his Latency Numbers Every Programmer Should Know interactive graphic. The interactive aspect is cool because it has a slider that lets you see numbers from as early as 1990 to the far far future of 2020. Colin explained his motivation for updating the numbers : The other day, a friend mentioned a latency number to me, and I realized that it was an order of magnitude smaller than what I had memorized from Jeff’s talk. The problem, of course, is that hardware performance increases exponentially! After some digging, I actually found that the numbers Jeff quotes are over a decade old. Since numbers without interpretation are simply data, take a look at Google Pro Tip: Use Back-Of-The-Envelope-Calculations To Choose The Best Design . The idea is that back-of-the-envelope calculations are estimates you create using a combination of thought experiments and common perfor
3 0.10271957 920 high scalability-2010-10-15-Troubles with Sharding - What can we learn from the Foursquare Incident?
Introduction: For everything given something seems to be taken. Caching is a great scalability solution, but caching also comes with problems . Sharding is a great scalability solution, but as Foursquare recently revealed in a post-mortem about their 17 hours of downtime, sharding also has problems. MongoDB, the database Foursquare uses, contributed their own post-mortem of what went wrong. Now that everyone has shared and resharded, what can we learn to help us skip these mistakes and quickly move on to a different set of mistakes? First, like for Facebook , huge props to Foursquare and MongoDB for being upfront and honest about their problems. This helps everyone get better and is a sign we work in a pretty cool industry. Second, overall, the fault didn't flow from evil hearts or gross negligence. As usual the cause was more mundane: a key system, that could be a little more robust, combined with a very popular application built by a small group of people, under immense pressure
4 0.09119077 666 high scalability-2009-07-30-Learn How to Think at Scale
Introduction: Aaron Kimball of Cloudera gives a wonderful 23 minute presentation titled Cloudera Hadoop Training: Thinking at Scale, which talks about "common challenges and general best practices for scaling with your data." As a company Cloudera offers "enterprise-level support to users of Apache Hadoop." Part of that offering is a really useful series of tutorial videos on the Hadoop ecosystem . Like TV lawyer Perry Mason (or is it Harmon Rabb?), Aaron gradually builds his case. He opens with the problem of storing lots of data. Then a blistering cross examination of the problem of building distributed systems to analyze that data sets up a powerful closing argument. With so much testimony behind him, on closing Aaron really brings it home with why shared-nothing systems like map-reduce are the right way to query lots of data. The jury loved it. Here's the video Thinking at Scale . And here's a summary of some of the lessons learned from the talk: Lessons Learned
5 0.089671686 808 high scalability-2010-04-12-Poppen.de Architecture
Introduction: This is a guest post by Alvaro Videla describing their architecture for Poppen.de , a popular German dating site. This site is very much NSFW, so be careful before clicking on the link. What I found most interesting is how they manage to successfully blend a little of the old with a little of the new, using technologies like Nginx, MySQL, CouchDB, Erlang, Memcached, RabbitMQ, PHP, Graphite, Red5, and Tsung. What is Poppen.de? Poppen.de (NSFW) is the top dating website in Germany, and while it may be a small site compared to giants like Flickr or Facebook, we believe it's a nice architecture to learn from if you are starting to get some scaling problems. The Stats 2.000.000 users 20.000 concurrent users 300.000 private messages per day 250.000 logins per day We have a team of eleven developers, two designers and two sysadmins for this project. Business Model The site works with a freemium model, where users can do for free things like: Search
6 0.08770401 24 high scalability-2007-07-24-Product: Hibernate Shards
7 0.085448995 1440 high scalability-2013-04-15-Scaling Pinterest - From 0 to 10s of Billions of Page Views a Month in Two Years
8 0.085262686 358 high scalability-2008-07-26-Sharding the Hibernate Way
9 0.084356368 557 high scalability-2009-04-06-A picture is realy worth a thousand word, and also a window in time...
10 0.078088209 319 high scalability-2008-05-14-Scaling an image upload service
11 0.075247623 672 high scalability-2009-08-06-An Unorthodox Approach to Database Design : The Coming of the Shard
12 0.073239863 1048 high scalability-2011-05-27-Stuff The Internet Says On Scalability For May 27, 2011
13 0.072821371 5 high scalability-2007-07-10-mixi.jp Architecture
14 0.072561868 152 high scalability-2007-11-13-Flickr Architecture
15 0.069075406 1268 high scalability-2012-06-20-Ask HighScalability: How do I organize millions of images?
16 0.068892218 1141 high scalability-2011-11-11-Stuff The Internet Says On Scalability For November 11, 2011
17 0.066553958 38 high scalability-2007-07-30-Build an Infinitely Scalable Infrastructure for $100 Using Amazon Services
18 0.06635648 291 high scalability-2008-03-29-20 New Rules for Faster Web Pages
19 0.065756746 645 high scalability-2009-06-30-Hot New Trend: Linking Clouds Through Cheap IP VPNs Instead of Private Lines
20 0.06575197 1589 high scalability-2014-02-03-How Google Backs Up the Internet Along With Exabytes of Other Data
topicId topicWeight
[(0, 0.103), (1, 0.064), (2, -0.017), (3, -0.016), (4, -0.006), (5, -0.027), (6, -0.009), (7, 0.022), (8, 0.017), (9, -0.033), (10, -0.025), (11, 0.013), (12, -0.01), (13, -0.014), (14, 0.04), (15, 0.006), (16, -0.003), (17, 0.004), (18, -0.017), (19, 0.034), (20, 0.0), (21, -0.019), (22, 0.009), (23, -0.004), (24, -0.002), (25, 0.048), (26, 0.022), (27, -0.018), (28, -0.03), (29, 0.018), (30, 0.006), (31, 0.017), (32, 0.019), (33, -0.005), (34, 0.036), (35, -0.008), (36, 0.013), (37, 0.034), (38, 0.019), (39, 0.015), (40, -0.03), (41, 0.019), (42, -0.005), (43, 0.013), (44, -0.031), (45, 0.007), (46, -0.006), (47, 0.032), (48, -0.036), (49, -0.031)]
simIndex simValue blogId blogTitle
same-blog 1 0.94963175 120 high scalability-2007-10-11-How Flickr Handles Moving You to Another Shard
Introduction: Colin Charles has a cool picture showing Flickr's message telling him they'll need about 15 minutes to move his 11,500 images to another shard. One, that's a lot of pictures! Two, it just goes to show you don't have to make this stuff complicated. Sure, it might be nice if their infrastructure could auto-balance shards with no downtime and no loss of performance, but do you really need all that extra complexity? The manual system works, and though Colin would probably have liked his service to stay up, I am sure his day will still be a pleasant one.
2 0.72232896 76 high scalability-2007-08-29-Skype Failed the Boot Scalability Test: Is P2P fundamentally flawed?
Introduction: Skype's 220 million users lost service for a stunning two days. The primary cause for Skype's nightmare (can you imagine the beeper storm that went off?) was a massive global roll-out of a Windows patch triggering the simultaneous reboot of millions of machines across the globe. The secondary cause was a bug in Skype's software that prevented "self-healing" in the face of such attacks. The flood of log-in requests and a lack of "peer-to-peer resources" melted their system. Whose fault is it? Is Skype to blame? Is Microsoft to blame? Or is the peer-to-peer model itself fundamentally flawed in some way? Let's be real, how could Skype possibly test booting 220 million servers over a random configuration of resources? Answer: they can't. Yes, it's Skype's responsibility, but they are in a bit of a pickle on this one. The boot scenario is one of the most basic and one of the most difficult scalability scenarios to plan for and test. You can't simulate the viciousness of real-life
3 0.71214348 347 high scalability-2008-07-07-Five Ways to Stop Framework Fixation from Crashing Your Scaling Strategy
Introduction: If you've wondered why I haven't been posting lately it's because I've been on an amazing Beach's motorcycle tour of the Alps. My wife (Linda) and I rode two-up on a BMW 1200 GS through the alps in Germany, Austria, Switzerland, Italy, Slovenia, and Lichtenstein. The trip was more beautiful than I ever imagined. We rode challenging mountain pass after mountain pass, froze in the rain, baked in the heat, woke up on excellent Italian coffee, ate slice after slice of tasty apple strudel, drank dazzling local wines, smelled the fresh cut grass as the Swiss en masse cut hay for the winter feeding of their dairy cows, rode the amazing Munich train system, listened as cow bells tinkled like wind chimes throughout small valleys, drank water from a pure alpine spring on a blisteringly hot hike, watched local German folk dancers represent their regions, and had fun in the company of fellow riders. Magical. They say you'll ride more
4 0.71184158 1027 high scalability-2011-04-20-Packet Pushers: How to Build a Low Cost Data Center
Introduction: The main thrust of the Packet Pushers Show 41 episode was to reveal and ruminate over the horrors of a successful attack on RSA , which puts the whole world's security complex at risk. Near the end, at about 46 minutes in, there was an excellent section on how to go about building out a low cost datacenter. Who cares? Well, someone emailed me this exact same question a while back and I had a pretty useless response. So here's making up for that by summarizing the recommendations from the elite Packet Pushers cabal: Look at Arista and Juniper. Juniper has a range of stackable switches, which includes some 10 gig. If your budget can stretch for it they might make a good deal on their new QFX proto-fabric product. You can't get a full sized fabric solution, but you can get a few switches together to make a two port fabric. Good solution if you are running 10 gig and only need 30 or 40 10 gig ports. Thinks Juniper would make a good deal in order to get a few re
5 0.71158105 1506 high scalability-2013-08-23-Stuff The Internet Says On Scalability For August 23, 2013
Introduction: Hey, it's HighScalability time: ( Parkour is to terrain as programming is to frameworks ) 5x : AWS vs combined size of other cloud vendors; Every Second on The Internet : Why we need so many servers. Quotable Quotes: @chaliy : Today I learned that I do not understand how #azure scaling works, instance scale does not affect requests/sec I can load. @Lariar : Note how crazy this is. An international launch would have been a huge deal. Now it's just another thing you do. smacktoward : The problem with relying on donations is that people don't make donations. @toddhoffious : Programming is a tool built by logical positivists to solve the problems of idealists and pragmatists. We have a fundamental mismatch here. @etherealmind : Me: "Weird, my phone data isn't working" Them: "They turned the 3G off at the tower because it interferes with the particle accelerator" John Carmack : In computer science, just about t
7 0.69484162 1388 high scalability-2013-01-16-What if Cars Were Rented Like We Hire Programmers?
8 0.68478626 1503 high scalability-2013-08-19-What can the Amazing Race to the South Pole Teach us About Startups?
9 0.68351495 1140 high scalability-2011-11-10-Kill the Telcos Save the Internet - The Unsocial Network
10 0.68198824 1589 high scalability-2014-02-03-How Google Backs Up the Internet Along With Exabytes of Other Data
11 0.68065327 1617 high scalability-2014-03-21-Stuff The Internet Says On Scalability For March 21st, 2014
12 0.67943782 1492 high scalability-2013-07-17-How do you create a 100th Monkey software development culture?
13 0.67632431 1209 high scalability-2012-03-14-The Azure Outage: Time Is a SPOF, Leap Day Doubly So
14 0.67461175 1573 high scalability-2014-01-06-How HipChat Stores and Indexes Billions of Messages Using ElasticSearch and Redis
15 0.67371261 1649 high scalability-2014-05-16-Stuff The Internet Says On Scalability For May 16th, 2014
16 0.67210507 1477 high scalability-2013-06-18-Scaling Mailbox - From 0 to One Million Users in 6 Weeks and 100 Million Messages Per Day
17 0.67164546 165 high scalability-2007-11-26-Scale to China
18 0.67142326 1584 high scalability-2014-01-22-How would you build the next Internet? Loons, Drones, Copters, Satellites, or Something Else?
19 0.66607445 1500 high scalability-2013-08-12-100 Curse Free Lessons from Gordon Ramsay on Building Great Software
20 0.66356432 1438 high scalability-2013-04-10-Check Yourself Before You Wreck Yourself - Avocado's 5 Early Stages of Architecture Evolution
topicId topicWeight
[(1, 0.055), (2, 0.308), (30, 0.108), (61, 0.02), (79, 0.122), (99, 0.229)]
simIndex simValue blogId blogTitle
1 0.93228376 1350 high scalability-2012-10-29-Gone Fishin' Two
Introduction: Well, not exactly Fishin', I'll be on vacation starting today and I'll be back in late November. I won't be posting anything new, so we'll all have a break. Disappointing, I know, but fear not, I will be posting some oldies for your re-enjoyment. And if you've ever wanted to write an article for HighScalability, this would be a great time :-) I especially need help on writing Stuff the Internet Says on Scalability as I will be reading the Interwebs on a much reduced schedule. Shock! Horror! So if the spirit moves you, please write something. My connectivity in Italy will probably be good, so I will check in and approve articles on a regular basis. Ciao...
2 0.93208003 1128 high scalability-2011-09-30-Gone Fishin'
Introduction: Well, not exactly Fishin', I'll be on vacation starting today and I'll be back in mid October. I won't be posting, so we'll all have a break. Disappointing, I know. If you've ever wanted to write an article for HighScalability, this would be a great time :-) I especially need help on writing Stuff the Internet Says on Scalability as I won't even be reading the Interwebs. Shock! Horror! So if the spirit moves you, please write something. My connectivity in South Africa is unknown, but I will check in and approve articles when I can. See you on down the road...
same-blog 3 0.89025879 120 high scalability-2007-10-11-How Flickr Handles Moving You to Another Shard
Introduction: Colin Charles has a cool picture showing Flickr's message telling him they'll need about 15 minutes to move his 11,500 images to another shard. One, that's a lot of pictures! Two, it just goes to show you don't have to make this stuff complicated. Sure, it might be nice if their infrastructure could auto-balance shards with no downtime and no loss of performance, but do you really need all that extra complexity? The manual system works, and though Colin would probably have liked his service to stay up, I am sure his day will still be a pleasant one.
4 0.88503253 1653 high scalability-2014-05-23-Gone Fishin' 2014
Introduction: Well, not exactly Fishin', but I'll be on a month long vacation starting today. I won't be posting new content, so we'll all have a break. Disappointing, I know. If you've ever wanted to write an article for HighScalability this would be a great time :-) I'd be very interested in your experiences with containers vs VMs if you have some thoughts on the subject. So if the spirit moves you, please write something. See you on down the road...
5 0.86919576 1367 high scalability-2012-12-05-5 Ways to Make Cloud Failure Not an Option
Introduction: With cloud SLAs generally being worth what you don't pay for them, what can you do to protect yourself? Sean Hull in AirBNB didn’t have to fail has some solid advice on how to deal with outages: Use Redundancy . Make database and webserver tiers redundant using multi-az or alternately read-replicas. Have a browsing only mode . Give users a read-only version of your site. Users may not even notice failures as they will only see problems when they need to perform a write operation. Web Applications need Feature Flags . Build in the ability to turn off and on major parts of your site and flip the switch when problems arise. Consider Netflix’s Simian . By randomly causing outages in your application you can continually test your failover and redundancy infrastructure. Use multiple clouds . Use Redundant Arrays of Inexpensive Clouds as a way of surviving outages in any one particular cloud. None of these are easy and it's worth considering that your application may
6 0.85054743 478 high scalability-2008-12-29-Paper: Spamalytics: An Empirical Analysisof Spam Marketing Conversion
7 0.83113569 374 high scalability-2008-08-30-Paper: GargantuanComputing—GRIDs and P2P
8 0.82490689 464 high scalability-2008-12-13-Strategy: Facebook Tweaks to Handle 6 Time as Many Memcached Requests
9 0.81940889 721 high scalability-2009-10-13-Why are Facebook, Digg, and Twitter so hard to scale?
10 0.80985695 252 high scalability-2008-02-18-limit on the number of databases open
11 0.80763251 1301 high scalability-2012-08-08-3 Tips and Tools for Creating Reliable Billion Page View Web Services
12 0.80704176 907 high scalability-2010-09-23-Working With Large Data Sets
13 0.79335928 359 high scalability-2008-07-29-Ehcache - A Java Distributed Cache
14 0.7927314 844 high scalability-2010-06-18-Paper: The Declarative Imperative: Experiences and Conjectures in Distributed Logic
15 0.7915206 1387 high scalability-2013-01-15-More Numbers Every Awesome Programmer Must Know
16 0.79126847 230 high scalability-2008-01-29-Speed up (Oracle) database code with result caching
17 0.79090524 1321 high scalability-2012-09-12-Using Varnish for Paywalls: Moving Logic to the Edge
18 0.79045671 1266 high scalability-2012-06-18-Google on Latency Tolerant Systems: Making a Predictable Whole Out of Unpredictable Parts
19 0.78993887 910 high scalability-2010-09-30-Facebook and Site Failures Caused by Complex, Weakly Interacting, Layered Systems
20 0.7881037 1568 high scalability-2013-12-23-What Happens While Your Brain Sleeps is Surprisingly Like How Computers Stay Sane