high_scalability high_scalability-2010 high_scalability-2010-917 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Robert Haas, in his SURGE Recap of the Surge conference, reflected a bit and came up with an interesting checklist of general themes from what he was seeing. I'm directly quoting his post, so please see the post for a full discussion. He uses this framework to think about the larger picture and where PostgreSQL stands in its progression. Make use of the academic literature. Inventing your own way to do something is fine, but at least consider the possibility that someone smarter than you has thought about this problem before. Failures are inevitable, so plan for them. Try to minimize the possibility of cascading failures, and plan in advance how you can operate in degraded mode if disaster (or the Slashdot effect) strikes. Disk technology matters. Drive firmware bugs are common and nightmarish, and you can expect very limited help from the manufacturer, especially if the drive is billed as consumer-grade rather than enterprise-grade. SSDs can save you a lot of m
sentIndex sentText sentNum sentScore
1 Robert Haas, in his SURGE Recap of the Surge conference, reflected a bit and came up with an interesting checklist of general themes from what he was seeing. [sent-1, score-0.575]
2 I'm directly quoting his post, so please see the post for a full discussion. [sent-2, score-0.297]
3 He uses this framework to think about the larger picture and where PostgreSQL stands in its progression. [sent-3, score-0.315]
4 Inventing your own way to do something is fine, but at least consider the possibility that someone smarter than you has thought about this problem before. [sent-5, score-0.462]
5 Try to minimize the possibility of cascading failures, and plan in advance how you can operate in degraded mode if disaster (or the Slashdot effect) strikes. [sent-7, score-1.241]
6 Drive firmware bugs are common and nightmarish, and you can expect very limited help from the manufacturer, especially if the drive is billed as consumer-grade rather than enterprise-grade. [sent-9, score-0.726]
7 SSDs can save you a lot of money, both because a given number of dollars buys more IOs-per-second, and because electricity isn't free. [sent-10, score-0.442]
8 Large data sets require horizontal scalability. [sent-11, score-0.179]
9 In the era of 1TB drives, "large" doesn't mean quite what it used to, but even though the amount of data you can manage with one machine is growing all the time, the amount of data people want to manage is growing even faster. [sent-12, score-0.943]
wordName wordTfidf (topN-words)
[('possibility', 0.265), ('quoting', 0.212), ('firmware', 0.199), ('manufacturer', 0.19), ('haas', 0.177), ('recap', 0.177), ('slashdot', 0.173), ('billed', 0.173), ('checklist', 0.168), ('buys', 0.168), ('reflected', 0.168), ('electricity', 0.165), ('degraded', 0.161), ('plan', 0.157), ('themes', 0.151), ('inevitable', 0.148), ('academic', 0.146), ('stands', 0.141), ('cascading', 0.135), ('growing', 0.133), ('smarter', 0.127), ('advance', 0.125), ('amount', 0.124), ('manage', 0.123), ('robert', 0.12), ('bugs', 0.119), ('era', 0.112), ('postgresql', 0.109), ('ssds', 0.109), ('dollars', 0.109), ('disaster', 0.105), ('mode', 0.104), ('picture', 0.103), ('effect', 0.099), ('horizontal', 0.099), ('operate', 0.098), ('drives', 0.094), ('fine', 0.094), ('minimize', 0.091), ('came', 0.088), ('post', 0.085), ('drive', 0.084), ('failures', 0.082), ('sets', 0.08), ('limited', 0.077), ('expect', 0.074), ('mean', 0.071), ('larger', 0.071), ('bit', 0.07), ('consider', 0.07)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 917 high scalability-2010-10-08-4 Scalability Themes from Surgecon
2 0.11408988 150 high scalability-2007-11-12-Slashdot Architecture - How the Old Man of the Internet Learned to Scale
Introduction: Slashdot effect: overwhelming unprepared sites with an avalanche of readers' clicks after being mentioned on Slashdot. Sure, we now have the "Digg effect" and other hot new stars, but Slashdot was the original. And like many stars from generations past, Slashdot plays the elder statesman's role with class, dignity, and restraint. Yet with millions and millions of users Slashdot is still box office gold and more than keeps up with the young'ins. And with age comes the wisdom of learning how to handle all those users. Just how does Slashdot scale and what can you learn by going old school? Site: http://slashdot.org Information Sources Slashdot's Setup, Part 1- Hardware Slashdot's Setup, Part 2- Software History of Slashdot Part 3- Going Corporate The History of Slashdot Part 4 - Yesterday, Today, Tomorrow The Platform MySQL Linux (CentOS/RHEL) Pound Apache Perl Memcached LVS The Stats Started building the system in 1999
3 0.092528865 1291 high scalability-2012-07-25-Vertical Scaling Ascendant - How are SSDs Changing Architectures?
Introduction: With Amazon announcing new High I/O 2TB SSD instances, the age of SSD has almost arrived. I say almost because the $27K a year price tag for the hi1.4xlarge on demand instance type is outside the budget of many. Yet even at the full on demand rate the price per IOP for the high IO instance is attractive: 27 cents per IOP ($27K/100K IOPS) vs. $1.25 for disk. With the obvious benefits of giant SSD machines combined with 10 Gbps networking, it’s interesting to consider: what architecture decisions might you make differently in the future? More Headroom for Vertical Scaling Simplifies Everything The beauty of higher hardware performance is it shifts effort away from the programmer which allows developers to focus on the business of business, minimizing trickeration. This has always been the allure of vertical scaling and is well realized by SSDs through a combination of high throughput, low latencies, and just as important, high densities. We have a few earl
4 0.081276968 304 high scalability-2008-04-19-How to build a real-time analytics system?
Introduction: Hello everybody! I am a developer of a website with a lot of traffic. Right now we are managing the whole website using perl + postgresql + fastcgi + memcached + mogileFS + lighttpd + roundrobin DNS distributed over 5 servers and I must say it works like a charm, load is stable and everything works very fast and we are recording about 8 million pageviews per day. The only problem is with postgres database since we have it installed only on one server and if this server goes down, the whole "cluster" goes down. That's why we have a master2slave replication so we still have a backup database except that when the master goes down, all inserts/updates are disabled so the whole website is just read only. But this is not a problem since this configuration is working for us and we don't have any problems with it. Right now we are planning to build our own analytics service that would be customized for our needs. We tried various different software packages but were not satisfi
Introduction: All in all this is still my favorite post and I still think it's an accurate vision of a future. Not everyone agrees, but I guess we'll see... "But it is not complicated. [There's] just a lot of it." -- Richard Feynman, on how the immense variety of the world arises from simple rules. Contents: Have We Reached the End of Scaling?; Applications Become Black Boxes Using Markets to Scale and Control Costs; Let's Welcome our Neo-Feudal Overlords; The Economic Argument for the Ambient Cloud; What Will Kill the Cloud?; The Amazing Collective Compute Power of the Ambient Cloud; Using the Ambient Cloud as an Application Runtime; Applications as Virtual States; Conclusion. We have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications. Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but c
7 0.071452186 666 high scalability-2009-07-30-Learn How to Think at Scale
8 0.070364311 361 high scalability-2008-08-08-Separation into read-write only databases
9 0.069747359 1384 high scalability-2013-01-09-The Story of How Turning Disk Into a Service Lead to a Deluge of Density
10 0.069579892 276 high scalability-2008-03-15-New Website Design Considerations
11 0.069062933 1066 high scalability-2011-06-22-It's the Fraking IOPS - 1 SSD is 44,000 IOPS, Hard Drive is 180
12 0.067460939 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
13 0.067265764 1420 high scalability-2013-03-08-Stuff The Internet Says On Scalability For March 8, 2013
14 0.067161903 334 high scalability-2008-05-29-Amazon Improves Diagonal Scaling Support with High-CPU Instances
15 0.067086339 920 high scalability-2010-10-15-Troubles with Sharding - What can we learn from the Foursquare Incident?
16 0.067015789 1430 high scalability-2013-03-27-The Changing Face of Scale - The Downside of Scaling in the Contextual Age
17 0.066573247 1369 high scalability-2012-12-10-Switch your databases to Flash storage. Now. Or you're doing it wrong.
18 0.063910328 1313 high scalability-2012-08-28-Making Hadoop Run Faster
19 0.06350074 1588 high scalability-2014-01-31-Stuff The Internet Says On Scalability For January 31st, 2014
20 0.063336708 375 high scalability-2008-09-01-A Scalability checklist?
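The price-per-IOP figure quoted in the SSD excerpt in the list above ($27K a year for roughly 100K IOPS) can be checked with a few lines of arithmetic. This is only a sketch: the function name is hypothetical, and the $1.25 disk figure is taken from the excerpt as quoted, without knowing exactly how it was derived.

```python
def price_per_iop(annual_cost_dollars: float, iops: float) -> float:
    """Return dollars per IOP for a given yearly cost (hypothetical helper)."""
    return annual_cost_dollars / iops


# Figures quoted in the excerpt: $27K per year, about 100K IOPS.
ssd_price = price_per_iop(27_000, 100_000)

# Disk figure as quoted in the excerpt; its basis is an assumption here.
disk_price = 1.25

print(f"SSD:  ${ssd_price:.2f} per IOP")
print(f"Disk: ${disk_price:.2f} per IOP")
print(f"Disk costs about {disk_price / ssd_price:.1f}x more per IOP")
```

The same helper can be reused for other instance types by substituting their yearly cost and rated IOPS.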
topicId topicWeight
[(0, 0.113), (1, 0.057), (2, 0.003), (3, 0.009), (4, 0.003), (5, -0.017), (6, -0.009), (7, 0.016), (8, 0.013), (9, -0.033), (10, -0.023), (11, -0.011), (12, -0.01), (13, 0.016), (14, 0.041), (15, -0.011), (16, 0.004), (17, -0.008), (18, -0.029), (19, 0.036), (20, -0.019), (21, -0.016), (22, 0.011), (23, 0.02), (24, -0.015), (25, 0.017), (26, 0.022), (27, -0.008), (28, -0.039), (29, 0.008), (30, -0.006), (31, -0.015), (32, 0.024), (33, 0.028), (34, -0.049), (35, 0.012), (36, -0.009), (37, -0.017), (38, 0.014), (39, 0.001), (40, 0.011), (41, 0.012), (42, 0.004), (43, -0.003), (44, 0.032), (45, -0.012), (46, -0.023), (47, -0.046), (48, -0.04), (49, -0.043)]
simIndex simValue blogId blogTitle
same-blog 1 0.9470017 917 high scalability-2010-10-08-4 Scalability Themes from Surgecon
2 0.82736194 1027 high scalability-2011-04-20-Packet Pushers: How to Build a Low Cost Data Center
Introduction: The main thrust of the Packet Pushers Show 41 episode was to reveal and ruminate over the horrors of a successful attack on RSA, which puts the whole world security complex at risk. Near the end, at about 46 minutes in, there was an excellent section on how to go about building out a low cost datacenter. Who cares? Well, someone emailed me this exact same question a while back and I had a pretty useless response. So here's making up for that by summarizing the recommendations from the elite Packet Pushers cabal: Look at Arista and Juniper. Juniper has a range of stackable switches, which includes some 10 gig. If your budget can stretch for it they might make a good deal on their new QFX proto-fabric product. You can't get a full sized fabric solution, but you can get a few switches together to make a two port fabric. Good solution if you are running 10 gig and only need 30 or 40 10 gig ports. Thinks Juniper would make a good deal in order to get a few re
3 0.80143976 777 high scalability-2010-02-15-Scaling Ambition at StackOverflow
Introduction: Joel Spolsky and Jeff Atwood are raising VC money for StackOverflow. This is interesting for three reasons: 1) Joel has always seemed like a keep it small and grow organically type of guy, so this is a big step in a different direction. 2) It means they think there's a very big market in the Q&A space and they mean to capture as much of the market as possible. 3) Most importantly for this blog, Joel gives some good advice on when to stay fresh and local and when it's time to jump for the brass ring, scale up your ambition, and go for VC money. Please see Joel's blog post for the details, but here's when to go VC: There’s a land grab going on. There is a provable concept that’s repeatable. The business itself could benefit from the publicity. The investor will add substantial value to the business. The business can potentially have a big exit or become a large, publicly traded company. The founders are not in it for their own personal aggrandizement. Joel t
4 0.80111545 311 high scalability-2008-04-29-Strategy: Sample to Reduce Data Set
Introduction: Update: Arjen links to the video Supporting Scalable Online Statistical Processing, which shows "rather than doing complete aggregates, use statistical sampling to provide a reasonable estimate (unbiased guess) of the result." When you have a lot of data, sampling allows you to draw conclusions from a much smaller amount of data. That's why sampling is a scalability solution. If you don't have to process all your data to get the information you need then you've made the problem smaller and you'll need fewer resources and you'll get more timely results. Sampling is not useful when you need a complete list that matches specific criteria. If you need to know the exact set of people who bought a car in the last week then sampling won't help. But if you want to know how many people bought a car then you could take a sample and then create an estimate for the full data set. The difference is you won't really know the exact car count. You'll have a confidence interval saying how confident
5 0.78696305 347 high scalability-2008-07-07-Five Ways to Stop Framework Fixation from Crashing Your Scaling Strategy
Introduction: If you've wondered why I haven't been posting lately it's because I've been on an amazing Beach's motorcycle tour of the Alps. My wife (Linda) and I rode two-up on a BMW 1200 GS through the Alps in Germany, Austria, Switzerland, Italy, Slovenia, and Liechtenstein. The trip was more beautiful than I ever imagined. We rode challenging mountain pass after mountain pass, froze in the rain, baked in the heat, woke up on excellent Italian coffee, ate slice after slice of tasty apple strudel, drank dazzling local wines, smelled the fresh cut grass as the Swiss en masse cut hay for the winter feeding of their dairy cows, rode the amazing Munich train system, listened as cow bells tinkled like wind chimes throughout small valleys, drank water from a pure alpine spring on a blisteringly hot hike, watched local German folk dancers represent their regions, and had fun in the company of fellow riders. Magical. They say you'll ride more
6 0.77397853 1410 high scalability-2013-02-20-Smart Companies Fail Because they Do Everything Right - Staying Alive to Scale
7 0.77184159 1589 high scalability-2014-02-03-How Google Backs Up the Internet Along With Exabytes of Other Data
8 0.75685006 1388 high scalability-2013-01-16-What if Cars Were Rented Like We Hire Programmers?
10 0.75029123 1503 high scalability-2013-08-19-What can the Amazing Race to the South Pole Teach us About Startups?
11 0.74638391 1199 high scalability-2012-02-27-Zen and the Art of Scaling - A Koan and Epigram Approach
12 0.74488199 1384 high scalability-2013-01-09-The Story of How Turning Disk Into a Service Lead to a Deluge of Density
13 0.74248672 1500 high scalability-2013-08-12-100 Curse Free Lessons from Gordon Ramsay on Building Great Software
15 0.73816931 1506 high scalability-2013-08-23-Stuff The Internet Says On Scalability For August 23, 2013
16 0.72846752 1492 high scalability-2013-07-17-How do you create a 100th Monkey software development culture?
17 0.72673398 1430 high scalability-2013-03-27-The Changing Face of Scale - The Downside of Scaling in the Contextual Age
18 0.72493047 747 high scalability-2009-11-26-What I'm Thankful For on Thanksgiving
19 0.72401881 23 high scalability-2007-07-24-Major Websites Down: Or Why You Want to Run in Two or More Data Centers.
20 0.72162926 1172 high scalability-2012-01-10-A Perfect Fifth of Notes on Scalability
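The sampling idea described in the "Sample to Reduce Data Set" excerpt in the list above (estimate a count from a sample and report a confidence interval instead of an exact number) might be sketched as follows. This is an illustration under stated assumptions, not the method from the linked video: the function name, the synthetic population, and the normal-approximation interval are all choices made here.

```python
import math
import random


def estimate_count(population, predicate, sample_size, seed=0):
    """Estimate how many items satisfy `predicate` from a simple random sample.

    Returns (estimate, (low, high)), where the bounds are an approximate
    95% confidence interval based on the normal approximation.
    """
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    p_hat = sum(1 for item in sample if predicate(item)) / sample_size
    margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / sample_size)
    n = len(population)
    low = max(0.0, p_hat - margin) * n
    high = min(1.0, p_hat + margin) * n
    return p_hat * n, (low, high)


# Hypothetical data: 1 means "bought a car", 0 means "did not".
population = [1] * 2_000 + [0] * 98_000
estimate, (low, high) = estimate_count(population, lambda x: x == 1, 5_000)
print(f"estimated buyers: {estimate:.0f} (95% CI roughly {low:.0f} to {high:.0f})")
```

Only 5,000 of the 100,000 records are examined, which is the point of the strategy: the answer is an estimate with an interval, not the exact list of buyers.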
topicId topicWeight
[(1, 0.058), (2, 0.224), (10, 0.043), (30, 0.243), (61, 0.14), (73, 0.021), (77, 0.029), (79, 0.068), (85, 0.012), (94, 0.064)]
simIndex simValue blogId blogTitle
1 0.94964349 261 high scalability-2008-02-25-Make Your Site Run 10 Times Faster
Introduction: This is what Mike Peters says he can do: make your site run 10 times faster. His test bed is "half a dozen servers parsing 200,000 pages per hour over 40 IP addresses, 24 hours a day." Before optimization CPU spiked to 90% with 50 concurrent connections. After optimization each machine "was effectively handling 500 concurrent connections per second with CPU at 8% and no degradation in performance." Mike identifies six major bottlenecks: (1) Database write access (read is cheaper); (2) Database read access; (3) PHP, ASP, JSP and any other server side scripting; (4) Client side JavaScript; (5) Multiple/Fat Images, scripts or css files from different domains on your page; (6) Slow keep-alive client connections, clogging your available sockets. Mike's solutions: Switch all database writes to offline processing; Minimize number of database read access to the bare minimum. No more than two queries per page; Denormalize your database and Optimize MySQL tables; Implement MemCached and cha
2 0.93075407 831 high scalability-2010-05-26-End-To-End Performance Study of Cloud Services
Introduction: Cloud computing promises a number of advantages for the deployment of data-intensive applications. Most prominently, these include reducing cost with a pay-as-you-go pricing model and (virtually) unlimited throughput by adding servers if the workload increases. At the Systems Group, ETH Zurich, we did an extensive end-to-end performance study to compare the major cloud offerings regarding their ability to fulfill these promises and their implied cost. The focus of the work is on transaction processing (i.e., read and update workloads), rather than analytics workloads. We used the TPC-W, a standardized benchmark simulating a Web-shop, as the baseline for our comparison. The TPC-W defines that users are simulated through emulated browsers (EB) and issue page requests, called web-interactions (WI), against the system. As a major modification to the benchmark, we constantly increase the load from 1 to 9000 simultaneous users to measure the scalability and cost variance of the syst
3 0.92886537 1459 high scalability-2013-05-16-Paper: Warp: Multi-Key Transactions for Key-Value Stores
Introduction: Looks like an interesting take on "a completely asynchronous, low-latency transaction management protocol, in line with the fully distributed NoSQL architecture." Warp: Multi-Key Transactions for Key-Value Stores  overview: Implementing ACID transactions has been a longstanding challenge for NoSQL systems. Because these systems are based on a sharded architecture, transactions necessarily require coordination across multiple servers. Past work in this space has relied either on heavyweight protocols such as Paxos or clock synchronization for this coordination. This paper presents a novel protocol for coordinating distributed transactions with ACID semantics on top of a sharded data store. Called linear transactions, this protocol achieves scalability by distributing the coordination task to only those servers that hold relevant data for each transaction. It achieves high performance by serializing only those transactions whose concurrent execution could potentially yield a vio
same-blog 4 0.92774242 917 high scalability-2010-10-08-4 Scalability Themes from Surgecon
5 0.91062611 991 high scalability-2011-02-16-Paper: An Experimental Investigation of the Akamai Adaptive Video Streaming
Introduction: Video is hot on the Internet and people are really interested in knowing how to make it work. Dan Rayburn has a post pointing to a fascinating paper: An Experimental Investigation of the Akamai Adaptive Video Streaming, which talks in some detail about the protocols big players like YouTube, Skype and Akamai use to serve video over an inherently video-unfriendly medium like the Internet. For Akamai they found: Each video is encoded in five versions at different bit rates and stored in separate files. The client sends commands to the server with an average inter-departure time of about 2 s, i.e. the control algorithm is executed on average every 2 seconds. Akamai uses only the video level to adapt the video source to the available bandwidth, whereas the frame rate of the video is kept constant. When a sudden drop in the available bandwidth occurs, short interruptions of the video playback can occur due to a large actuation delay. For a sudden increase of the avai
6 0.8843528 783 high scalability-2010-02-24-Hot Scalability Links for February 24, 2010
7 0.8835454 361 high scalability-2008-08-08-Separation into read-write only databases
8 0.88341677 1016 high scalability-2011-04-04-Scaling Social Ecommerce Architecture Case study
9 0.8772164 263 high scalability-2008-02-27-Product: System Imager - Automate Deployment and Installs
10 0.87187141 312 high scalability-2008-04-30-Rather small site architecture.
11 0.86620981 336 high scalability-2008-05-31-Biggest Under Reported Story: Google's BigTable Costs 10 Times Less than Amazon's SimpleDB
12 0.86441219 16 high scalability-2007-07-16-Book: High Performance MySQL
13 0.86346525 464 high scalability-2008-12-13-Strategy: Facebook Tweaks to Handle 6 Time as Many Memcached Requests
14 0.85822517 342 high scalability-2008-06-08-Search fast in million rows
15 0.85157841 308 high scalability-2008-04-22-Simple NFS failover solution with symbolic link?
16 0.84592319 500 high scalability-2009-01-22-Heterogeneous vs. Homogeneous System Architectures
17 0.84501678 1284 high scalability-2012-07-16-Cinchcast Architecture - Producing 1,500 Hours of Audio Every Day
18 0.83635396 800 high scalability-2010-03-26-Strategy: Caching 404s Saved the Onion 66% on Server Time
19 0.83527392 131 high scalability-2007-10-25-Should JSPs be avoided for high scalability?
20 0.83035296 177 high scalability-2007-12-08-thesimsonstage.ea.com