high_scalability high_scalability-2010 high_scalability-2010-869 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Jeremy Zawodny, while performing data alchemy in the dungeons of Craigslist, stored 1,250,000,000 Key/Value Pairs in Redis on a 32GB Machine. Data sorting world record: 1 terabyte, 1 minute. The system has 52 computer nodes; each node is a commodity server with two quad-core processors, 24 gigabytes (GB) of memory and sixteen 500 GB disks. It's not just hardware, though: they also built software that used all of their CPU and RAM. Tweets of Gold: wm: I am really getting the sense that none of you yokels waxing profound about scalability actually has anything factual to say. joestump: I think you can do things to *mitigate* pain points up front. You don't need to over-engineer, but it's not hard to look forward. danielcrenna: I love it when I check in debug code accidentally and it turns into a three day hunt for a major scalability problem. joestump: Your post also makes me think of another phrase I say often: Scaling == Specialization. Bigger scale =
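The two headline figures above invite some back-of-envelope arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes), assumes the whole 32 GB is available to Redis, and counts only a single pass over the sorted data; none of these assumptions come from the original posts, and the function name is purely illustrative.

```python
def back_of_envelope():
    """Rough per-item and per-device figures implied by the quoted numbers."""
    gb = 10**9  # assumption: decimal gigabytes

    # Redis: 1,250,000,000 key/value pairs on a 32 GB machine.
    redis_pairs = 1_250_000_000
    redis_ram_bytes = 32 * gb  # assumption: all RAM usable by Redis

    # Sort record: 1 terabyte in 1 minute on 52 nodes with 16 disks each.
    sort_bytes = 10**12
    sort_seconds = 60
    nodes = 52
    disks_per_node = 16

    aggregate = sort_bytes / sort_seconds  # bytes per second, whole cluster
    per_node = aggregate / nodes

    return {
        "max_bytes_per_pair": redis_ram_bytes / redis_pairs,
        "sort_aggregate_mb_s": aggregate / 1e6,
        "sort_per_node_mb_s": per_node / 1e6,
        "sort_per_disk_mb_s": per_node / disks_per_node / 1e6,
        "total_disks": nodes * disks_per_node,
    }


if __name__ == "__main__":
    for name, value in back_of_envelope().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions each key/value pair has a budget of at most about 25.6 bytes including all overhead, and each node has to move roughly 320 MB of sort data per second. A real sort reads and writes the data at least once each, so actual disk traffic would be higher than the single-pass figure.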
sentIndex sentText sentNum sentScore
1 Jeremy Zawodny, while performing data alchemy in the dungeons of Craigslist, stored 1,250,000,000 Key/Value Pairs in Redis on a 32GB Machine. [sent-1, score-0.121]
2 Data sorting world record: 1 terabyte, 1 minute. [sent-2, score-0.083]
3 The system has 52 computer nodes; each node is a commodity server with two quad-core processors, 24 gigabytes (GB) of memory and sixteen 500 GB disks. [sent-3, score-0.113]
4 It's not just hardware, though: they also built software that used all of their CPU and RAM. [sent-4, score-0.083]
5 Tweets of Gold: wm : I am really getting the sense that none of you yokels waxing profound about scalability actually has anything factual to say joestump : I think you can do things to *mitigate* pain points up front. [sent-5, score-0.332]
6 danielcrenna : I love it when I check in debug code accidentally and it turns into a three day hunt for a major scalability problem joestump : Your post also makes me think of another phrase I say often: Scaling == Specialization. [sent-7, score-0.289]
7 Quora: What are the scaling issues to keep in mind while developing a social network feed? [sent-9, score-0.133]
8 Very good discussion of scaling advice: denormalize; cache; SSD; optimize writes; avoid stupid things. [sent-10, score-0.212]
9 Each and every problem has an appropriate set of applicable technologies, and it’s up to the engineer to justify their use. [sent-12, score-0.161]
10 In a system of no significant scale, basically anything works. [sent-15, score-0.101]
11 In a system of significant scale, there is no magic bullet. [sent-18, score-0.101]
12 I think with great tools like memcached it is easy to get carried away and use it as the mallet for every performance problem, but in many cases it should not be your first choice. [sent-20, score-0.079]
13 Caching should be seen more as a burden that many applications just can’t live without. [sent-21, score-0.149]
14 You don’t want that burden until you have exhausted all other easily reachable optimizations. [sent-22, score-0.346]
15 NVIDIA announced that it had partnered with PEER 1 to provide the industry’s first large-scale hosted GPU cloud. [sent-24, score-0.094]
16 The basic insight behind Levenshtein automata is that it's possible to construct a finite state automaton that recognizes exactly the set of strings within a given Levenshtein distance of a target word. [sent-27, score-0.5]
17 A sweet 127-slide whirlwind tour of the theory and application of graphs. [sent-29, score-0.108]
18 On an AWS Cluster Compute Instance I was able to insert a million small documents in about 3 minutes. [sent-31, score-0.113]
19 A smart grid network will be highly dependent on “people’s willingness to connect in this way,” and “this is not going to be something that can be forced on anyone no matter how hard we try. [sent-37, score-0.275]
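The Levenshtein automaton idea in sentence 16 can be illustrated with a small sketch. This is not the NFA-to-DFA construction a full implementation might use. Instead it simulates the automaton lazily, using one row of the edit-distance table as the state and capping every value at `max_edits + 1` so the number of distinct states stays finite. The class and function names are hypothetical.

```python
class LevenshteinAutomaton:
    """Lazily simulated automaton accepting strings within max_edits of target."""

    def __init__(self, target, max_edits):
        self.target = target
        self.max_edits = max_edits

    def start(self):
        # state[i] = edit distance between the input consumed so far
        # (nothing yet) and target[:i], capped at max_edits + 1.
        cap = self.max_edits + 1
        return tuple(min(i, cap) for i in range(len(self.target) + 1))

    def step(self, state, char):
        # Consume one input character and compute the next table row.
        cap = self.max_edits + 1
        new_state = [min(state[0] + 1, cap)]
        for i, target_char in enumerate(self.target):
            cost = 0 if target_char == char else 1
            new_state.append(min(
                new_state[i] + 1,   # target character missing from the input
                state[i] + cost,    # match or substitution
                state[i + 1] + 1,   # extra character in the input
                cap,                # cap keeps the state space finite
            ))
        return tuple(new_state)

    def is_match(self, state):
        # Accepting state: whole target matched within the edit budget.
        return state[-1] <= self.max_edits

    def can_match(self, state):
        # False once no continuation of the input can ever be accepted.
        return min(state) <= self.max_edits


def accepts(automaton, word):
    """Run word through the automaton, stopping early on a dead state."""
    state = automaton.start()
    for char in word:
        state = automaton.step(state, char)
        if not automaton.can_match(state):
            return False
    return automaton.is_match(state)
```

For example, `accepts(LevenshteinAutomaton("food", 1), "fod")` is true, while `"fxxd"` is rejected after three characters because every entry in the state has reached the cap. That early rejection is what makes the automaton useful for pruning a walk over a sorted dictionary or trie.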
wordName wordTfidf (topN-words)
[('levenshtein', 0.4), ('automata', 0.227), ('burden', 0.149), ('scaling', 0.133), ('traversals', 0.121), ('alchemy', 0.121), ('largeby', 0.121), ('waxing', 0.121), ('sixteen', 0.113), ('voluntary', 0.113), ('factual', 0.113), ('documents', 0.113), ('reachable', 0.108), ('recognizes', 0.108), ('whirlwind', 0.108), ('cerf', 0.108), ('gb', 0.105), ('scoring', 0.104), ('william', 0.101), ('accidentally', 0.101), ('jvms', 0.101), ('willingness', 0.101), ('significant', 0.101), ('craigslist', 0.098), ('profound', 0.098), ('hunt', 0.098), ('morgan', 0.096), ('marko', 0.096), ('partnered', 0.094), ('finite', 0.094), ('denormalize', 0.092), ('cliff', 0.092), ('phrase', 0.09), ('smart', 0.09), ('zawodny', 0.089), ('exhausted', 0.089), ('strings', 0.085), ('grid', 0.084), ('sorting', 0.083), ('utilized', 0.083), ('recommendation', 0.082), ('justify', 0.082), ('terabyte', 0.081), ('mitigate', 0.081), ('distance', 0.08), ('ranking', 0.08), ('applicable', 0.079), ('carried', 0.079), ('stupid', 0.079), ('gold', 0.078)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 869 high scalability-2010-07-30-Hot Scalability Links for July 30, 2010
2 0.10678279 195 high scalability-2007-12-28-Amazon's EC2: Pay as You Grow Could Cut Your Costs in Half
Introduction: Update 2: Summize Computes Computing Resources for a Startup. Lots of nice graphs showing Amazon is hard to beat for small machines and becomes less cost efficient for well used larger machines. Long term storage costs may eat your savings away. And out-of-cloud bandwidth costs are high. Update: via ProductionScale, a nice Digital Web article on how to set up S3 to store media files and how Blue Origin was able to handle 3.5 million requests and 758 GBs in bandwidth in a single day for very little $$$. Also a Right Scale article on Network performance within Amazon EC2 and to Amazon S3. 75MB/s between EC2 instances, 10.2MB/s between EC2 and S3 for download, 6.9MB/s upload. Now that Amazon's S3 (storage service) is out of beta and EC2 (elastic compute cloud) has added new instance types (the class of machine you can rent) with more CPU and more RAM, I thought it would be interesting to take a look at how their pricing stacks up. The quick conclusion: the m
Introduction: All in all this is still my favorite post and I still think it's an accurate vision of a future. Not everyone agrees, but I guess we'll see... "But it is not complicated. [There's] just a lot of it." -- Richard Feynman on how the immense variety of the world arises from simple rules. Contents: Have We Reached the End of Scaling?; Applications Become Black Boxes Using Markets to Scale and Control Costs; Let's Welcome our Neo-Feudal Overlords; The Economic Argument for the Ambient Cloud; What Will Kill the Cloud?; The Amazing Collective Compute Power of the Ambient Cloud; Using the Ambient Cloud as an Application Runtime; Applications as Virtual States; Conclusion. We have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications. Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but c
Introduction: "But it is not complicated. [There's] just a lot of it." -- Richard Feynman on how the immense variety of the world arises from simple rules. Contents: Have We Reached the End of Scaling?; Applications Become Black Boxes Using Markets to Scale and Control Costs; Let's Welcome our Neo-Feudal Overlords; The Economic Argument for the Ambient Cloud; What Will Kill the Cloud?; The Amazing Collective Compute Power of the Ambient Cloud; Using the Ambient Cloud as an Application Runtime; Applications as Virtual States; Conclusion. We have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications. Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but consider we have no planet wide applications yet. None. Tomorrow the numbers foreshadow a new Cambrian explosion of connectivity that will look as
5 0.099787228 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App
Introduction: Update: Jake in Does Django really scale better than Rails? thinks apps like FFS shouldn't need so much hardware to scale. In a short three months Friends for Sale (think Hot-or-Not with a market economy) grew to become a top 10 Facebook application handling 200 gorgeous requests per second and a stunning 300 million page views a month. They did all this using Ruby on Rails, two part time developers, a cluster of a dozen machines, and a fairly standard architecture. How did Friends for Sale scale to sell all those beautiful people? And how much do you think your friends are worth on the open market? Site: http://www.facebook.com/apps/application.php?id=7019261521 Information Sources Siqi Chen and Alexander Le, co-creators of Friends for Sale, answering my standard questionnaire. Virality on Facebook The Platform Ruby on Rails CentOS 5 (64 bit) Capistrano - update and restart application servers. Memcached MySQL Nginx Starling - distrib
6 0.097075552 538 high scalability-2009-03-16-Are Cloud Based Memory Architectures the Next Big Thing?
7 0.096334353 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
8 0.094497338 628 high scalability-2009-06-13-Neo4j - a Graph Database that Kicks Buttox
9 0.087177701 1530 high scalability-2013-10-11-Stuff The Internet Says On Scalability For October 11th, 2013
10 0.087125748 691 high scalability-2009-08-31-Squarespace Architecture - A Grid Handles Hundreds of Millions of Requests a Month
11 0.085004516 1538 high scalability-2013-10-28-Design Decisions for Scaling Your High Traffic Feeds
14 0.080438398 1036 high scalability-2011-05-06-Stuff The Internet Says On Scalability For May 6th, 2011
15 0.080435894 778 high scalability-2010-02-15-The Amazing Collective Compute Power of the Ambient Cloud
16 0.080276519 1559 high scalability-2013-12-06-Stuff The Internet Says On Scalability For December 6th, 2013
topicId topicWeight
[(0, 0.17), (1, 0.062), (2, 0.007), (3, 0.02), (4, -0.01), (5, 0.03), (6, -0.039), (7, 0.017), (8, 0.004), (9, 0.008), (10, 0.009), (11, -0.017), (12, 0.028), (13, 0.033), (14, -0.016), (15, -0.018), (16, 0.016), (17, 0.014), (18, -0.006), (19, 0.051), (20, -0.041), (21, -0.011), (22, -0.028), (23, -0.032), (24, -0.006), (25, -0.039), (26, 0.009), (27, -0.007), (28, -0.001), (29, 0.014), (30, 0.022), (31, -0.036), (32, -0.029), (33, 0.009), (34, 0.001), (35, -0.021), (36, -0.012), (37, -0.017), (38, 0.035), (39, 0.002), (40, 0.01), (41, -0.012), (42, 0.036), (43, -0.027), (44, 0.005), (45, 0.022), (46, 0.007), (47, 0.02), (48, 0.011), (49, -0.036)]
simIndex simValue blogId blogTitle
same-blog 1 0.96658522 869 high scalability-2010-07-30-Hot Scalability Links for July 30, 2010
2 0.81180859 1147 high scalability-2011-11-25-Stuff The Internet Says On Scalability For November 25, 2011
Introduction: A HighScalability a day keeps the fail whale away.: 46 million turkeys eaten at Thanksgiving ; Pinterest 421 million pageviews Quotable quotes: @al3xandr3 : scalability demands decoupling @startupandrew : "Yesterday I did some digestive system scalability testing" -- in reference to eating 140 chicken nuggets Peter Wayner : While the crazy dreamers can continue to craft NoSQL data stores, serious people will want to take a look at Oracle's version. @weblearning : Unitil we get scalability - or that wonderful word massification - in education, we will not get value for money Embrace and extend: @conor_omahony : IBM is Baking NoSQL Capabilities into DB2 and Informix. @GGopman : I look for scalability. That's what turns me on Supercomputing: An Industry in Need of a Revolution . Bartosz Milewski wants decently paying, interesting and meaningful jobs for all (supercomputer programmers). Get on that Santa. Caching a
3 0.80268872 1019 high scalability-2011-04-08-Stuff The Internet Says On Scalability For April 8, 2011
Introduction: Submitted for your reading pleasure on this tomato killing frosty morn... Now we really know why vampires feed on blood...they are elastically acquiring more compute power. Your Next Computer May Be Made of...Blood! It's those memristors again. Ancient vamps are really just giant super computers. Twitter now at 155 million tweets a day, up from 55 million a year ago. 10,000-core Linux supercomputer built in Amazon cloud By Jon Brodkin. The 10,000 cores were composed of 1,250 instances with eight cores each, as well as 8.75TB of RAM and 2PB disk space. The cluster ran for eight hours at a cost of $8,500. Quotable Quotes for $273 Alex: @davidklemke : Holy balls Windows Azure Tables is awesome. Man am I regretting not getting into this cloud stuff sooner, it's scalability heaven. @nik : The volume of tweets we are flowing into HBase is truly staggering #bigdata #datasift @wattersjames : One of the key points I mentioned before: Scalability is being abl
4 0.79417187 1439 high scalability-2013-04-12-Stuff The Internet Says On Scalability For April 12, 2013
Introduction: Hey, it's HighScalability time: ( Ukrainian daredevil scaling buildings) 877,000 TPS : Erlang and VoltDB. Quotable Quotes: Hendrik Volkmer : Complexity + Scale => Reduced Reliability + Increased Chance of catastrophic failures @TheRealHirsty : This coffee could use some "scalability" @billcurtis_ : Angular.js with Magento + S3 json file caching = wicked scalability Dan Milstein : Screw you Joel Spolsky, We're Rewriting It From Scratch! Anil Dash : Terms of Service and IP trump the Constitution Jeremy Zawodny : Yeah, seek time matters. A lot. @joeweinman : @adrianco proves why auto scaling is better than curated capacity management. < 50% + Cost Saving @ascendantlogic : Any "framework" naturally follows this progression. Something is complex so someone does something to make it easier. Everyone rushes to it but needs one or two things from the technologies they left behind so they introduce that into the "new"
5 0.78710055 1024 high scalability-2011-04-15-Stuff The Internet Says On Scalability For April 15, 2011
Introduction: Submitted for your reading pleasure... Luxury is an ancient notion. There was once a Chinese mandarin who had himself wakened three times every morning simply for the pleasure of being told it was not yet time to get up. ~Argosy We have a Quotable Quote machine for you today: @kevinweil : Twitter monthly signups have increased more than 50% since December, and we're now doing well over 150 million Tweets per day. @ChrisShain : Prediction: Black art of query optimization will become black art of #nosql data modeling, for same reasons. Minimize IOs, query time. @ui_matters : Infrastructure as a Service = no hardware headaches. Platform as a Svc = no scalability headaches. SaaS = common dev platform #amchamtech @plcstpierre : Thinking about high scalability stuff... I never thought database stuff can be interesting... @webdz9r : mass scalability for dynamic web content. What took us 8 machines, now take us 1 web and 1 app. @joelvarty : CDN is always an aft
6 0.78106689 1411 high scalability-2013-02-22-Stuff The Internet Says On Scalability For February 22, 2013
7 0.77754933 1645 high scalability-2014-05-09-Stuff The Internet Says On Scalability For May 9th, 2014
8 0.77668905 1455 high scalability-2013-05-10-Stuff The Internet Says On Scalability For May 10, 2013
9 0.77590537 1170 high scalability-2012-01-06-Stuff The Internet Says On Scalability For January 6, 2012
10 0.77533817 1451 high scalability-2013-05-03-Stuff The Internet Says On Scalability For May 3, 2013
11 0.77196813 142 high scalability-2007-11-05-Strategy: Diagonal Scaling - Don't Forget to Scale Out AND Up
12 0.76808804 1460 high scalability-2013-05-17-Stuff The Internet Says On Scalability For May 17, 2013
13 0.76756287 980 high scalability-2011-01-28-Stuff The Internet Says On Scalability For January 28, 2011
14 0.76545173 1203 high scalability-2012-03-02-Stuff The Internet Says On Scalability For March 2, 2012
15 0.76395452 1195 high scalability-2012-02-17-Stuff The Internet Says On Scalability For February 17, 2012
16 0.76365352 1397 high scalability-2013-02-01-Stuff The Internet Says On Scalability For February 1, 2013
17 0.76321965 1487 high scalability-2013-07-05-Stuff The Internet Says On Scalability For July 5, 2013
18 0.76230896 1516 high scalability-2013-09-13-Stuff The Internet Says On Scalability For September 13, 2013
19 0.75856113 1637 high scalability-2014-04-25-Stuff The Internet Says On Scalability For April 25th, 2014
20 0.75824255 185 high scalability-2007-12-13-Is premature scalation a real disease?
topicId topicWeight
[(1, 0.121), (2, 0.157), (10, 0.028), (17, 0.227), (30, 0.044), (40, 0.02), (47, 0.031), (56, 0.02), (61, 0.13), (77, 0.013), (79, 0.093), (85, 0.04)]
simIndex simValue blogId blogTitle
1 0.94994855 506 high scalability-2009-02-03-10 More Rules for Even Faster Websites
Introduction: Update: How-To Minimize Load Time for Fast User Experiences . Shows how to analyze the bottlenecks preventing websites and blogs from loading quickly and how to resolve them. 80-90% of the end-user response time is spent on the frontend, so it makes sense to concentrate efforts there before heroically rewriting the backend. Take a shower before buying a Porsche, if you know what I mean. Steve Souders, author of High Performance Websites and Yslow , has ten more best practices to speed up your website : Split the initial payload Load scripts without blocking Don’t scatter scripts Split dominant content domains Make static content cookie-free Reduce cookie weight Minify CSS Optimize images Use iframes sparingly To www or not to www Sadly, according to String Theory, there are only 26.7 rules left, so get them while they're still in our dimension. Here are slides on the first few rules. Love the speeding dog slide. That's exactly what my dog looks like trav
2 0.93284518 631 high scalability-2009-06-15-Large-scale Graph Computing at Google
Introduction: To continue the graph theme Google has got into the act and released information on Pregel . Pregel does not appear to be a new type of potato chip. Pregel is instead a scalable infrastructure... ...to mine a wide range of graphs. In Pregel, programs are expressed as a sequence of iterations. In each iteration, a vertex can, independently of other vertices, receive messages sent to it in the previous iteration, send messages to other vertices, modify its own and its outgoing edges' states, and mutate the graph's topology. Currently, Pregel scales to billions of vertices and edges, but this limit will keep expanding. Pregel's applicability is harder to quantify, but so far we haven't come across a type of graph or a practical graph computing problem which is not solvable with Pregel. It computes over large graphs much faster than alternatives, and the application programming interface is easy to use. Implementing PageRank, for example, takes only about 15 lines of code. Developers
3 0.9205609 1225 high scalability-2012-04-09-Why My Slime Mold is Better than Your Hadoop Cluster
Introduction: Update: Organism without a brain creates external memories for navigation shows slime mold is even cooler than originally thought, storing a record of where it's been using slime: The authors conclude, the slime isn't just the mold's calling card. Instead, it's a way of marking the environment so that the organism can sense where it's been, and not expend effort on searches that won't pay off. Although the situation isn't an exact parallel, the authors make a comparison to the pheromone trails used by ants. In After Life: The Strange Science Of Decay there’s a truly incredible sequence of gorgeously shot video showing how creeping slime mold solves mazes and performs other amazing feats of computation. Take a look at what simple one celled organisms can do: The whole video is really well done and shockingly revelatory. It’s the story of decay, how atoms created during the Big Bang and through countless supernova explosions are continually rearranged an
4 0.89841354 1467 high scalability-2013-05-30-Google Finds NUMA Up to 20% Slower for Gmail and Websearch
Introduction: When you have a large population of servers you have both the opportunity and the incentive to perform interesting studies. Authors from Google and the University of California in Optimizing Google’s Warehouse Scale Computers: The NUMA Experience conducted such a study, taking a look at how jobs run on clusters of machines using a NUMA architecture. Since NUMA is common on server class machines it's a topic of general interest for those looking to maximize machine utilization across clusters. Some of the results are surprising: The methodology of how to attribute such fine performance variations to NUMA effects within such a complex system is perhaps more interesting than the results themselves. Well worth reading just for that story. The performance swing due to NUMA is up to 15% on AMD Barcelona for Gmail backend and 20% on Intel Westmere for Web-search frontend. Memory locality is not always King. Because of the interaction between NUMA and cache sharing/contention it
same-blog 5 0.88658816 869 high scalability-2010-07-30-Hot Scalability Links for July 30, 2010
6 0.86873108 1393 high scalability-2013-01-24-NoSQL Parody: say No! No! and No!
7 0.8671242 956 high scalability-2010-12-08-How To Get Experience Working With Large Datasets
8 0.86690933 543 high scalability-2009-03-17-Sun to Announce Open Cloud APIs at CommunityOne
10 0.83437341 199 high scalability-2008-01-01-S3 for image storing
11 0.80736667 765 high scalability-2010-01-25-Let's Welcome our Neo-Feudal Overlords
12 0.77392709 427 high scalability-2008-10-22-Server load balancing architectures, Part 2: Application-level load balancing
13 0.7736178 507 high scalability-2009-02-03-Paper: Optimistic Replication
14 0.76845098 1333 high scalability-2012-10-04-LinkedIn Moved from Rails to Node: 27 Servers Cut and Up to 20x Faster
15 0.76234055 1392 high scalability-2013-01-23-Building Redundant Datacenter Networks is Not For Sissies - Use an Outside WAN Backbone
16 0.75622511 467 high scalability-2008-12-16-[ANN] New Open Source Cache System
17 0.74956858 1189 high scalability-2012-02-07-Hypertable Routs HBase in Performance Test -- HBase Overwhelmed by Garbage Collection
18 0.74909019 465 high scalability-2008-12-14-Scaling MySQL on a 256-way T5440 server using Solaris ZFS and Java 1.7
19 0.74790859 1153 high scalability-2011-12-08-Update on Scalable Causal Consistency For Wide-Area Storage With COPS
20 0.74480939 790 high scalability-2010-03-09-Applications as Virtual States