high_scalability high_scalability-2012 high_scalability-2012-1169 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: In the "should I or shouldn't I" debate around deploying SSDs, it always helps to have real-world data. Fiesta! with a live-blog summary of a presentation by Kenny Gorman on Shutterfly on MongoDB Performance Tuning. What if you still need more performance after doing all of this tuning? One option is to use SSDs. Shutterfly uses Facebook’s flashcache: a kernel module that caches data on SSD. It was designed for MySQL/InnoDB. The SSD sits in front of a disk, but the pair is exposed as a single mount point. This only makes sense when you have lots of physical I/O. Shutterfly saw a speedup of 500% with flashcache. A benefit is that you can delay sharding: less complexity. The whole series of posts has a lot of great information and is worth a longer look, especially if you are considering using MongoDB. Related Articles: Slides for MongoSF 2011: MongoDB Performance Tuning; SSD+HDD sharding setup for large and permanently growing collections; Implementing MongoDB at Shutterfly by Kenny
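The remark that an SSD cache "only makes sense when you have lots of physical I/O" can be illustrated with a simple weighted-average latency model. This is a minimal sketch and not a description of how flashcache works internally: the function names and the latency figures (0.1 ms for SSD, 10 ms for disk) are illustrative assumptions, not measurements from Shutterfly or from the talk.

```python
def effective_latency_ms(hit_ratio, ssd_ms, hdd_ms):
    """Average latency of one physical read when a fraction `hit_ratio`
    of reads is served from the SSD cache and the rest goes to disk."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * ssd_ms + (1.0 - hit_ratio) * hdd_ms


def speedup(hit_ratio, ssd_ms, hdd_ms):
    """Ratio of disk-only read latency to cached read latency."""
    return hdd_ms / effective_latency_ms(hit_ratio, ssd_ms, hdd_ms)


if __name__ == "__main__":
    # Hypothetical latencies chosen only for illustration.
    SSD_MS, HDD_MS = 0.1, 10.0
    for h in (0.0, 0.5, 0.9, 0.99):
        print(f"hit ratio {h:.2f}: speedup {speedup(h, SSD_MS, HDD_MS):.1f}x")
```

In this model the gain applies only to reads that actually reach the storage device. A workload already served from RAM sees little benefit, which is consistent with the caveat about physical I/O.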
sentIndex sentText sentNum sentScore
1 In the "should I or shouldn't I" debate around deploying SSD, it always helps to have real-world data. [sent-1, score-0.229]
2 with a live-blog summary of a presentation by Kenny Gorman on Shutterfly on MongoDB Performance Tuning. [sent-3, score-0.134]
3 What if you still need more performance after doing all of this tuning? [sent-4, score-0.057]
4 Shutterfly uses Facebook’s flashcache: kernel module to cache data on SSD. [sent-6, score-0.363]
5 SSD in front of a disk, but exposed as a single mount point. [sent-8, score-0.276]
6 This only makes sense when you have lots of physical I/O. [sent-9, score-0.152]
7 A benefit is that you can delay sharding: less complexity. [sent-11, score-0.207]
8 The whole series of posts has a lot of great information and is worth a longer look, especially if you are considering using MongoDB. [sent-12, score-0.31]
wordName wordTfidf (topN-words)
[('shutterfly', 0.66), ('kenny', 0.33), ('mongodb', 0.31), ('gorman', 0.165), ('articlesslides', 0.155), ('flashcache', 0.148), ('ssd', 0.139), ('permanently', 0.138), ('joy', 0.134), ('sharding', 0.128), ('mount', 0.121), ('hdd', 0.117), ('speedup', 0.108), ('debate', 0.106), ('exposed', 0.099), ('module', 0.095), ('delay', 0.094), ('slides', 0.091), ('kernel', 0.082), ('saw', 0.081), ('considering', 0.078), ('tuning', 0.076), ('deploying', 0.073), ('benefit', 0.072), ('posts', 0.072), ('presentation', 0.068), ('option', 0.066), ('summary', 0.066), ('follow', 0.065), ('setup', 0.063), ('conference', 0.061), ('vs', 0.061), ('life', 0.058), ('performance', 0.057), ('front', 0.056), ('worth', 0.055), ('sense', 0.054), ('longer', 0.054), ('growing', 0.052), ('especially', 0.051), ('physical', 0.051), ('helps', 0.05), ('lots', 0.047), ('related', 0.046), ('nosql', 0.045), ('designed', 0.044), ('disk', 0.042), ('case', 0.041), ('less', 0.041), ('cache', 0.038)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 1169 high scalability-2012-01-05-Shutterfly Saw a Speedup of 500% With Flashcache
Introduction: In the "should I or shouldn't I" debate around deploying SSDs, it always helps to have real-world data. Fiesta! with a live-blog summary of a presentation by Kenny Gorman on Shutterfly on MongoDB Performance Tuning. What if you still need more performance after doing all of this tuning? One option is to use SSDs. Shutterfly uses Facebook’s flashcache: a kernel module that caches data on SSD. It was designed for MySQL/InnoDB. The SSD sits in front of a disk, but the pair is exposed as a single mount point. This only makes sense when you have lots of physical I/O. Shutterfly saw a speedup of 500% with flashcache. A benefit is that you can delay sharding: less complexity. The whole series of posts has a lot of great information and is worth a longer look, especially if you are considering using MongoDB. Related Articles: Slides for MongoSF 2011: MongoDB Performance Tuning; SSD+HDD sharding setup for large and permanently growing collections; Implementing MongoDB at Shutterfly by Kenny
2 0.23470849 1166 high scalability-2011-12-30-Stuff The Internet Says On Scalability For December 30, 2011
Introduction: Pork. The Other HighScalability: PlentyOfFish: 6 Billion Page Views; World: info doubling every 2 years; 2015: 7,910 exabytes of global digital data; Khan Academy: 4 million uniques; G+: 62 million users; Zynga: leased 9 megawatts of capacity; Heroku: billions of page views a month. Quotable quotes: Udi Dahan: Scalability is not boolean. John Boyd: Look at the mission, not the technology. And if you do look at the mission, don't look at the most fashionable mission of the day. @BigDataClouds: I think the fear of change is the biggest challenge that companies are facing. @cjzero: If you were wondering, the #Mythbusters scalability test of a Newton's Cradle using wrecking balls? Busted. @Xorlev: Scalability is really really hard. That's why it's fun. It pushes the limits of engineering talent. 100 Best Cloud & Data Stats of 2011 by Zenoss. Lots of fun facts about how mind bendingly huge the world of information is exponentially be
3 0.18848279 1606 high scalability-2014-03-05-10 Things You Should Know About Running MongoDB at Scale
Introduction: Guest post by Asya Kamsky , Principal Solutions Architect at MongoDB. This post outlines ten things you need to know for operating MongoDB at scale based on my experience working with MongoDB customers and open source users: MongoDB requires DevOps, too. MongoDB is a database. Like any other data store, it requires capacity planning, tuning, monitoring, and maintenance. Just because it's easy to install and get started and it fits the developer paradigm more naturally than a relational database, don't assume that MongoDB doesn't need proper care and feeding. And just because it performs super-fast on a small sample dataset in development doesn't mean you can get away without having a good schema and indexing strategy, as well as the right hardware resources in production! But if you prepare well and understand the best practices, operating large MongoDB clusters can be boring instead of nerve-wracking. Successful MongoDB users monitor everything and prepare for growth.
4 0.10689625 990 high scalability-2011-02-15-Wordnik - 10 million API Requests a Day on MongoDB and Scala
Introduction: Wordnik is an online dictionary and language resource that has both a website and an API component. Their goal is to show you as much information as possible, as fast as we can find it, for every word in English, and to give you a place where you can make your own opinions about words known. As cool as that is, what is really cool is the information they share in their blog about their experiences building a web service. They've written an excellent series of articles and presentations you may find useful: What has technology done for words lately? Eventual consistency. Using an eventually consistent model they can do work in parallel and we count as many words as possible when we can, and add them all up when there’s a lag. The count’s always in the ballpark, and we never have to stop. Document-oriented storage. Dictionary entries are more naturally modeled as hierarchical documents and using that model has made it quicker to find data and is
5 0.10062171 1114 high scalability-2011-09-13-Must see: 5 Steps to Scaling MongoDB (Or Any DB) in 8 Minutes
Introduction: Jared Rosoff concisely, effectively, entertainingly, and convincingly gives an 8 minute MongoDB tutorial on scaling MongoDB at Scale Out Camp . The ideas aren't just limited to MongoDB, they work for most any database: Optimize your queries; Know your working set size; Tune your file system; Choose the right disks; Shard. Here's an explanation of all 5 strategies: Optimize your queries . Computer science works. Complexity analysis works. A btree search is faster than a table scan. So analyze your queries. Use explain to see what your query is doing. If it is saying it's using a cursor then it's doing a table scan. That's slow. Look at the number of documents it looks at to satisfy a query. Look at how long it takes. Fix: add indexes. It doesn't matter if you are running on 1 or 100 servers. Know your working set size . Sticking memcache in front of your database is silly. You have lots of RAM, use it. Embed your cache in the database, which is how MongoDB works. Working set
6 0.096010409 1089 high scalability-2011-07-29-Stuff The Internet Says On Scalability For July 29, 2011
7 0.09579657 257 high scalability-2008-02-22-Kevin's Great Adventures in SSDland
8 0.093378395 1386 high scalability-2013-01-14-MongoDB and GridFS for Inter and Intra Datacenter Data Replication
9 0.088257946 1054 high scalability-2011-06-06-NoSQL Pain? Learn How to Read-write Scale Without a Complete Re-write
10 0.081275292 1101 high scalability-2011-08-19-Stuff The Internet Says On Scalability For August 19, 2011
12 0.080077268 1291 high scalability-2012-07-25-Vertical Scaling Ascendant - How are SSDs Changing Architectures?
13 0.076843947 1322 high scalability-2012-09-14-Stuff The Internet Says On Scalability For September 14, 2012
14 0.07669282 670 high scalability-2009-08-05-Anti-RDBMS: A list of distributed key-value stores
15 0.076669976 1609 high scalability-2014-03-11-Building a Social Music Service Using AWS, Scala, Akka, Play, MongoDB, and Elasticsearch
16 0.075328678 920 high scalability-2010-10-15-Troubles with Sharding - What can we learn from the Foursquare Incident?
19 0.070153356 684 high scalability-2009-08-18-Real World Web: Performance & Scalability
20 0.069750465 1093 high scalability-2011-08-05-Stuff The Internet Says On Scalability For August 5, 2011
topicId topicWeight
[(0, 0.081), (1, 0.026), (2, -0.029), (3, 0.025), (4, 0.036), (5, 0.058), (6, -0.004), (7, 0.007), (8, 0.052), (9, -0.01), (10, -0.026), (11, -0.043), (12, -0.003), (13, 0.042), (14, -0.054), (15, 0.002), (16, 0.02), (17, 0.0), (18, -0.048), (19, -0.008), (20, -0.016), (21, -0.004), (22, -0.023), (23, -0.005), (24, 0.029), (25, 0.003), (26, 0.015), (27, 0.016), (28, -0.089), (29, 0.018), (30, 0.031), (31, 0.005), (32, 0.056), (33, -0.015), (34, 0.066), (35, -0.007), (36, -0.011), (37, -0.037), (38, -0.015), (39, -0.029), (40, -0.018), (41, 0.012), (42, -0.003), (43, 0.038), (44, -0.035), (45, 0.008), (46, -0.013), (47, -0.05), (48, -0.008), (49, 0.021)]
simIndex simValue blogId blogTitle
same-blog 1 0.9600839 1169 high scalability-2012-01-05-Shutterfly Saw a Speedup of 500% With Flashcache
Introduction: In the "should I or shouldn't I" debate around deploying SSDs, it always helps to have real-world data. Fiesta! with a live-blog summary of a presentation by Kenny Gorman on Shutterfly on MongoDB Performance Tuning. What if you still need more performance after doing all of this tuning? One option is to use SSDs. Shutterfly uses Facebook’s flashcache: a kernel module that caches data on SSD. It was designed for MySQL/InnoDB. The SSD sits in front of a disk, but the pair is exposed as a single mount point. This only makes sense when you have lots of physical I/O. Shutterfly saw a speedup of 500% with flashcache. A benefit is that you can delay sharding: less complexity. The whole series of posts has a lot of great information and is worth a longer look, especially if you are considering using MongoDB. Related Articles: Slides for MongoSF 2011: MongoDB Performance Tuning; SSD+HDD sharding setup for large and permanently growing collections; Implementing MongoDB at Shutterfly by Kenny
2 0.66441256 1114 high scalability-2011-09-13-Must see: 5 Steps to Scaling MongoDB (Or Any DB) in 8 Minutes
Introduction: Jared Rosoff concisely, effectively, entertainingly, and convincingly gives an 8 minute MongoDB tutorial on scaling MongoDB at Scale Out Camp . The ideas aren't just limited to MongoDB, they work for most any database: Optimize your queries; Know your working set size; Tune your file system; Choose the right disks; Shard. Here's an explanation of all 5 strategies: Optimize your queries . Computer science works. Complexity analysis works. A btree search is faster than a table scan. So analyze your queries. Use explain to see what your query is doing. If it is saying it's using a cursor then it's doing a table scan. That's slow. Look at the number of documents it looks at to satisfy a query. Look at how long it takes. Fix: add indexes. It doesn't matter if you are running on 1 or 100 servers. Know your working set size . Sticking memcache in front of your database is silly. You have lots of RAM, use it. Embed your cache in the database, which is how MongoDB works. Working set
3 0.66168946 1606 high scalability-2014-03-05-10 Things You Should Know About Running MongoDB at Scale
Introduction: Guest post by Asya Kamsky , Principal Solutions Architect at MongoDB. This post outlines ten things you need to know for operating MongoDB at scale based on my experience working with MongoDB customers and open source users: MongoDB requires DevOps, too. MongoDB is a database. Like any other data store, it requires capacity planning, tuning, monitoring, and maintenance. Just because it's easy to install and get started and it fits the developer paradigm more naturally than a relational database, don't assume that MongoDB doesn't need proper care and feeding. And just because it performs super-fast on a small sample dataset in development doesn't mean you can get away without having a good schema and indexing strategy, as well as the right hardware resources in production! But if you prepare well and understand the best practices, operating large MongoDB clusters can be boring instead of nerve-wracking. Successful MongoDB users monitor everything and prepare for growth.
4 0.64729679 1054 high scalability-2011-06-06-NoSQL Pain? Learn How to Read-write Scale Without a Complete Re-write
Introduction: Lately I've been reading more cases where different people have started to realize the limitations of the NoSQL promise to database scalability. Note the references below: Why does Quora use MySQL as the data store instead of NoSQLs such as Cassandra, MongoDB, CouchDB etc? Why did Diaspora abandon MongoDB for MySQL? How scalable is CouchDB in practice, not just in theory? Take MongoDB for example. It's damn fast, but it doesn't really know how to save data reliably to disk. I've had it set up in a replica pair to mitigate that risk. Guess what - both servers in the pair failed and corrupted their data files at the same day. It appears that for many, the switch to NoSQL can be rather painful. IMO that doesn't necessarily mean that NoSQL is wrong in general, but it's a combination of 1) lack of maturity 2) not the right tool for the job. That brings the question of what's the alternative solution? In the following post I tried to summarize the lessons from
5 0.63131475 670 high scalability-2009-08-05-Anti-RDBMS: A list of distributed key-value stores
Introduction: Update 8: Introducing MongoDB by Eliot Horowitz. Update 7: The Future of Scalable Databases by Robin Mathew. Update 6: NoSQL: If Only it Was that Easy. BJ Clark lays down the law on which databases are scalable: Tokyo - NO, Redis - NO, Voldemort - YES, MongoDB - Not Yet, Cassandra - Probably, Amazon S3 - YES * 2, MySQL - NO. The real thing to point out is that if you are being held back from making something super awesome because you can’t choose a database, you are doing it wrong. Update 5: Exciting stuff happening in Japan at this Key-Value Storage meeting in Tokyo. Presentations on Groonga, Senna, Lux IO, Tokyo-Cabinet, Tx, repcached, Kai, Cagra, kumofs, ROMA, and Flare. Update 4: NoSQL and the Relational Model: don’t throw the baby out with the bathwater by Matthew Willson. So my key point is, this kind of modelling is WORTH DOING, regardless of which database tool you end up using for physical storage. Update 3: Choosing a non-relational database
6 0.62803471 940 high scalability-2010-11-12-Stuff the Internet Says on Scalability For November 12th, 2010
7 0.61123472 739 high scalability-2009-11-09-10 NoSQL Systems Reviewed
8 0.61082834 745 high scalability-2009-11-25-Brian Aker's Hilarious NoSQL Stand Up Routine
9 0.6012578 935 high scalability-2010-11-05-Hot Scalability Links For November 5th, 2010
10 0.59740853 1129 high scalability-2011-09-30-Stuff The Internet Says On Scalability For September 30, 2011
11 0.58792055 737 high scalability-2009-11-05-A Yes for a NoSQL Taxonomy
12 0.5801211 1089 high scalability-2011-07-29-Stuff The Internet Says On Scalability For July 29, 2011
13 0.57623327 1101 high scalability-2011-08-19-Stuff The Internet Says On Scalability For August 19, 2011
14 0.57439882 980 high scalability-2011-01-28-Stuff The Internet Says On Scalability For January 28, 2011
15 0.56804997 1067 high scalability-2011-06-24-Stuff The Internet Says On Scalability For June 24, 2011
16 0.56486684 875 high scalability-2010-08-09-NoSQL on the Microsoft Platform
17 0.55628723 1180 high scalability-2012-01-24-The State of NoSQL in 2012
18 0.55530435 990 high scalability-2011-02-15-Wordnik - 10 million API Requests a Day on MongoDB and Scala
19 0.54793417 1247 high scalability-2012-05-18-Stuff The Internet Says On Scalability For May 18, 2012
20 0.54769558 1262 high scalability-2012-06-11-Monday Fun: Seven Databases in Song
topicId topicWeight
[(1, 0.037), (2, 0.108), (61, 0.077), (79, 0.629)]
simIndex simValue blogId blogTitle
1 0.9971664 443 high scalability-2008-11-14-Paper: Pig Latin: A Not-So-Foreign Language for Data Processing
Introduction: Yahoo has developed a new language called Pig Latin that fits in a sweet spot between high-level declarative querying in the spirit of SQL and low-level, procedural programming à la map-reduce, and combines the best of both worlds. The accompanying system, Pig, is fully implemented, and compiles Pig Latin into physical plans that are executed over Hadoop, an open-source, map-reduce implementation. Pig has just graduated from the Apache Incubator and joined Hadoop as a subproject. The paper has a few examples of how engineers at Yahoo! are using Pig to dramatically reduce the time required for the development and execution of their data analysis tasks, compared to using Hadoop directly. References: Apache Pig Wiki
2 0.99177337 782 high scalability-2010-02-23-When to migrate your database?
Introduction: Why migrate your database? Efficiency and availability problems are harming your business – reports are out of date, your batch processing window is nearing its limits, outages (unplanned/planned) frequently halt work. Database consolidation – remove the costs that result from a heterogeneous database environment (DBAs time, database vendor pricing, database versions, hardware, OSs, patches, upgrades etc.). OK, so the driving forces for migration are clear, what now? Read more on BigDataMatters.com
3 0.98780191 401 high scalability-2008-10-04-Is MapReduce going mainstream?
Introduction: Compares MapReduce to other parallel processing approaches and suggests new paradigm for clouds and grids
4 0.98695707 107 high scalability-2007-10-02-Some Real Financial Numbers for Your Startup
Introduction: If you are a startup you may find useful Guy Kawasaki's post Financial Models for Underachievers: Two Years of the Real Numbers of a Startup. Part of any business plan are the projected guesstimates. They are guesstimates because everyone keeps these numbers hidden like a Swiss bank account. But not Redfin. They've bravely shared their initial cost projections, their actual numbers from real life, and the lessons they've learned from the discrepancy between the two... You can find their model estimates and actuals for Rent, Per Employee, Per Month (model: $250, actual: $336); Initial Per-Employee Equipment Cost; Monthly Benefits, Per-Employee; Annual Payroll Tax; Quarterly Bonus Payout, as a % of the Total Possible; Annual Payroll Increase for Existing Employees; All-Company Meeting Cost, Per-Meeting, Per-Employee; Annual Accounting Costs, and a few more. There is also a great lessons section: Focus on headcount; Plan slow, run fast; Run top-down sanity-checks; Fo
5 0.9830898 743 high scalability-2009-11-23-Big Data on Grids or on Clouds?
Introduction: Contributed by Wolfgang Gentzsch: Now that we have a new computing paradigm, Cloud Computing, how can Clouds help our data? Replace our internal data vaults as we hoped Grids would? Are Grids dead now that we have Clouds? Despite all the promising developments in the Grid and Cloud computing space, and the avalanche of publications and talks on this subject, many people still seem to be confused about internal data and compute resources, versus Grids versus Clouds, and they are hesitant to take the next step. I think there are a number of issues driving this uncertainty. read more at: BigDataMatters.com
6 0.9772976 1100 high scalability-2011-08-18-Paper: The Akamai Network - 61,000 servers, 1,000 networks, 70 countries
7 0.97687757 692 high scalability-2009-09-01-Cheap storage: how backblaze takes matters in hand
8 0.97687757 1119 high scalability-2011-09-20-HighScalability is old news. Step your scaling game way up... (NSFW cartoon)
9 0.97532547 8 high scalability-2007-07-12-Should I use LAMP or Windows?
same-blog 10 0.97181541 1169 high scalability-2012-01-05-Shutterfly Saw a Speedup of 500% With Flashcache
11 0.97075218 372 high scalability-2008-08-27-Updating distributed web applications
12 0.96028787 1277 high scalability-2012-07-05-10 Golden Principles For Building Successful Mobile-Web Applications
13 0.94409245 784 high scalability-2010-02-25-Paper: High Performance Scalable Data Stores
14 0.94309384 323 high scalability-2008-05-19-Twitter as a scalability case study
15 0.91191131 871 high scalability-2010-08-04-Dremel: Interactive Analysis of Web-Scale Datasets - Data as a Programming Paradigm
16 0.88271195 1403 high scalability-2013-02-08-Stuff The Internet Says On Scalability For February 8, 2013
17 0.87398362 680 high scalability-2009-08-13-Reconnoiter - Large-Scale Trending and Fault-Detection
18 0.87328923 1181 high scalability-2012-01-25-Google Goes MoreSQL with Tenzing - SQL Over MapReduce
19 0.86798006 75 high scalability-2007-08-28-Google Utilities : An online google guide,tools and Utilities.
20 0.86333716 1420 high scalability-2013-03-08-Stuff The Internet Says On Scalability For March 8, 2013