high_scalability high_scalability-2008 high_scalability-2008-216 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Update: Typical Programmer tackles the technical issues in Relational Database Experts Jump The MapReduce Shark. The culture clash is still what fascinates me. David DeWitt writes in the Database Column that MapReduce is a major step backwards: (1) a giant step backward in the programming paradigm for large-scale data intensive applications; (2) a sub-optimal implementation, in that it uses brute force instead of indexing; (3) not novel at all -- it represents a specific implementation of well known techniques developed nearly 25 years ago; (4) missing most of the features that are routinely included in current DBMS; (5) incompatible with all of the tools DBMS users have come to depend on. Listening to databasers and map reducers talk is like eavesdropping on your average family holiday mashup. Every holiday people who have virtually nothing in common are thrown together because they incidentally share a little DNA or are married to the shared DNA. In desperation everyone gravitates to som
sentIndex sentText sentNum sentScore
1 Update: Typical Programmer tackles the technical issues in Relational Database Experts Jump The MapReduce Shark. [sent-1, score-0.087]
2 Every holiday people who have virtually nothing in common are thrown together because they incidentally share a little DNA or are married to the shared DNA. [sent-4, score-0.39]
3 In desperation everyone gravitates to some shared enemy they can all confidently bash. [sent-5, score-0.333]
4 But after that moment is relieved and awkward silence once again looms, nothing is left but more drinking and tackling sensitive topics you just know will end badly. [sent-6, score-0.471]
5 Database folks love their schemas, relational purity and their Swiss Army knife indexes. [sent-7, score-0.333]
6 You soon learn that really map reduce is just another form of an index and indexes really can scale to any heights with just a little tweaking. [sent-8, score-0.487]
7 Map reducers love their pure functional models, their self-healing clustery filled ecosystems, and the sheer joy of the semi-organized chaos of letting 10,000 CPUs simultaneously bloom. [sent-9, score-0.633]
8 Yet, I too like my map reduce engine, distributed file system combo platter. [sent-13, score-0.492]
9 With map reduce I can implement any complex behavior over any data set. [sent-14, score-0.377]
10 You aren't limited to set logic, SQL types, and tweaked indexes. [sent-16, score-0.097]
11 Much like a staunchly conservative nail crunching father and his too soft pansy liberal son, these two camps will never understand each other. [sent-18, score-0.635]
12 Every sign of beauty in one person's eyes is just another confirmation to the other side of impending senility. [sent-19, score-0.31]
13 Just hug in a manly way and agree to meet again next year. [sent-21, score-0.106]
wordName wordTfidf (topN-words)
[('map', 0.265), ('reducers', 0.23), ('holiday', 0.19), ('awkward', 0.123), ('clash', 0.123), ('desperation', 0.123), ('eavesdropping', 0.123), ('fascinates', 0.123), ('looms', 0.123), ('purity', 0.123), ('silence', 0.123), ('applicationsa', 0.115), ('combo', 0.115), ('confidently', 0.115), ('impending', 0.115), ('lover', 0.115), ('married', 0.115), ('nail', 0.115), ('relieved', 0.115), ('reduce', 0.112), ('heights', 0.11), ('drinking', 0.11), ('confirmation', 0.11), ('father', 0.11), ('knife', 0.11), ('relish', 0.11), ('son', 0.11), ('brute', 0.106), ('conservative', 0.106), ('hug', 0.106), ('shear', 0.106), ('firmly', 0.102), ('camps', 0.102), ('ecosystems', 0.102), ('liberal', 0.102), ('swiss', 0.1), ('crunching', 0.1), ('joy', 0.1), ('routinely', 0.097), ('tweaked', 0.097), ('step', 0.097), ('backward', 0.095), ('enemy', 0.095), ('mapreduce', 0.091), ('tackles', 0.087), ('contributes', 0.086), ('eyes', 0.085), ('thrown', 0.085), ('backwards', 0.082), ('chaos', 0.082)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 216 high scalability-2008-01-17-Database People Hating on MapReduce
2 0.13656694 448 high scalability-2008-11-22-Google Architecture
Introduction: Update 2: Sorting 1 PB with MapReduce . PB is not peanut-butter-and-jelly misspelled. It's 1 petabyte or 1000 terabytes or 1,000,000 gigabytes. It took six hours and two minutes to sort 1PB (10 trillion 100-byte records) on 4,000 computers and the results were replicated thrice on 48,000 disks. Update: Greg Linden points to a new Google article MapReduce: simplified data processing on large clusters . Some interesting stats: 100k MapReduce jobs are executed each day; more than 20 petabytes of data are processed per day; more than 10k MapReduce programs have been implemented; machines are dual processor with gigabit ethernet and 4-8 GB of memory. Google is the King of scalability. Everyone knows Google for their large, sophisticated, and fast searching, but they don't just shine in search. Their platform approach to building scalable applications allows them to roll out internet scale applications at an alarmingly high competition crushing rate. Their goal is always to build
3 0.085131876 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
Introduction: It's a truism that we should choose the right tool for the job. Everyone says that. And who can disagree? The problem is this is not helpful advice without being able to answer more specific questions like: What jobs are the tools good at? Will they work on jobs like mine? Is it worth the risk to try something new when all my people know something else and we have a deadline to meet? How can I make all the tools work together? In the NoSQL space this kind of real-world data is still a bit vague. When asked, vendors tend to give very general answers like NoSQL is good for BigData or key-value access. What does that mean for the developer in the trenches faced with the task of solving a specific problem and there are a dozen confusing choices and no obvious winner? Not a lot. It's often hard to take that next step and imagine how their specific problems could be solved in a way that's worth taking the trouble and risk. Let's change that. What problems are you using NoSQL to sol
4 0.08181981 483 high scalability-2009-01-04-Paper: MapReduce: Simplified Data Processing on Large Clusters
Introduction: Update: MapReduce and PageRank Notes from Remzi Arpaci-Dusseau's Fall 2008 class . Collects interesting facts about MapReduce and PageRank. For example, the history of the solution to searching for the term "flu" is traced through multiple generations of technology. With Google entering the cloud space with Google AppEngine and a maturing Hadoop product, the MapReduce scaling approach might finally become a standard programmer practice. This is the best paper on the subject and is an excellent primer on a content-addressable memory future. Some interesting stats from the paper: Google executes 100k MapReduce jobs each day; more than 20 petabytes of data are processed per day; more than 10k MapReduce programs have been implemented; machines are dual processor with gigabit ethernet and 4-8 GB of memory. One common criticism ex-Googlers have is that it takes months to get up and be productive in the Google environment. Hopefully a way will be found to lower the learning curve a
5 0.073425114 515 high scalability-2009-02-19-GIS Application Hosting
Introduction: Share the experience of hosting highly scalable/reliable GIS based application which involves Map Server, Spatially enabled database, j2ee, Routing Applications etc.
6 0.073198937 658 high scalability-2009-07-17-Against all the odds
7 0.072150208 733 high scalability-2009-10-29-Paper: No Relation: The Mixed Blessings of Non-Relational Databases
8 0.071732409 1195 high scalability-2012-02-17-Stuff The Internet Says On Scalability For February 17, 2012
9 0.070608035 931 high scalability-2010-10-28-Notes from A NOSQL Evening in Palo Alto
10 0.069207393 1055 high scalability-2011-06-08-Stuff to Watch from Google IO 2011
11 0.067465067 1064 high scalability-2011-06-20-35+ Use Cases for Choosing Your Next NoSQL Database
12 0.067072786 1173 high scalability-2012-01-12-Peregrine - A Map Reduce Framework for Iterative and Pipelined Jobs
13 0.066199385 1187 high scalability-2012-02-03-Stuff The Internet Says On Scalability For February 3, 2012
14 0.066135779 372 high scalability-2008-08-27-Updating distributed web applications
15 0.063968264 1085 high scalability-2011-07-25-Is NoSQL a Premature Optimization that's Worse than Death? Or the Lady Gaga of the Database World?
16 0.063643582 327 high scalability-2008-05-27-How I Learned to Stop Worrying and Love Using a Lot of Disk Space to Scale
17 0.063004129 589 high scalability-2009-05-05-Drop ACID and Think About Data
18 0.062560484 882 high scalability-2010-08-18-Misco: A MapReduce Framework for Mobile Systems - Start of the Ambient Cloud?
20 0.061661988 750 high scalability-2009-12-16-Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud
topicId topicWeight
[(0, 0.114), (1, 0.066), (2, -0.006), (3, 0.041), (4, 0.025), (5, 0.045), (6, -0.018), (7, 0.031), (8, 0.011), (9, -0.004), (10, 0.011), (11, -0.01), (12, -0.008), (13, -0.011), (14, 0.057), (15, -0.016), (16, -0.047), (17, -0.003), (18, 0.009), (19, 0.023), (20, 0.019), (21, -0.038), (22, 0.005), (23, -0.016), (24, 0.025), (25, 0.003), (26, 0.002), (27, 0.007), (28, -0.007), (29, 0.01), (30, 0.014), (31, 0.032), (32, -0.037), (33, 0.016), (34, -0.011), (35, -0.022), (36, 0.021), (37, -0.021), (38, 0.001), (39, 0.024), (40, -0.012), (41, 0.008), (42, -0.024), (43, -0.0), (44, -0.02), (45, -0.001), (46, 0.027), (47, -0.008), (48, -0.004), (49, -0.021)]
simIndex simValue blogId blogTitle
same-blog 1 0.97036999 216 high scalability-2008-01-17-Database People Hating on MapReduce
2 0.75354838 1607 high scalability-2014-03-07-Stuff The Internet Says On Scalability For March 7th, 2014
Introduction: Hey, it's HighScalability time: Twitter valiantly survived an Oscar DDoS attack by non-state actors. Several Billion : Apple iMessages per Day along with 40 billion notifications and 15 to 20 million FaceTime calls. Take that WhatsApp. Their architecture? Hey, this is Apple, only the Shadow knows. 200 bit quantum computer : more states than atoms in the universe; 10 million matches : Tinder's per day catch; $1 billion : Kickstarter's long tail pledge funding achievement Quotable Quotes: @cstross : Let me repeat that: 100,000 ARM processors will cost you a total of $75,000 and probably fit in your jacket pocket. @openflow : "You can no longer separate compute, storage, and networking." -- @vkhosla #ONS2014 @HackerNewsOnion : New node.js co-working space has 1 table and everyone takes turns @chrismunns : we're reaching the point where ease and low cost of doing DDOS attacks means you shouldn't serve anything directly out of your
3 0.75185215 900 high scalability-2010-09-11-Google's Colossus Makes Search Real-time by Dumping MapReduce
Introduction: As the Kings of scaling, when Google changes its search infrastructure over to do something completely different, it's news. In Google search index splits with MapReduce , an exclusive interview by Cade Metz with Eisar Lipkovitz, a senior director of engineering at Google, we learn a bit more of the secret scaling sauce behind Google Instant , Google's new faster, real-time search system. The challenge for Google has been how to support a real-time world when the core of their search technology, the famous MapReduce, is batch oriented. Simple, they got rid of MapReduce. At least they got rid of MapReduce as the backbone for calculating search indexes. MapReduce still excels as a general query mechanism against masses of data, but real-time search requires a very specialized tool, and Google built one. Internally the successor to Google's famed Google File System, was code named Colossus. Details are slowly coming out about their new goals and approach: Goal is to update the
Introduction: A lot of people seem to passionately dislike the term NewSQL , or pretty much any newly coined term for that matter, but after watching Alex Lloyd, Senior Staff Software Engineer Google, give a great talk on Building Spanner , that’s the term that fits Spanner best. Spanner wraps the SQL + transaction model of OldSQL around the reworked bones of a globally distributed NoSQL system. That seems NewSQL to me. As Spanner is a not so distant cousin of BigTable, the NoSQL component should be no surprise. Spanner is charged with spanning millions of machines inside any number of geographically distributed datacenters. What is surprising is how OldSQL has been embraced. In an earlier 2011 talk given by Alex at the HotStorage conference, the reason for embracing OldSQL was the desire to make it easier and faster for programmers to build applications. The main ideas will seem quite familiar: There’s a false dichotomy between little complicated databases and huge, sca
5 0.73881459 1567 high scalability-2013-12-20-Stuff The Internet Says On Scalability For December 20th, 2013
Introduction: Hey, it's HighScalability time (with so much cool info this week it will blow your mind): Amazing microscope image of a carnivorous bladderwort How many drones would it take to replace Santa? With a fleet of some 80 million or so F-16 drones the entire worldwide delivery could be completed in just over eight hours. Impressive, but a world without Rudolf is not a world I wish to contemplate. Quotable Quotes: @Loh : Always wanted to travel back in time to try fighting a younger version of yourself? Software development is the career for you! @mraleph : often devs still approach performance of JS code as if they are riding a horse cart but the horse had long been replaced with fusion reactor @peakscale : "The c3.large is 40% faster and has more than double the memory than the c1.medium but costs about the same" @techmilind : Conversation with an ex-Yahoo, now at a Telecom company. Replaced $22M of Teradata by $450K
6 0.72817338 871 high scalability-2010-08-04-Dremel: Interactive Analysis of Web-Scale Datasets - Data as a Programming Paradigm
7 0.72514755 1292 high scalability-2012-07-27-Stuff The Internet Says On Scalability For July 27, 2012
8 0.724832 1204 high scalability-2012-03-06-Ask For Forgiveness Programming - Or How We'll Program 1000 Cores
9 0.724123 1431 high scalability-2013-03-29-Stuff The Internet Says On Scalability For March 29, 2013
10 0.71773934 1649 high scalability-2014-05-16-Stuff The Internet Says On Scalability For May 16th, 2014
11 0.71731299 1239 high scalability-2012-05-04-Stuff The Internet Says On Scalability For May 4, 2012
12 0.71653122 844 high scalability-2010-06-18-Paper: The Declarative Imperative: Experiences and Conjectures in Distributed Logic
13 0.71459305 1625 high scalability-2014-04-03-Leslie Lamport to Programmers: You're Doing it Wrong
14 0.71342063 1075 high scalability-2011-07-07-Myth: Google Uses Server Farms So You Should Too - Resurrection of the Big-Ass Machines
15 0.70626831 1315 high scalability-2012-08-30-Stuff The Internet Says On Scalability For August 31, 2012
16 0.70549124 1439 high scalability-2013-04-12-Stuff The Internet Says On Scalability For April 12, 2013
17 0.70221657 483 high scalability-2009-01-04-Paper: MapReduce: Simplified Data Processing on Large Clusters
18 0.70087981 1302 high scalability-2012-08-10-Stuff The Internet Says On Scalability For August 10, 2012
19 0.70022267 1233 high scalability-2012-04-25-The Anatomy of Search Technology: blekko’s NoSQL database
20 0.69918287 849 high scalability-2010-06-28-VoltDB Decapitates Six SQL Urban Myths and Delivers Internet Scale OLTP in the Process
topicId topicWeight
[(1, 0.09), (2, 0.096), (30, 0.022), (61, 0.044), (63, 0.362), (79, 0.217), (85, 0.024), (94, 0.063)]
simIndex simValue blogId blogTitle
same-blog 1 0.86297256 216 high scalability-2008-01-17-Database People Hating on MapReduce
2 0.83862889 193 high scalability-2007-12-26-Finding an excellent LAMP developer
Introduction: Hi... I have this idea to start a really great and scalable website, and I am building it! So far I'm doing everything myself - coding, networking, architecture planning, everything. I haven't even gotten into the legal aspects yet....... It would be MUCH easier if I had a technical person to handle that end of the operation. I'm a good coder, but like Bill Gates at Harvard for Math, I'm not the very best. I'd like to FIND that very best person available, to handle the technical aspects. For worse or better, I don't presently know somebody who fits this bill. I've posted a bazillion ads on Craig's List, with no really qualified responses. I've put out feelers among my own network, same result. Not sure what else I can do. Shoestring budget, so it's sweat equity in the beginning. That can actually be a plus, as it forces people to focus. Any ideas about what else I can do, to attract the right person? Thanks Jason
3 0.78451818 647 high scalability-2009-07-02-Hypertable is a New BigTable Clone that Runs on HDFS or KFS
Introduction: Update 3 : Presentation from the NoSQL conference : slides , video 1 , video 2 . Update 2 : The folks at Hypertable would like you to know that Hypertable is now officially sponsored by Baidu , China’s Leading Search Engine. As a sponsor of Hypertable, Baidu has committed an industrious team of engineers, numerous servers, and support resources to improve the quality and development of the open source technology. Update : InfoQ interview on Hypertable Lead Discusses Hadoop and Distributed Databases . Hypertable differs from HBase in that it is a higher performance implementation of Bigtable. Skrentablog gives the heads up on Hypertable , Zvents' open-source BigTable clone. It's written in C++ and can run on top of either HDFS or KFS. Performance looks encouraging at 28M rows of data inserted at a per-node write rate of 7mb/sec .
4 0.65833521 940 high scalability-2010-11-12-Stuff the Internet Says on Scalability For November 12th, 2010
Introduction: Google – A Study In Scalability And A Little Systems Horse Sense. A nice summary by Krishna Sankar of a version of Jeff Dean's classic talk on Google Scalability given to Stanford's EE380 class. Quotable Quotes: @jkalucki : Getting just 100 servers to work together for the first time is so ridiculously complicated. Horizontal scaling doesn't scale. @simeons : Yahoo's scalability is driven by lots of asynchronous processing. "You learn to love it." -- @rstata Yahoo's CTO The Economics of the Cloud: Dissecting a Must-Read White Paper by Bernard Golden. I love the depiction of the unseen and unfelt forces that nevertheless organize everything around them: After a brief introduction, the authors lay out a central thesis: despite initial concerns about shortcomings in new technology offerings, "historically, underlying economics have a much stronger impact on the direction and speed of disruptions, as technological challenges are resolved or overcome thro
5 0.64671749 1613 high scalability-2014-03-17-Intuitively Showing How To Scale a Web Application Using a Coffee Shop as an Example
Introduction: This is a guest repost by Sriram Devadas , Engineer at Vistaprint, Web platform group. A fun and well written analogy of how to scale web applications using a familiar coffee shop as an example. No coffee was harmed during the making of this post. I own a small coffee shop. My expense is proportional to resources 100 square feet of built up area with utilities, 1 barista, 1 cup coffee maker. My shop's capacity Serves 1 customer at a time, takes 3 minutes to brew a cup of coffee, a total of 5 minutes to serve a customer. Since my barista works without breaks and the German made coffee maker never breaks down, my shop's maximum throughput = 12 customers per hour. Web server Customers walk away during peak hours. We only serve one customer at a time. There is no space to wait. I upgrade shop. My new setup is better! Expenses Same area and utilities, 3 baristas, 2 cup coffee maker, 2 chairs Capacity 3 minutes to brew 2 cups of coffee, ~7 minutes to serv
6 0.6454953 999 high scalability-2011-03-04-Stuff The Internet Says On Scalability For March 4, 2011
7 0.63354051 1463 high scalability-2013-05-23-Paper: Calvin: Fast Distributed Transactions for Partitioned Database Systems
8 0.57764959 871 high scalability-2010-08-04-Dremel: Interactive Analysis of Web-Scale Datasets - Data as a Programming Paradigm
9 0.57686156 1403 high scalability-2013-02-08-Stuff The Internet Says On Scalability For February 8, 2013
10 0.57519329 1420 high scalability-2013-03-08-Stuff The Internet Says On Scalability For March 8, 2013
11 0.57456404 784 high scalability-2010-02-25-Paper: High Performance Scalable Data Stores
12 0.57431275 680 high scalability-2009-08-13-Reconnoiter - Large-Scale Trending and Fault-Detection
13 0.57247788 1494 high scalability-2013-07-19-Stuff The Internet Says On Scalability For July 19, 2013
14 0.57169217 1277 high scalability-2012-07-05-10 Golden Principles For Building Successful Mobile-Web Applications
15 0.57123661 448 high scalability-2008-11-22-Google Architecture
16 0.56896615 650 high scalability-2009-07-02-Product: Hbase
17 0.56862521 1100 high scalability-2011-08-18-Paper: The Akamai Network - 61,000 servers, 1,000 networks, 70 countries
18 0.5650484 1169 high scalability-2012-01-05-Shutterfly Saw a Speedup of 500% With Flashcache
19 0.56443393 786 high scalability-2010-03-02-Using the Ambient Cloud as an Application Runtime
20 0.56411099 581 high scalability-2009-04-26-Map-Reduce for Machine Learning on Multicore