Source: html
Introduction: Scalability porn (SFW). Real-time meter for the number of ads being served by DoubleClick. Amazing. A constant ~390,000 impressions a second are being served, and 25 trillion have been served since 1996. Thanks to Mike Rhoads for the title idea. Scalability? Don't worry. Application complexity? Worry by Joe McKendrick. The next challenge on enterprise agendas: application complexity. This is something that lots of hardware, whether from the cloud or an internal data center, cannot fix. Leo Laporte and Steve Gibson talked about how the iPad was a denial of service attack on UPS delivery schedules. UPS trucks were filled with iPads. Cassandra: Fact vs fiction. Jonathan Ellis puts the beatdown on Cassandra misinformation. Don't you dare say Cassandra can't work across datacenters! JIT'd code calling conventions. Cliff Click Jr shows how Java’s calling convention can match compiled C code in speed, but allows for the flexibility of calling (code,slow) non-JIT'd code. Some assembly code re
sentIndex sentText sentNum sentScore
1 Real-time meter for the number of ads being served by DoubleClick. [sent-2, score-0.2]
2 A constant ~390,000 impressions a second are being served, and 25 trillion have been served since 1996. [sent-4, score-0.191]
3 This is something that lots of hardware, whether from the cloud or an internal data center, cannot fix. Leo Laporte and Steve Gibson talked about how the iPad was a denial of service attack on UPS delivery schedules. [sent-11, score-0.158]
4 Don't you dare say Cassandra can't work across datacenters! [sent-15, score-0.092]
5 Cliff Click Jr shows how Java’s calling convention can match compiled C code in speed, but allows for the flexibility of calling (code,slow) non-JIT'd code. [sent-17, score-0.539]
6 James Hamilton : Don’t throw full consistency out too early. [sent-20, score-0.12]
7 For many applications, it is both affordable and helps reduce application implementation errors. [sent-21, score-0.076]
8 Redundant Arrays of Independent Cloud computing providers - keep multiple, redundant copies of all data with multiple cloud providers. [sent-27, score-0.26]
9 Database in cloud – Amazon RDS or MySQL on EBS? [sent-28, score-0.158]
10 The 4-hour weekly downtime was their biggest problem with RDS. [sent-30, score-0.141]
11 The emphasis on scale-out and reducing the cost of joins spans the NoSQL and SQL-based worlds. [sent-32, score-0.085]
12 This particular memcached is running an ancient version of Debian on a three year old dual-Xeon 1U box, and it is using TCP. [sent-41, score-0.186]
13 The biggest cloud on the planet is owned by . [sent-42, score-0.532]
14 The biggest cloud on the planet, the network of computers controlled by the Conficker computer worm. [sent-46, score-0.375]
15 4 million computer systems at 230 top level domains globally, more than 18 million CPUs and 28 terabits per second of bandwidth. [sent-48, score-0.275]
16 In contrast to the index-intensive, set-theoretic operations of relational databases, graph databases make use of index-free, local traversals. [sent-52, score-0.171]
17 This article discusses the graph traversal pattern and its use in computing. [sent-53, score-0.359]
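The contrast drawn in sentences 16 and 17 between index-intensive joins and index-free, local traversals can be illustrated with a small sketch. This is not taken from the linked article: the Node class, the link and traverse helpers, and the sample names are all hypothetical, and the only assumption is that each node keeps direct references to its neighbors so that every hop is a local pointer follow.

```python
from collections import deque


class Node:
    """Hypothetical graph node that stores direct references to its neighbors."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []  # adjacent Node objects, reached without a global index


def link(source, target):
    """Add a directed edge from source to target."""
    source.neighbors.append(target)


def traverse(start, max_depth):
    """Breadth-first walk; returns {node name: hop count} up to max_depth hops."""
    depths = {start.name: 0}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand past the requested depth
        for neighbor in node.neighbors:  # local step: follow a stored reference
            if neighbor.name not in depths:
                depths[neighbor.name] = depth + 1
                queue.append((neighbor, depth + 1))
    return depths


# Illustrative data only.
alice, bob, carol, dave, erin = (Node(n) for n in ("alice", "bob", "carol", "dave", "erin"))
link(alice, bob)
link(alice, carol)
link(bob, dave)
link(carol, dave)
link(dave, erin)

result = traverse(alice, 2)
```

In this sketch the cost of each hop depends on the number of neighbors of the current node, not on the total size of the graph, which is the property the sentence about local traversals points at. A relational version of the same query would typically join an edge table against itself once per hop.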
wordName wordTfidf (topN-words)
[('conficker', 0.268), ('calling', 0.232), ('traversal', 0.188), ('ups', 0.181), ('graph', 0.171), ('cloud', 0.158), ('planet', 0.154), ('biggest', 0.141), ('robert', 0.137), ('oltp', 0.133), ('consistency', 0.12), ('james', 0.119), ('terabits', 0.114), ('trucks', 0.114), ('vaibhav', 0.114), ('northscale', 0.109), ('laporte', 0.109), ('sfw', 0.109), ('served', 0.106), ('gibson', 0.105), ('memcached', 0.104), ('redundant', 0.102), ('porn', 0.101), ('jit', 0.096), ('systemsby', 0.096), ('meter', 0.094), ('dare', 0.092), ('cliff', 0.092), ('jon', 0.092), ('toadvertisea', 0.092), ('usfor', 0.092), ('debian', 0.088), ('simon', 0.086), ('assembly', 0.086), ('spans', 0.085), ('second', 0.085), ('accuracy', 0.084), ('curt', 0.084), ('cassandra', 0.082), ('ancient', 0.082), ('pleasecontact', 0.082), ('jonathan', 0.08), ('owned', 0.079), ('filled', 0.077), ('computer', 0.076), ('affordable', 0.076), ('theorem', 0.076), ('controls', 0.076), ('arrays', 0.075), ('compiled', 0.075)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 806 high scalability-2010-04-08-Hot Scalability Links for April 8, 2010
2 0.17772891 628 high scalability-2009-06-13-Neo4j - a Graph Database that Kicks Buttox
Introduction: Update: Social networks in the database: using a graph database. A nice post on representing, traversing, and performing other common social network operations using a graph database. If you are Digg or LinkedIn you can build your own speedy graph database to represent your complex social network relationships. For those of more modest means Neo4j, a graph database, is a good alternative. A graph is a collection of nodes (things) and edges (relationships) that connect pairs of nodes. Slap properties (key-value pairs) on nodes and relationships and you have a surprisingly powerful way to represent most anything you can think of. In a graph database "relationships are first-class citizens. They connect two nodes and both nodes and relationships can hold an arbitrary amount of key-value pairs. So you can look at a graph database as a key-value store, with full support for relationships." A graph looks something like: For more lovely examples take a look at the Graph Image Gal
3 0.15068585 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
Introduction: It's a truism that we should choose the right tool for the job. Everyone says that. And who can disagree? The problem is this is not helpful advice without being able to answer more specific questions like: What jobs are the tools good at? Will they work on jobs like mine? Is it worth the risk to try something new when all my people know something else and we have a deadline to meet? How can I make all the tools work together? In the NoSQL space this kind of real-world data is still a bit vague. When asked, vendors tend to give very general answers like NoSQL is good for BigData or key-value access. What does that mean for the developer in the trenches faced with the task of solving a specific problem when there are a dozen confusing choices and no obvious winner? Not a lot. It's often hard to take that next step and imagine how their specific problems could be solved in a way that's worth taking the trouble and risk. Let's change that. What problems are you using NoSQL to sol
4 0.13243799 827 high scalability-2010-05-14-Hot Scalability Links for May 14, 2010
Introduction: Lots of good ones this week... Scalability, Availability & Stability Patterns. Jonas Boner has 197 slides covering a very wide range of scalability topics. One stop scalability shopping. Horizontal Scalability via Transient, Shardable, and Share-Nothing Resources. Heroku's Adam Wiggins shares what they've learned about scaling based on their experiences building a cloud platform and the hundreds of apps running on it. He describes the next generation architecture he thinks all software should follow in the future. Scalability of the Hadoop Distributed File System. Konstantin V. Shvachko writes a great post analyzing if the limitations imposed on a distributed file system by the single-node namespace server architecture can support 100,000 clients and petabytes of files. Cassandra by Example. Eric Evans created a nice Cassandra tutorial using a Twitter clone as an example. Many people want to see more data modeling examples. Here you are. UpSizeR: Synthet
5 0.13016878 538 high scalability-2009-03-16-Are Cloud Based Memory Architectures the Next Big Thing?
Introduction: We are on the edge of two potent technological changes: Clouds and Memory Based Architectures. This evolution will rip open a chasm where new players can enter and prosper. Google is the master of disk. You can't beat them at a game they perfected. Disk based databases like SimpleDB and BigTable are complicated beasts, typical last gasp products of any aging technology before a change. The next era is the age of Memory and Cloud which will allow for new players to succeed. The tipping point will be soon. Let's take a short trip down web architecture lane: It's 1993: Yahoo runs on FreeBSD, Apache, Perl scripts and a SQL database It's 1995: Scale-up the database. It's 1998: LAMP It's 1999: Stateless + Load Balanced + Database + SAN It's 2001: In-memory data-grid. It's 2003: Add a caching layer. It's 2004: Add scale-out and partitioning. It's 2005: Add asynchronous job scheduling and maybe a distributed file system. It's 2007: Move it all into the cloud. It's 2008: C
6 0.12704648 835 high scalability-2010-06-03-Hot Scalability Links for June 3, 2010
8 0.12518479 785 high scalability-2010-02-26-MySQL and Memcached: End of an Era?
10 0.12502775 626 high scalability-2009-06-10-Paper: Graph Databases and the Future of Large-Scale Knowledge Management
11 0.1231569 445 high scalability-2008-11-14-Useful Cloud Computing Blogs
14 0.11569713 801 high scalability-2010-03-30-Running Large Graph Algorithms - Evaluation of Current State-of-the-Art and Lessons Learned
16 0.11241044 842 high scalability-2010-06-16-Hot Scalability Links for June 16, 2010
17 0.11221097 797 high scalability-2010-03-19-Hot Scalability Links for March 19, 2010
topicId topicWeight
[(0, 0.218), (1, 0.04), (2, 0.023), (3, 0.095), (4, -0.049), (5, 0.06), (6, -0.073), (7, -0.056), (8, 0.044), (9, 0.01), (10, 0.044), (11, -0.012), (12, -0.047), (13, -0.001), (14, -0.077), (15, -0.008), (16, 0.01), (17, 0.078), (18, 0.023), (19, 0.049), (20, -0.033), (21, 0.012), (22, -0.024), (23, -0.052), (24, 0.017), (25, 0.017), (26, -0.03), (27, 0.075), (28, 0.047), (29, -0.048), (30, -0.007), (31, -0.028), (32, -0.021), (33, 0.009), (34, 0.014), (35, 0.08), (36, 0.003), (37, 0.015), (38, -0.001), (39, 0.038), (40, -0.009), (41, 0.024), (42, -0.034), (43, 0.039), (44, 0.002), (45, -0.065), (46, 0.067), (47, 0.014), (48, -0.024), (49, -0.011)]
simIndex simValue blogId blogTitle
same-blog 1 0.95114613 806 high scalability-2010-04-08-Hot Scalability Links for April 8, 2010
2 0.77499539 797 high scalability-2010-03-19-Hot Scalability Links for March 19, 2010
Introduction: The Changelog Episode 0.1.8 - NoSQL Smackdown! This podcast was recorded at SXSW and features some energetic trash talking by: Stu Hood from Cassandra, Jan Lehnardt from CouchDB, Wynn Netherland from The Changelog, subbing for MongoDB, Werner Vogels CTO at Amazon. It's fun hearing these guys step out of their sober advocacy roles and let loose a little with why they are great and the other products suck, hard. Algorithmic Graph Theory. It's FREE! A GNU-FDL book on algorithmic graph theory by David Joyner, Minh Van Nguyen, and Nathann Cohen. This is an introductory book on algorithmic graph theory. HBase vs Cassandra: why we moved by Dominic Williams. Benchmarking Cloud Serving Systems with YCSB by lots of people from Yahoo! Research. We present the Yahoo! Cloud Serving Benchmark (YCSB) framework, with the goal of facilitating performance comparisons of the new generation of cloud data serving systems. We define a core set of benchmarks and report results for four
3 0.76594329 973 high scalability-2011-01-14-Stuff The Internet Says On Scalability For January 14, 2011
Introduction: Submitted for your reading pleasure... On the new year Twitter set a record with 6,939 Tweets Per Second (TPS). Cool video visualizing New Year's Eve Tweet data across the world. Marko Rodriguez in Memoirs of a Graph Addict: Despair to Redemption tells a stirring tale of how graph programming saved the world from certain destruction by realizing Aristotle's dream of a eudaimonia-driven society. Could a relational database do that? The tools of the revolution can be found at tinkerpop.com, which describes a database-agnostic stack for working with property graphs. They include Blueprints - a property graph model interface; Pipes - a dataflow network using process graphs; Gremlin - a graph based programming language; Rexster - a RESTful graph shell. The never-ending battle of good versus evil has nothing on programmers arguing about bracket policies or sync vs async programming models. In this node.js thread, I love async, but I can't code like this, the batt
4 0.74827582 1283 high scalability-2012-07-13-Stuff The Internet Says On Scalability For July 13, 2012
Introduction: It's HighScalability Time (Good luck today): A Friday the 13th Postmorterama: James Hamilton with some high powered perspective on the report for the Fukushima Nuclear Accident. Apparently they haven't heard of the blameless post-mortem. Lots of interesting stuff, but this is a potentially disaster saving general lesson learned: operators can’t figure out what is happening or take appropriate action without detailed visibility into the state of the system. Evernote with a nicely detailed note on a recent outage . A kernel panic happened while upgrading two new “shard” servers with 3x as much RAM, SSDs instead of 15krpm disks, bonded networking, and an updated kernel. They had to revert and shite loves to happen when other shite happens. Heroku with their postmortem on what happened when AWS went down. They lost 30% of their instances across 3 AZs in the US-East region. Rich detail on the impact of the AWS, but not much on what they can do about it
5 0.73484987 1137 high scalability-2011-11-04-Stuff The Internet Says On Scalability For November 4, 2011
Introduction: You're in good hands with HighScalability: Netflix - Cassandra, AWS, 288 instances, 3.3 million writes per second. Quotable quotes: @bretlowery: "A #DBA walks into a #NoSQL bar, but turns and leaves because he couldn't find a table." @AdanVali: HP to Deploy Memristor Powered SSD Replacement Within 18 Months @eden: Ori Lahav: "When planning scalability, think x100, design x5 and deploy x1.5 of current traffic" @jkalucki: If you are IO bound, start with your checkbook! Everything I Ever Learned About JVM Performance Tuning @Twitter. Learn how to tune your Hotspot and other Javasutra secrets. By moving off the cloud Mixpanel may have lost their angel status. Why would they do such a thing? Read Why We Moved Off The Cloud for the details. The reason for the fall: highly variable performance. Highly variable performance is incredibly hard to code or design around (think a server that normally does 300 queries per second with low I/
6 0.73169446 1036 high scalability-2011-05-06-Stuff The Internet Says On Scalability For May 6th, 2011
7 0.72786325 1093 high scalability-2011-08-05-Stuff The Internet Says On Scalability For August 5, 2011
8 0.72634661 1015 high scalability-2011-04-01-Stuff The Internet Says On Scalability For April 1, 2011
9 0.7194497 842 high scalability-2010-06-16-Hot Scalability Links for June 16, 2010
10 0.70100707 1365 high scalability-2012-11-30-Stuff The Internet Says On Scalability For November 30, 2012
11 0.69490653 766 high scalability-2010-01-26-Product: HyperGraphDB - A Graph Database
12 0.69009691 387 high scalability-2008-09-22-Paper: On Delivering Embarrassingly Distributed Cloud Services
13 0.68577385 1530 high scalability-2013-10-11-Stuff The Internet Says On Scalability For October 11th, 2013
14 0.67730683 1182 high scalability-2012-01-27-Stuff The Internet Says On Scalability For January 27, 2012
15 0.67329437 1327 high scalability-2012-09-21-Stuff The Internet Says On Scalability For September 21, 2012
16 0.67309195 628 high scalability-2009-06-13-Neo4j - a Graph Database that Kicks Buttox
17 0.67244947 744 high scalability-2009-11-24-Hot Scalability Links for Nov 24 2009
18 0.67130339 1109 high scalability-2011-09-02-Stuff The Internet Says On Scalability For September 2, 2011
19 0.67117894 1244 high scalability-2012-05-11-Stuff The Internet Says On Scalability For May 11, 2012
20 0.66996413 1154 high scalability-2011-12-09-Stuff The Internet Says On Scalability For December 9, 2011
topicId topicWeight
[(1, 0.172), (2, 0.143), (10, 0.052), (28, 0.201), (30, 0.034), (47, 0.016), (56, 0.035), (61, 0.087), (77, 0.029), (79, 0.08), (85, 0.038), (94, 0.045)]
simIndex simValue blogId blogTitle
1 0.9221288 1294 high scalability-2012-08-01-Prismatic Update: Machine Learning on Documents and Users
Introduction: In an update to Prismatic Architecture - Using Machine Learning on Social Networks to Figure Out What You Should Read on the Web, Jason Wolfe, even in the face of deadening fatigue from long nights spent getting their iPhone app out, has gallantly agreed to talk a little more about Prismatic's approach to Machine Learning. Documents and users are two areas where Prismatic applies ML (machine learning): ML on Documents Given an HTML document: learn how to extract the main text of the page (rather than the sidebar, footer, comments, etc), its title, author, best images, etc determine features for relevance (e.g., what the article is about, topics, etc.) The setup for most of these tasks is pretty typical. Models are trained using big batch jobs on other machines that read data from s3, save the learned parameter files to s3, and then read (and periodically refresh) the models from s3 in the ingest pipeline. All of the data that flows out of the system can be
2 0.90708268 606 high scalability-2009-05-25-non-sequential, unique identifier, strategy question
Introduction: (Please bear with me, I'm a new, passionate, confident and terrified programmer :D ) Background: I'm pre-launch and 1 year into the development of my application. My target is to be able to eventually handle millions of registered users with 5-10% of them concurrent. Up to this point I've used auto-increment to assign unique identifiers to rows. I am now considering switching to a non-sequential strategy. Oh, I'm using the LAMP configuration. My reasons for avoiding auto-increment: 1. Complicates replication when scaling horizontally. Risk of collision is significant (when running multiple masters). Note: I've read the other entries in this forum that relate to ID generation and there have been some great suggestions -- including a strategy that uses auto-increment in a way that avoids this pitfall... That said, I'm still nervous about it. 2. Potential bottleneck when retrieving/assigning IDs -- IDs assigned at the database. My reasons for being nervous about
same-blog 3 0.89304274 806 high scalability-2010-04-08-Hot Scalability Links for April 8, 2010
4 0.86511457 1506 high scalability-2013-08-23-Stuff The Internet Says On Scalability For August 23, 2013
Introduction: Hey, it's HighScalability time: ( Parkour is to terrain as programming is to frameworks ) 5x : AWS vs combined size of other cloud vendors; Every Second on The Internet : Why we need so many servers. Quotable Quotes: @chaliy : Today I learned that I do not understand how #azure scaling works, instance scale does not affect requests/sec I can load. @Lariar : Note how crazy this is. An international launch would have been a huge deal. Now it's just another thing you do. smacktoward : The problem with relying on donations is that people don't make donations. @toddhoffious : Programming is a tool built by logical positivists to solve the problems of idealists and pragmatists. We have a fundamental mismatch here. @etherealmind : Me: "Weird, my phone data isn't working" Them: "They turned the 3G off at the tower because it interferes with the particle accelerator" John Carmack : In computer science, just about t
5 0.84992391 903 high scalability-2010-09-17-Hot Scalability Links For Sep 17, 2010
Introduction: Disqus - Scaling the World's Largest Django App. Interesting overview of a commenting system with 75 million comments and 250 million visitors. Lots of good details on how they partition their database, testing, continuous integration, feature switches, caching, delayed signals, and more. Things I learnt tracking a billion events in 24 hours: Know your host, Scaling isn't just servers, My servers need to talk to me more, Kill switches for users, What you don't know is the problem, Don't mix server roles, Know your most important users outside of your site. Tweets of Gold: georgebarnett: I read High Scalability for useful articles about large scaling. Sadly though, nothing useful ever shows up. #NoLongerBothering northscale: wow that is fast! :) RT @cgoldberg: was just running > 100k ops/sec against my 2-node #Membase cluster... zazooom #nosql turbofunctor: The root of many (horizontal) scalability problems is an application level access to a writab
7 0.82751966 562 high scalability-2009-04-10-Facebook's Aditya giving presentation on Facebook Architecture
8 0.82695812 630 high scalability-2009-06-14-kngine 'Knowledge Engine' milestone 2
9 0.807823 840 high scalability-2010-06-10-The Four Meta Secrets of Scaling at Facebook
10 0.80143034 1261 high scalability-2012-06-08-Stuff The Internet Says On Scalability For June 8, 2012
11 0.80074602 1395 high scalability-2013-01-28-DuckDuckGo Architecture - 1 Million Deep Searches a Day and Growing
12 0.79908037 304 high scalability-2008-04-19-How to build a real-time analytics system?
13 0.78721666 1611 high scalability-2014-03-12-Paper: Scalable Eventually Consistent Counters over Unreliable Networks
14 0.78220975 1557 high scalability-2013-12-02-Evolution of Bazaarvoice’s Architecture to 500M Unique Users Per Month
15 0.78022587 918 high scalability-2010-10-12-The CIO’s Problem: Cloud “Mess” or Cloud “Mash”
16 0.78005558 853 high scalability-2010-07-08-Cloud AWS Infrastructure vs. Physical Infrastructure
17 0.77988619 106 high scalability-2007-10-02-Secrets to Fotolog's Scaling Success
18 0.77882379 1189 high scalability-2012-02-07-Hypertable Routs HBase in Performance Test -- HBase Overwhelmed by Garbage Collection
19 0.77812505 1093 high scalability-2011-08-05-Stuff The Internet Says On Scalability For August 5, 2011
20 0.776191 752 high scalability-2009-12-17-Oracle and IBM databases: Disk-based vs In-memory databases