high_scalability high_scalability-2007 high_scalability-2007-197 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: From http://directory.fsf.org/project/collectd/ : 'collectd' is a small daemon which collects system information every 10 seconds and writes the results to an RRD-file. The statistics gathered include: CPU and memory usage, system load, network latency (ping), network interface traffic, system temperatures (using lm-sensors), and disk usage. 'collectd' is not a script; it is written in C for performance and portability. It stays in memory, so there is no need to start up a heavy interpreter every time new values should be logged. From the collectd website: Collectd gathers information about the system it is running on and stores this information. The information can then be used to find current performance bottlenecks (i.e. performance analysis) and predict future system load (i.e. capacity planning). Or if you just want pretty graphs of your private server and are fed up with some homegrown solution, you're at the right place, too ;). While collectd can do a lot for you and your administrative needs, there are limits to what it does: it does not generate graphs. It can write to RRD-files, but it cannot generate graphs from these files. There's a tiny sample script included in contrib/, though, and you can have a look at drraw for a generic solution to generating graphs from RRD-files. The data is collected and stored, but not interpreted and acted upon; there is, however, a plugin for Nagios, so Nagios can use the values collected by collectd. It's reportedly a reliable product that doesn't cause a lot of load on your system, which enables you to collect data at a faster rate so you can detect problems earlier.
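To make the description above concrete, here is a minimal sketch (not from the original post) of reading back the data collectd writes. It assumes the python rrdtool bindings are installed and that collectd's rrdtool plugin stores its files under /var/lib/collectd/rrd; both the path and the data-source names are assumptions that will vary with your configuration.

```python
# Minimal sketch: read the last 10 minutes of samples from an RRD file
# written by collectd. The path below is hypothetical; adjust it to match
# the DataDir configured for collectd's rrdtool plugin on your system.
import rrdtool

RRD_PATH = "/var/lib/collectd/rrd/myhost/load/load.rrd"  # assumed path

# rrdtool.fetch mirrors the 'rrdtool fetch' command line: it returns the
# resolved time range, the data-source names, and the rows of values.
(start, end, step), ds_names, rows = rrdtool.fetch(
    RRD_PATH, "AVERAGE", "--start", "-600")

for i, row in enumerate(rows):
    timestamp = start + i * step
    sample = dict(zip(ds_names, row))  # e.g. {'shortterm': 0.12, ...} for load
    if any(value is not None for value in row):
        print(timestamp, sample)
```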
sentIndex sentText sentNum sentScore
1 From http://directory.fsf.org/project/collectd/ : 'collectd' is a small daemon which collects system information every 10 seconds and writes the results to an RRD-file. [sent-3, score-0.44]
2 The statistics gathered include: CPU and memory usage, system load, network latency (ping), network interface traffic, system temperatures (using lm-sensors), and disk usage. [sent-4, score-0.603]
3 'collectd' is not a script; it is written in C for performance and portability. [sent-5, score-0.054]
4 It stays in memory, so there is no need to start up a heavy interpreter every time new values should be logged. [sent-6, score-0.506]
5 From the collectd website: Collectd gathers information about the system it is running on and stores this information. [sent-7, score-1.018]
6 The information can then be used to find current performance bottlenecks (i.e. performance analysis) and predict future system load (i.e. capacity planning). [sent-8, score-0.206]
7 Or if you just want pretty graphs of your private server and are fed up with some homegrown solution you're at the right place, too ;). [sent-13, score-0.537]
8 While collectd can do a lot for you and your administrative needs, there are limits to what it does: it does not generate graphs. [sent-14, score-0.999]
9 It can write to RRD-files, but it cannot generate graphs from these files. [sent-15, score-0.381]
10 There's a tiny sample script included in contrib/, though. [sent-16, score-0.441]
11 Also you can have a look at drraw for a generic solution to generate graphs from RRD-files; a minimal graphing sketch follows this list. [sent-17, score-0.694]
12 The data is collected and stored, but not interpreted and acted upon. [sent-19, score-0.444]
13 There's a plugin for Nagios, so it can use the values collected by collectd, though. [sent-20, score-0.43]
14 It's reportedly a reliable product that doesn't cause a lot of load on your system. [sent-21, score-0.301]
15 This enables you to collect data at a faster rate so you can detect problems earlier. [sent-22, score-0.288]
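Since collectd only writes the RRD files and does not render graphs itself, something else has to turn them into pictures; the post points at the contrib/ sample script and drraw for that. As an alternative illustration, here is a minimal sketch using the python rrdtool bindings; the RRD path and the 'shortterm' data-source name are assumptions based on a typical load RRD and may differ on your system.

```python
# Minimal graphing sketch: render a PNG of the last hour of load data from
# an RRD file written by collectd. Path and data-source name are assumed;
# collectd's load RRDs commonly expose 'shortterm', 'midterm' and 'longterm'.
import rrdtool

RRD_PATH = "/var/lib/collectd/rrd/myhost/load/load.rrd"  # assumed path

rrdtool.graph(
    "load-last-hour.png",
    "--start", "-1h",
    "--title", "Load average (short term)",
    "--vertical-label", "load",
    "DEF:load=%s:shortterm:AVERAGE" % RRD_PATH,
    "LINE2:load#0000FF:1 min load",
)
```

The output is a plain PNG, which is roughly what drraw or the contrib/ script would give you, just without a web front end.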
wordName wordTfidf (topN-words)
[('collectd', 0.627), ('generate', 0.204), ('script', 0.18), ('collected', 0.179), ('graphs', 0.177), ('drraw', 0.157), ('gathers', 0.157), ('temperatures', 0.147), ('acted', 0.141), ('interpreter', 0.141), ('values', 0.138), ('homegrown', 0.127), ('interpreted', 0.124), ('reportedly', 0.115), ('plugin', 0.113), ('ping', 0.113), ('gathered', 0.108), ('collects', 0.108), ('stays', 0.104), ('administrative', 0.103), ('fed', 0.098), ('information', 0.096), ('daemon', 0.095), ('nagios', 0.091), ('included', 0.09), ('predict', 0.089), ('generic', 0.087), ('collect', 0.086), ('sample', 0.086), ('tiny', 0.085), ('earlier', 0.084), ('detect', 0.082), ('system', 0.08), ('load', 0.076), ('statistics', 0.074), ('solution', 0.069), ('enables', 0.068), ('private', 0.066), ('limits', 0.065), ('planning', 0.065), ('heavy', 0.062), ('memory', 0.061), ('seconds', 0.061), ('stores', 0.058), ('bottlenecks', 0.056), ('cause', 0.056), ('reliable', 0.054), ('performance', 0.054), ('interface', 0.053), ('rate', 0.052)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999988 197 high scalability-2007-12-31-Product: collectd
2 0.078199811 484 high scalability-2009-01-05-Lessons Learned at 208K: Towards Debugging Millions of Cores
Introduction: How do we debug and profile a cloud full of processors and threads? It's a problem more will be seeing as we code big scary programs that run on even bigger scarier clouds. Logging gets you far, but sometimes finding the root cause of a problem requires delving deep into a program's execution. I don't know about you, but setting up 200,000+ gdb instances doesn't sound all that appealing. Tools like STAT (Stack Trace Analysis Tool) are being developed to help with this huge task. STAT "gathers and merges stack traces from a parallel application’s processes." So STAT isn't a low level debugger, but it will help you find the needle in a million haystacks. Abstract: Petascale systems will present several new challenges to performance and correctness tools. Such machines may contain millions of cores, requiring that tools use scalable data structures and analysis algorithms to collect and to process application data. In addition, at such scales, each tool itself will become a large paralle
Introduction: Update 2: Velocity 09: John Allspaw, 10+ Deploys Per Day: Dev and Ops Cooperation at Flickr . Insightful talk. Some highlights: Change is good if you can build tools and culture to lower the risk of change. Operations and developers need to become of one mind and respect each other. An automated infrastructure is the one tool you need most. Common source control. One step build. One step deploy. Don't be a pussy, deploy. Always ship trunk. Feature flags - don't branch code, make features runtime configurable in code. Dark launch - release data paths early without UI component. Shared metrics. Adaptive feedback to prioritize important features. IRC for communication for human context. Best solutions occur when dev and op work together and trust each other. Trust is earned by helping each other solve their problems. Look at what new features imply for operations, what can go wrong, and how to recover. Provide knobs and levers to help operations. Devs should have access to production
4 0.075640619 1501 high scalability-2013-08-13-In Memoriam: Lavabit Architecture - Creating a Scalable Email Service
Introduction: With Lavabit shutting down under murky circumstances , it seems fitting to repost an old (2009), yet still very good post by Ladar Levison on Lavabit's architecture. I don't know how much of this information is still current, but it should give you a general idea what Lavabit was all about. Getting to Know You What is the name of your system and where can we find out more about it? Note: these links are no longer valid... Lavabit http://lavabit.com http://lavabit.com/network.html http://lavabit.com/about.html What is your system for? Lavabit is a mid-sized email service provider. We currently have about 140,000 registered users with more than 260,000 email addresses. While most of our accounts belong to individual users, we also provide corporate email services to approximately 70 companies. Why did you decide to build this system? We built the system to compete against the other large free email providers, with an emphasis on serving the privacy c
5 0.074923158 233 high scalability-2008-01-30-How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data
Introduction: How do you query hundreds of gigabytes of new data each day streaming in from over 600 hyperactive servers? If you think this sounds like the perfect battle ground for a head-to-head skirmish in the great MapReduce Versus Database War , you would be correct. Bill Boebel, CTO of Mailtrust (Rackspace's mail division), has generously provided a fascinating account of how they evolved their log processing system from an early amoeba'ic text file stored on each machine approach, to a Neandertholic relational database solution that just couldn't compete, and finally to a Homo sapien'ic Hadoop based solution that works wisely for them and has virtually unlimited scalability potential. Rackspace faced a now familiar problem. Lots and lots of data streaming in. Where do you store all that data? How do you do anything useful with it? In the first version of their system logs were stored in flat text files and had to be manually searched by engineers logging into each individual machine. T
6 0.071158543 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
7 0.070542715 920 high scalability-2010-10-15-Troubles with Sharding - What can we learn from the Foursquare Incident?
8 0.06898842 1494 high scalability-2013-07-19-Stuff The Internet Says On Scalability For July 19, 2013
9 0.068279773 801 high scalability-2010-03-30-Running Large Graph Algorithms - Evaluation of Current State-of-the-Art and Lessons Learned
10 0.068250827 308 high scalability-2008-04-22-Simple NFS failover solution with symbolic link?
11 0.066891447 788 high scalability-2010-03-04-How MySpace Tested Their Live Site with 1 Million Concurrent Users
12 0.066706583 661 high scalability-2009-07-25-Latency is Everywhere and it Costs You Sales - How to Crush it
13 0.06575568 1413 high scalability-2013-02-27-42 Monster Problems that Attack as Loads Increase
14 0.063645415 1331 high scalability-2012-10-02-An Epic TripAdvisor Update: Why Not Run on the Cloud? The Grand Experiment.
15 0.062699363 1136 high scalability-2011-11-03-Paper: G2 : A Graph Processing System for Diagnosing Distributed Systems
16 0.062431943 1008 high scalability-2011-03-22-Facebook's New Realtime Analytics System: HBase to Process 20 Billion Events Per Day
17 0.06038972 808 high scalability-2010-04-12-Poppen.de Architecture
18 0.060260393 961 high scalability-2010-12-21-SQL + NoSQL = Yes !
19 0.059885655 538 high scalability-2009-03-16-Are Cloud Based Memory Architectures the Next Big Thing?
20 0.058836598 1615 high scalability-2014-03-19-Strategy: Three Techniques to Survive Traffic Surges by Quickly Scaling Your Site
topicId topicWeight
[(0, 0.108), (1, 0.042), (2, -0.017), (3, -0.04), (4, -0.012), (5, 0.018), (6, 0.05), (7, 0.021), (8, -0.015), (9, 0.008), (10, 0.012), (11, -0.013), (12, 0.019), (13, -0.011), (14, 0.027), (15, -0.012), (16, 0.012), (17, 0.026), (18, -0.025), (19, 0.036), (20, -0.023), (21, -0.021), (22, -0.003), (23, 0.048), (24, -0.006), (25, 0.0), (26, -0.034), (27, 0.015), (28, -0.036), (29, 0.007), (30, -0.005), (31, -0.018), (32, 0.015), (33, 0.0), (34, -0.009), (35, 0.031), (36, -0.016), (37, 0.006), (38, -0.034), (39, -0.011), (40, 0.01), (41, 0.002), (42, 0.017), (43, 0.034), (44, 0.002), (45, -0.002), (46, -0.014), (47, 0.002), (48, 0.037), (49, -0.002)]
simIndex simValue blogId blogTitle
same-blog 1 0.95051545 197 high scalability-2007-12-31-Product: collectd
2 0.81847358 237 high scalability-2008-02-03-Product: Collectl - Performance Data Collector
Introduction: From their website : There are a number of times when you find yourself needing performance data. These can include benchmarking, monitoring a system's general health or trying to determine what your system was doing at some time in the past. Sometimes you just want to know what the system is doing right now. Depending on what you're doing, you often end up using different tools, each designed for that specific situation. Features include: You are able to run with non-integral sampling intervals. Collectl uses very little CPU. In fact it has been measured to use <0.1% when run as a daemon using the default sampling interval of 60 seconds for process and slab data and 10 seconds for everything else. Brief, verbose, and plot formats are supported. You can report aggregated performance numbers on many devices such as CPUs, Disks, interconnects such as Infiniband or Quadrics, Networks or even Lustre file systems. Collectl will align its sampling on integral sec
Introduction: Todd had originally posted an entry on collectl here at Collectl - Performance Data Collector . Collectl collects real-time data from a large number of subsystems like buddyinfo, cpu, disk, inodes, infiniband, lustre, memory, network, nfs, processes, quadrics, slabs, sockets and tcp, all using one tool and in one consistent format. Since then a lot has happened. It's now part of both Fedora and Debian distros, not to mention several others. There has also been a pretty good summary written up by Joe Brockmeier . It's also pretty well documented (I like to think) on sourceforge . There have also been a few blog postings by Martin Bach on his blog. Anyhow, awhile back I released a new version of collectl-utils and gave a complete face-lift to one of the utilities, colmux, which is a collectl multiplexor. This tool has the ability to run collectl on multiple systems, which in turn send all their output back to colmux. Colmux then sorts the output on a user-specified column
4 0.72817194 1250 high scalability-2012-05-23-Averages, web performance data, and how your analytics product is lying to you
Introduction: This guest post is written by Josh Fraser , co-founder and CEO of Torbit . Torbit creates tools for measuring, analyzing and optimizing web performance. Did you know that 5% of the pageviews on Walmart.com take over 20 seconds to load? Walmart discovered this recently after adding real user measurement (RUM) to analyze their web performance for every single visitor to their site. Walmart used JavaScript to measure their median load time as well as key metrics like their 95th percentile. While 20 seconds is a long time to wait for a website to load, the Walmart story is actually not that uncommon. Remember, this is the worst 5% of their pageviews, not the typical experience. Walmart's median load time was reported at around 4 seconds, meaning half of their visitors loaded Walmart.com faster than 4 seconds and the other half took longer than 4 seconds to load. Using this knowledge, Walmart was prepared to act. By reducing page load times by even one second, Walmart found that
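To make the median versus 95th-percentile distinction above concrete, here is a small sketch; the numbers are made up and the function is a simple nearest-rank percentile, not Walmart's actual RUM pipeline, which runs as JavaScript in the browser and aggregates millions of beacons.

```python
# Small sketch: median and 95th percentile of page-load times in seconds.
# The sample values are invented for illustration.
def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already-sorted list."""
    rank = max(1, int(round(pct / 100.0 * len(sorted_values))))
    return sorted_values[rank - 1]

load_times = sorted([1.8, 2.4, 3.1, 3.9, 4.1, 4.6, 5.2, 7.0, 9.5, 21.3])

median = percentile(load_times, 50)
p95 = percentile(load_times, 95)

print("median: %.1fs  p95: %.1fs" % (median, p95))
# A median around 4s with a 95th percentile over 20s is exactly the kind of
# long tail the paragraph above is describing.
```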
5 0.7036075 1362 high scalability-2012-11-26-BigData using Erlang, C and Lisp to Fight the Tsunami of Mobile Data
Introduction: This is a guest post by Jon Vlachogiannis . Jon is the founder and CTO of BugSense . BugSense is an error-reporting and quality metrics service that tracks thousands of apps every day. When mobile apps crash, BugSense helps developers pinpoint and fix the problem. The startup delivers first-class service to its customers, which include VMWare, Samsung, Skype and thousands of independent app developers. Tracking more than 200M devices requires fast, fault tolerant and cheap infrastructure. Over the last six months, we’ve decided to use our BigData infrastructure to provide users with metrics about their apps' performance and stability and let them know how the errors affect their user base and revenues. We knew that our solution should be scalable from day one, because more than 4% of the smartphones out there will start DDOSing us with data. We wanted to be able to: Abstract the application logic and feed browsers with JSON Run complex algorithms on the fly Expe
6 0.7024169 77 high scalability-2007-08-30-Log Everything All the Time
7 0.70232904 1279 high scalability-2012-07-09-Data Replication in NoSQL Databases
8 0.69894284 1038 high scalability-2011-05-11-Troubleshooting response time problems – why you cannot trust your system metrics
9 0.6984874 680 high scalability-2009-08-13-Reconnoiter - Large-Scale Trending and Fault-Detection
10 0.69684017 1386 high scalability-2013-01-14-MongoDB and GridFS for Inter and Intra Datacenter Data Replication
11 0.69606769 558 high scalability-2009-04-06-How do you monitor the performance of your cluster?
12 0.69404501 1568 high scalability-2013-12-23-What Happens While Your Brain Sleeps is Surprisingly Like How Computers Stay Sane
13 0.69340897 920 high scalability-2010-10-15-Troubles with Sharding - What can we learn from the Foursquare Incident?
14 0.68694752 221 high scalability-2008-01-24-Mailinator Architecture
15 0.68512946 1222 high scalability-2012-04-05-Big Data Counting: How to count a billion distinct objects using only 1.5KB of Memory
16 0.68482226 553 high scalability-2009-04-03-Collectl interface to Ganglia - any interest?
17 0.67942446 1413 high scalability-2013-02-27-42 Monster Problems that Attack as Loads Increase
18 0.67783672 1408 high scalability-2013-02-19-Puppet monitoring: how to monitor the success or failure of Puppet runs
19 0.67697239 1494 high scalability-2013-07-19-Stuff The Internet Says On Scalability For July 19, 2013
20 0.67136562 971 high scalability-2011-01-10-Riak's Bitcask - A Log-Structured Hash Table for Fast Key-Value Data
topicId topicWeight
[(1, 0.173), (2, 0.243), (10, 0.012), (61, 0.029), (91, 0.397), (94, 0.027)]
simIndex simValue blogId blogTitle
1 0.92095619 921 high scalability-2010-10-18-NoCAP
Introduction: In this post I wanted to spend some time on the CAP theorem and clarify some of the confusion that I often see when people associate CAP with scalability without fully understanding the implications that come with it, as well as the alternative approaches. You can read the full article here
2 0.79233187 722 high scalability-2009-10-15-Hot Scalability Links for Oct 15 2009
Introduction: Update: Social networks in the database: using a graph database . Anders Nawroth puts graphs through their paces by representing, traversing, and performing other common social network operations using a graph database. Update: Deployment with Capistrano by Charles Max Wood. Simple step-by-step for using Capistrano for deployment. Log-structured file systems: There's one in every SSD by Valerie Aurora. SSDs have totally changed the performance characteristics of storage! Disks are dead! Long live flash! An Engineer's Guide to Bandwidth by DGentry. It's a rough world out there, and we need to do a better job of thinking about and testing under realistic network conditions. Analyzing air traffic performance with InfoBright and MonetDB by Vadim of the MySQL Performance Blog. Scalable Delivery of Stream Query Result by Zhou, Y ; Salehi, A ; Aberer, K. In this paper, we leverage Distributed Publish/Subscribe System (DPSS), a scalable data dissemination infrastruct
same-blog 3 0.75717288 197 high scalability-2007-12-31-Product: collectd
4 0.72356021 712 high scalability-2009-10-01-Moving Beyond End-to-End Path Information to Optimize CDN Performance
Introduction: You go through the expense of installing CDNs all over the globe to make sure users always have a node close by and you notice something curious and furious: clients still experience poor latencies. What's up with that? What do you do to find the problem? If you are Google you build a tool (WhyHigh) to figure out what's up. This paper is about the tool and the unexpected problem of high latencies on CDNs. The main problems they found: inefficient routing to nearby nodes and packet queuing. But more useful is the architecture of WhyHigh and how it goes about identifying bottlenecks. And even more useful is the general belief in creating sophisticated tools to understand and improve your service. That's what professionals do. From the abstract: Replicating content across a geographically distributed set of servers and redirecting clients to the closest server in terms of latency has emerged as a common paradigm for improving client performance. In this paper, we analyze latenc
5 0.70842993 1285 high scalability-2012-07-18-Disks Ain't Dead Yet: GraphChi - a disk-based large-scale graph computation
Introduction: GraphChi uses a Parallel Sliding Windows method which can: process a graph with mutable edge values efficiently from disk, with only a small number of non-sequential disk accesses, while supporting the asynchronous model of computation. The result is that graphs with billions of edges can be processed on just a single machine. It uses a vertex-centric computation model similar to Pregel , which supports iterative algorithms as opposed to the batch style of MapReduce. Streaming graph updates are supported. About GraphChi, Carlos Guestrin, codirector of Carnegie Mellon's Select Lab, says : A Mac Mini running GraphChi can analyze Twitter's social graph from 2010—which contains 40 million users and 1.2 billion connections—in 59 minutes. "The previous published result on this problem took 400 minutes using a cluster of about 1,000 computers Related Articles Aapo Kyrola Home Page Your Laptop Can Now Analyze Big Data by JOHN PAVLUS Example Applications Runn
6 0.70609146 453 high scalability-2008-12-01-Breakthrough Web-Tier Solutions with Record-Breaking Performance
7 0.66945153 826 high scalability-2010-05-12-The Rise of the Virtual Cellular Machines
8 0.6608876 356 high scalability-2008-07-22-Scaling Bumper Sticker: A 1 Billion Page Per Month Facebook RoR App
9 0.66025966 642 high scalability-2009-06-29-HighScalability Rated #3 Blog for Developers
10 0.65946543 632 high scalability-2009-06-15-starting small with growth in mind
11 0.65395677 742 high scalability-2009-11-17-10 eBay Secrets for Planet Wide Scaling
12 0.64918649 1338 high scalability-2012-10-11-RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store
13 0.64758408 167 high scalability-2007-11-27-Starting a website from scratch - what technologies should I use?
14 0.63333917 651 high scalability-2009-07-02-Product: Project Voldemort - A Distributed Database
15 0.62684536 1053 high scalability-2011-06-06-Apple iCloud: Syncing and Distributed Storage Over Streaming and Centralized Storage
16 0.61032957 1209 high scalability-2012-03-14-The Azure Outage: Time Is a SPOF, Leap Day Doubly So
17 0.60488296 1093 high scalability-2011-08-05-Stuff The Internet Says On Scalability For August 5, 2011
18 0.60066968 1110 high scalability-2011-09-06-Big Data Application Platform
19 0.59697485 754 high scalability-2009-12-22-Incremental deployment
20 0.59605968 1633 high scalability-2014-04-16-Six Lessons Learned the Hard Way About Scaling a Million User System