high_scalability high_scalability-2009 high_scalability-2009-553 knowledge-graph by maker-knowledge-mining

553 high scalability-2009-04-03-Collectl interface to Ganglia - any interest?


meta info for this blog

Source: html

Introduction: It's been a while since I've said anything about collectl and I wanted to let this group know I'm currently working on an interface to ganglia, since I've seen a variety of posts ranging from how much data to log and where to log it, as well as which tools/mechanisms to use for logging. From my perspective there are essentially 2 camps on the monitoring front - one says to have distributed agents all sending their data to a central point, but don't send too much or too often. The other camp (which is the one I'm in) says do it all locally with a highly efficient data collector, because you need a lot of data (I also read a post in here about logging everything) and you can't possibly monitor 100s or 1000s of nodes remotely at the granularity necessary to get anything meaningful. Enter collectl and its evolving interface for ganglia. This will allow you to log lots of detailed data on local nodes at the usual 10 sec interval (or more frequently if you prefer) at about 0.1% system overhead, while sending a subset at a lower rate to the ganglia gmonds.


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 It's been a while since I've said anything about collectl and I wanted to let this group know I'm currently working on an interface to ganglia since I've seen a variety of posts ranging from how much data to log and where to log it as well as which tools/mechanisms to use for logging. [sent-1, score-2.296]

2 From my perspective there are essentially 2 camps on the monitoring front - one says to have distributed agents all sending their data to a central point, but don't send too much or too often. [sent-2, score-0.984]

3 Enter collectl and its evolving interface for ganglia. [sent-4, score-0.595]

4 This will allow you to log lots of detailed data on local nodes at the usual 10 sec interval (or more frequent if you prefer) at about 0. [sent-5, score-1.001]

5 1% system overhead while sending a subset at a lower rate to the ganglia gmonds. [sent-6, score-0.639]

6 This would give you the best of both worlds but I don't know if people are too married to the centralized concept to try something different. [sent-7, score-0.635]

7 I don't know how many people who follow this forum have actually tried it, I know at least a few of you have, but to learn more just go to http://collectl. [sent-8, score-0.558]

8 net/ and look at some of the documentation or just download the rpm and type 'collectl'. [sent-10, score-0.315]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('collectl', 0.339), ('ganglia', 0.269), ('log', 0.201), ('married', 0.19), ('sending', 0.183), ('camps', 0.169), ('sec', 0.169), ('camp', 0.165), ('interval', 0.148), ('awhile', 0.146), ('remotely', 0.144), ('granularity', 0.14), ('interface', 0.138), ('agents', 0.138), ('says', 0.136), ('know', 0.135), ('anything', 0.129), ('collector', 0.126), ('rpm', 0.125), ('monitors', 0.125), ('worlds', 0.125), ('evolving', 0.118), ('forum', 0.115), ('documentation', 0.114), ('frequent', 0.114), ('nodes', 0.112), ('prefer', 0.11), ('since', 0.109), ('locally', 0.107), ('ranging', 0.106), ('subset', 0.104), ('essentially', 0.102), ('usual', 0.1), ('logging', 0.098), ('centralized', 0.097), ('possibly', 0.097), ('tried', 0.093), ('perspective', 0.092), ('central', 0.088), ('concept', 0.088), ('posts', 0.088), ('variety', 0.087), ('wanted', 0.085), ('overhead', 0.083), ('detailed', 0.081), ('follow', 0.08), ('necessary', 0.079), ('said', 0.078), ('data', 0.076), ('download', 0.076)]
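The (word, weight) pairs above are standard tf-idf scores. As a rough illustration of how such weights might be produced, here is a minimal pure-Python sketch over a hypothetical two-document toy corpus (this is not the actual pipeline behind the numbers above):

```python
import math
from collections import Counter

def tfidf_top_words(docs, top_n=5):
    """Return the top-n (word, tf-idf) pairs for each document."""
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # document frequency: in how many documents each word appears
    df = Counter(w for toks in tokenized for w in set(toks))
    results = []
    for toks in tokenized:
        tf = Counter(toks)
        # term frequency scaled by inverse document frequency
        scores = {w: (tf[w] / len(toks)) * math.log(n_docs / df[w])
                  for w in tf}
        results.append(sorted(scores.items(), key=lambda kv: -kv[1])[:top_n])
    return results

# hypothetical toy corpus for illustration only
docs = [
    "collectl sends data to ganglia gmonds",
    "centralized agents send data to a central point",
]
top = tfidf_top_words(docs, top_n=3)
```

Words shared by every document (like "data" here) get an idf of zero and drop out of the top list, which is why distinctive terms like 'collectl' and 'ganglia' dominate the weights above.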

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 553 high scalability-2009-04-03-Collectl interface to Ganglia - any interest?


2 0.36072877 558 high scalability-2009-04-06-How do you monitor the performance of your cluster?

Introduction: I had posted a note the other day about collectl and its ganglia interface but perhaps I wasn't provocative enough to get any responses so let me ask it a different way, specifically how do people monitor their clusters and more importantly how often? Do you monitor to get a general sense of what the system is doing OR do you monitor with the expectation that when something goes wrong you'll have enough data to diagnose the problem? Or both? I suspect both... Many cluster-based monitoring tools tend to have a data collection daemon running on each target node which periodically sends data to some central management station. That machine typically writes the data to some database from which it can then extract historical plots. Some even put up graphics in real time. From my experience working with large clusters - and I'm talking many hundreds or even 1000s of nodes - most have to limit both the amount of data they manage centrally as well as the frequency that they

3 0.18829516 77 high scalability-2007-08-30-Log Everything All the Time

Introduction: This JoelOnSoftware thread asks the age old question of what and how to log. The usual trace/error/warning/info advice is totally useless in a large scale distributed system. Instead, you need to log everything all the time so you can solve problems that have already happened across a potentially huge range of servers. Yes, it can be done. To see why the typical logging approach is broken, imagine this scenario: Your site has been up and running great for weeks. No problems. A foreshadowing beeper goes off at 2AM. It seems some users can no longer add comments to threads. Then you hear the debugging deathknell: it's an intermittent problem and customers are pissed. Fix it. Now. So how are you going to debug this? The monitoring system doesn't show any obvious problems or errors. You quickly post a comment and it works fine. This won't be easy. So you think. Commenting involves a bunch of servers and networks. There's the load balancer, spam filter, web server, database server,

4 0.17951819 1104 high scalability-2011-08-25-Colmux - Finding Memory Leaks, High I-O Wait Times, and Hotness on 3000 Node Clusters

Introduction: Todd had originally posted an entry on collectl here at Collectl - Performance Data Collector . Collectl collects real-time data from a large number of subsystems like buddyinfo, cpu, disk, inodes, infiniband, lustre, memory, network, nfs, processes, quadrics, slabs, sockets and tcp, all using one tool and in one consistent format. Since then a lot has happened. It's now part of both the Fedora and Debian distros, not to mention several others. There has also been a pretty good summary written up by Joe Brockmeier . It's also pretty well documented (I like to think) on sourceforge . There have also been a few blog postings by Martin Bach on his blog. Anyhow, a while back I released a new version of collectl-utils and gave a complete face-lift to one of the utilities, colmux, which is a collectl multiplexor. This tool has the ability to run collectl on multiple systems, which in turn send all their output back to colmux. Colmux then sorts the output on a user-specified column

5 0.17342745 719 high scalability-2009-10-09-Have you collectl'd yet? If not, maybe collectl-utils will make it easier to do so

Introduction: I'm not sure how many people who follow this have even tried collectl but I wanted to let you all know that I just released a set of utilities called, strangely enough, collectl-utils, which you can get at http://collectl-utils.sourceforge.net . One web-based utility called colplot gives you the ability to very easily plot data from multiple systems in a way that makes correlating them over time very easy. Another utility called colmux lets you look at multiple systems in real time. In fact if you go to the page that describes it in more detail you'll see a photo which shows the CPU loads on 192 systems once a second, one set of data per line! In fact the display is so wide it takes 3 large monitors side-by-side to see it all, and even though you can't actually read the displays you can easily see which systems are loaded and which aren't. Anyhow give it a look and let me know what you think. -mark

6 0.14575593 233 high scalability-2008-01-30-How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data

7 0.12468021 1390 high scalability-2013-01-21-Processing 100 Million Pixels a Day - Small Amounts of Contention Cause Big Problems at Scale

8 0.12037378 323 high scalability-2008-05-19-Twitter as a scalability case study

9 0.11759448 237 high scalability-2008-02-03-Product: Collectl - Performance Data Collector

10 0.11425724 30 high scalability-2007-07-26-Product: AWStats a Log Analyzer

11 0.10644431 37 high scalability-2007-07-28-Product: Web Log Storming

12 0.1050081 541 high scalability-2009-03-16-Product: Smart Inspect

13 0.097988546 449 high scalability-2008-11-24-Product: Scribe - Facebook's Scalable Logging System

14 0.089433089 937 high scalability-2010-11-09-Paper: Hyder - Scaling Out without Partitioning

15 0.086451098 1578 high scalability-2014-01-14-Ask HS: Design and Implementation of scalable services?

16 0.081448123 1008 high scalability-2011-03-22-Facebook's New Realtime Analytics System: HBase to Process 20 Billion Events Per Day

17 0.076701649 105 high scalability-2007-10-01-Statistics Logging Scalability

18 0.076154724 821 high scalability-2010-05-03-MocoSpace Architecture - 3 Billion Mobile Page Views a Month

19 0.074839354 1492 high scalability-2013-07-17-How do you create a 100th Monkey software development culture?

20 0.07434047 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.138), (1, 0.047), (2, -0.023), (3, -0.017), (4, 0.019), (5, 0.008), (6, 0.045), (7, 0.036), (8, 0.029), (9, -0.006), (10, -0.009), (11, 0.044), (12, 0.025), (13, -0.04), (14, 0.094), (15, 0.0), (16, 0.035), (17, 0.001), (18, -0.055), (19, -0.0), (20, -0.009), (21, -0.048), (22, -0.053), (23, 0.132), (24, 0.045), (25, -0.044), (26, -0.043), (27, 0.011), (28, -0.056), (29, -0.022), (30, -0.04), (31, -0.102), (32, 0.03), (33, 0.002), (34, -0.039), (35, 0.044), (36, -0.002), (37, -0.04), (38, 0.047), (39, 0.014), (40, 0.011), (41, 0.029), (42, 0.013), (43, -0.009), (44, -0.01), (45, 0.045), (46, 0.02), (47, -0.003), (48, 0.0), (49, -0.022)]
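The simValue numbers in the lists that follow appear to be similarities between documents in this low-dimensional LSI topic space. A hedged sketch of cosine similarity over such dense topic vectors (the second vector is invented for illustration; only the first few weights of this blog's vector are used):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

this_blog = [0.138, 0.047, -0.023, -0.017, 0.019]   # first LSI weights above
other_blog = [0.120, 0.050, -0.030, -0.010, 0.022]  # hypothetical other document
sim = cosine(this_blog, other_blog)
```

A document compared against itself scores 1.0, which is why the same-blog entry in each list sits at (or near) the top.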

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.91502631 553 high scalability-2009-04-03-Collectl interface to Ganglia - any interest?


2 0.8001712 77 high scalability-2007-08-30-Log Everything All the Time

Introduction: This JoelOnSoftware thread asks the age old question of what and how to log. The usual trace/error/warning/info advice is totally useless in a large scale distributed system. Instead, you need to log everything all the time so you can solve problems that have already happened across a potentially huge range of servers. Yes, it can be done. To see why the typical logging approach is broken, imagine this scenario: Your site has been up and running great for weeks. No problems. A foreshadowing beeper goes off at 2AM. It seems some users can no longer add comments to threads. Then you hear the debugging deathknell: it's an intermittent problem and customers are pissed. Fix it. Now. So how are you going to debug this? The monitoring system doesn't show any obvious problems or errors. You quickly post a comment and it works fine. This won't be easy. So you think. Commenting involves a bunch of servers and networks. There's the load balancer, spam filter, web server, database server,

3 0.77556628 1390 high scalability-2013-01-21-Processing 100 Million Pixels a Day - Small Amounts of Contention Cause Big Problems at Scale

Introduction: This is a guest post by Gordon Worley , a Software Engineer at Korrelate , where they correlate (see what they did there) online purchases to offline purchases. Several weeks ago, we came into the office one morning to find every server alarm going off. Pixel log processing was behind by 8 hours and not making headway. Checking the logs, we discovered that a big client had come online during the night and was giving us 10 times more traffic than we were originally told to expect. I wouldn’t say we panicked, but the office was certainly more jittery than usual. Over the next several hours, though, thanks both to foresight and quick thinking, we were able to scale up to handle the added load and clear the backlog to return log processing to a steady state. At Korrelate, we deploy tracking pixels , also known as beacons or web bugs, that our partners use to send us information about their users. These tiny web objects contain no visible content, but may include transparent 1 by 1 gif

4 0.7416749 449 high scalability-2008-11-24-Product: Scribe - Facebook's Scalable Logging System

Introduction: In Log Everything All the Time I advocate applications shouldn't bother logging at all. Why waste all that time and code? No, wait, that's not right. I preach logging everything all the time. Doh. Facebook obviously feels similarly, which is why they open sourced Scribe , their internal logging system, capable of logging 10s of billions of messages per day. These messages include access logs, performance statistics, actions that went to News Feed, and many others. Imagine hundreds of thousands of machines across many geographically dispersed datacenters just aching to send their precious log payload to the central repository of all knowledge. Because really, when you combine all the meta data with all the events you pretty much have a complete picture of your operations. Once in the central repository logs can be scanned, indexed, summarized, aggregated, refactored, diced, data cubed, and mined for every scrap of potentially useful information. Just imagine the log stream from a

5 0.71726733 541 high scalability-2009-03-16-Product: Smart Inspect

Introduction: Smart Inspect has added quite a few features specifically tailored to high scalability and high performance environments to our tool over the years. This includes the ability to log to memory and dump log files on demand (when a crash occurs for example), special backlog queue features, a log service application for central log storage and a lot more. Additionally, our SmartInspect Console (the viewer application) makes viewing, filtering and inspecting large amounts of logging data a lot easier/practical.

6 0.68345195 30 high scalability-2007-07-26-Product: AWStats a Log Analyzer

7 0.67179227 233 high scalability-2008-01-30-How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data

8 0.66291225 237 high scalability-2008-02-03-Product: Collectl - Performance Data Collector

9 0.66116112 558 high scalability-2009-04-06-How do you monitor the performance of your cluster?

10 0.65588707 1498 high scalability-2013-08-07-RAFT - In Search of an Understandable Consensus Algorithm

11 0.6527698 937 high scalability-2010-11-09-Paper: Hyder - Scaling Out without Partitioning

12 0.64904606 719 high scalability-2009-10-09-Have you collectl'd yet? If not, maybe collectl-utils will make it easier to do so

13 0.64794129 1104 high scalability-2011-08-25-Colmux - Finding Memory Leaks, High I-O Wait Times, and Hotness on 3000 Node Clusters

14 0.64523852 45 high scalability-2007-07-30-Product: SmarterStats

15 0.61965001 105 high scalability-2007-10-01-Statistics Logging Scalability

16 0.61696446 304 high scalability-2008-04-19-How to build a real-time analytics system?

17 0.61005622 295 high scalability-2008-04-02-Product: Supervisor - Monitor and Control Your Processes

18 0.6009165 36 high scalability-2007-07-28-Product: Web Log Expert

19 0.60015237 488 high scalability-2009-01-08-file synchronization solutions

20 0.59820771 570 high scalability-2009-04-15-Implementing large scale web analytics


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.174), (2, 0.213), (10, 0.071), (57, 0.246), (61, 0.108), (77, 0.025), (94, 0.057)]
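Unlike the dense LSI vector earlier, the LDA weights above are sparse (topicId, topicWeight) pairs. A sketch of how similarity might be computed over such sparse topic distributions (the second document's weights are hypothetical):

```python
import math

def sparse_cosine(a, b):
    """Cosine similarity between sparse {topic_id: weight} vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# this blog's LDA topic weights, taken from the list above
this_blog = {1: 0.174, 2: 0.213, 10: 0.071, 57: 0.246, 61: 0.108, 77: 0.025, 94: 0.057}
# hypothetical other document sharing topics 2, 57 and 61
other_blog = {2: 0.300, 57: 0.200, 61: 0.150, 99: 0.050}
sim = sparse_cosine(this_blog, other_blog)
```

Only topics present in both documents contribute to the dot product, so two documents with no shared topics score exactly 0.0.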

similar blogs list:

simIndex simValue blogId blogTitle

1 0.95033604 159 high scalability-2007-11-18-Reverse Proxy

Introduction: Hi, I saw a year ago that NetApp sold NetCache to Blue Coat. My site is a heavy NetCache user and we cached 83% of our site. We tested with Blue Coat and F5 WA and we are not getting the same performance as NetCache. Do any of you guys have the same issue? Or does somebody know another product that can handle as much traffic? Thanks Rodrigo

2 0.91928804 1144 high scalability-2011-11-17-Five Misconceptions on Cloud Portability

Introduction: The term "cloud portability" is often considered a synonym for "Cloud API portability," which implies a series of misconceptions. If we break away from dogma, we can find that what we are really looking for in cloud portability is application portability between clouds, which can be a vastly simpler requirement, as we can achieve application portability without settling on a common Cloud API. In this post I'll be covering five common misconceptions people have with respect to cloud portability. Cloud portability = Cloud API portability . API portability is easy; cloud API portability is not. The main incentive for Cloud Portability is - Avoiding Vendor lock-in . Cloud portability is more about business agility than it is about vendor lock-in. Cloud portability isn't for startups . Every startup that is expecting rapid growth should re-examine their deployments and plan for cloud portability rather than wait to be forced to make the switch when they are least prepared to do so.

3 0.9064585 731 high scalability-2009-10-28-Need for change in your IT infrastructure

Introduction: Companies earnings outstrip forecasts , consumer confidence is returning and city bonuses are back . What does this mean for business? Growth! After the recent years of cost cutting in IT budgets, there is the sudden fear induced by increased demand. Pre-existing trouble points in IT infrastructures that have lain dormant will suddenly be exposed. Monthly reporting and real-time analytics will suffer as data grows. IT departments across the land will be crying out “The engine canna take no more captain”. What can be done? What we need is a scalable system that grows with the business. A system that can handle sudden increases in data growth without falling over. There are two core principles to a scalable system: (1) users experience constant QoS as demand grows; (2) system architects can grow system capacity proportionally with the available resources. In other words, if demand increases twofold, it is “enough” to purchase twice the hardware. This is linear growth. Is it e

4 0.89887226 968 high scalability-2011-01-04-Map-Reduce With Ruby Using Hadoop

Introduction: A demonstration, with repeatable steps, of how to quickly fire up a Hadoop cluster on Amazon EC2, load data onto the HDFS (Hadoop Distributed File System), write map-reduce scripts in Ruby and use them to run a map-reduce job on your Hadoop cluster. You will not need to ssh into the cluster, as all tasks are run from your local machine. Below I am using my MacBook Pro as my local machine, but the steps I have provided should be reproducible on other platforms running bash and Java. Fire Up Your Hadoop Cluster I chose the Cloudera distribution of Hadoop, which is still 100% Apache licensed but has some additional benefits. One of these benefits is that it is released by Doug Cutting , who started Hadoop and drove its development at Yahoo! He also started Lucene , which is another of my favourite Apache Projects, so I have good faith that he knows what he is doing. Another benefit, as you will see, is that it is simple to fire up a Hadoop cluster. I am going to use C

5 0.87604475 807 high scalability-2010-04-09-Vagrant - Build and Deploy Virtualized Development Environments Using Ruby

Introduction: One of the cool things we are seeing is more tools and tool chains for performing very high level operations quite simply. Vagrant is such a tool for building and distributing virtualized development environments . Web developers use virtual environments every day with their web applications. From EC2 and Rackspace Cloud to specialized solutions such as EngineYard and Heroku, virtualization is the tool of choice for easy deployment and infrastructure management. Vagrant aims to take those very same principles and put them to work in the heart of the application lifecycle. By providing easy to configure, lightweight, reproducible, and portable virtual machines targeted at development environments, Vagrant helps maximize your productivity and flexibility. If you've created a build and deployment system before Vagrant does a lot of the work for you: Automated virtual machine creation using Oracle’s VirtualBox Automated provisioning of virtual environments

6 0.87174439 433 high scalability-2008-10-29-CTL - Distributed Control Dispatching Framework

7 0.86584735 1211 high scalability-2012-03-19-LinkedIn: Creating a Low Latency Change Data Capture System with Databus

same-blog 8 0.84963995 553 high scalability-2009-04-03-Collectl interface to Ganglia - any interest?

9 0.84821159 218 high scalability-2008-01-17-Moving old to new. Do not be afraid of the re-write -- but take some help

10 0.84135979 1138 high scalability-2011-11-07-10 Core Architecture Pattern Variations for Achieving Scalability

11 0.82811427 6 high scalability-2007-07-11-Friendster Architecture

12 0.80056256 855 high scalability-2010-07-11-So, Why is Twitter Really Not Using Cassandra to Store Tweets?

13 0.79445583 232 high scalability-2008-01-29-When things aren't scalable

14 0.77969623 351 high scalability-2008-07-16-The Mother of All Database Normalization Debates on Coding Horror

15 0.77331322 1087 high scalability-2011-07-26-Web 2.0 Killed the Middleware Star

16 0.75732601 64 high scalability-2007-08-10-How do we make a large real-time search engine?

17 0.75388163 1329 high scalability-2012-09-26-WordPress.com Serves 70,000 req-sec and over 15 Gbit-sec of Traffic using NGINX

18 0.75298655 691 high scalability-2009-08-31-Squarespace Architecture - A Grid Handles Hundreds of Millions of Requests a Month

19 0.75218797 1507 high scalability-2013-08-26-Reddit: Lessons Learned from Mistakes Made Scaling to 1 Billion Pageviews a Month

20 0.75135303 1312 high scalability-2012-08-27-Zoosk - The Engineering behind Real Time Communications