high_scalability high_scalability-2007 high_scalability-2007-13 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Lustre® is a scalable, secure, robust, highly available cluster file system. It is designed, developed, and maintained by Cluster File Systems, Inc. The central goal is the development of a next-generation cluster file system that can serve clusters with tens of thousands of nodes, provide petabytes of storage, and move hundreds of GB/sec with state-of-the-art security and management infrastructure. Lustre runs on many of the largest Linux clusters in the world and is included by CFS's partners as a core component of their cluster offerings (examples include HP StorageWorks SFS, and the Cray XT3 and XD1 supercomputers). Today's users have also demonstrated that Lustre scales down as well as it scales up, and it runs in production on clusters as small as 4 nodes and as large as 25,000. The latest version of Lustre is always available from Cluster File Systems, Inc. Public open source releases of Lustre are available under the GNU General Public License. These releases are found here, and are used in production supercomputing environments worldwide.
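The hundreds-of-GB/sec figure comes from striping each file across many object storage targets (OSTs) so that clients read chunks in parallel. A minimal sketch of the round-robin layout idea, with hypothetical names (this is not Lustre code):

```python
# Illustrative sketch of round-robin file striping across object storage
# targets (OSTs) -- the idea behind Lustre's parallel throughput.
# NOT Lustre's implementation; all names here are hypothetical.

def stripe_chunks(file_size, stripe_size, ost_count):
    """Map each stripe-sized chunk of a file to an OST, round-robin."""
    layout = []
    offset = 0
    index = 0
    while offset < file_size:
        length = min(stripe_size, file_size - offset)
        layout.append({"ost": index % ost_count, "offset": offset, "len": length})
        offset += length
        index += 1
    return layout

# A 10 MB file striped in 4 MB chunks over 4 OSTs lands on OSTs 0, 1, 2;
# reads of different chunks can then proceed against different servers at once.
layout = stripe_chunks(10 * 2**20, 4 * 2**20, 4)
for chunk in layout:
    print(chunk)
```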
sentIndex sentText sentNum sentScore
1 Lustre® is a scalable, secure, robust, highly-available cluster file system. [sent-1, score-0.421]
2 It is designed, developed and maintained by Cluster File Systems, Inc. [sent-2, score-0.181]
3 The central goal is the development of a next-generation cluster file system which can serve clusters with 10,000's of nodes, provide petabytes of storage, and move 100's of GB/sec with state-of-the-art security and management infrastructure. [sent-3, score-1.213]
4 Lustre runs on many of the largest Linux clusters in the world, and is included by CFS's partners as a core component of their cluster offering (examples include HP StorageWorks SFS, and the Cray XT3 and XD1 supercomputers). [sent-4, score-1.15]
5 Today's users have also demonstrated that Lustre scales down as well as it scales up, and runs in production on clusters as small as 4 and as large as 25,000 nodes. [sent-5, score-0.882]
6 The latest version of Lustre is always available from Cluster File Systems, Inc. [sent-6, score-0.257]
7 Public Open Source releases of Lustre are available under the GNU General Public License. [sent-7, score-0.322]
8 These releases are found here, and are used in production supercomputing environments worldwide. [sent-8, score-0.647]
wordName wordTfidf (topN-words)
[('lustre', 0.611), ('releases', 0.24), ('cluster', 0.232), ('clusters', 0.217), ('cfs', 0.192), ('file', 0.189), ('cray', 0.181), ('gnu', 0.181), ('supercomputing', 0.166), ('public', 0.145), ('supercomputers', 0.139), ('scales', 0.136), ('demonstrated', 0.126), ('runs', 0.125), ('partners', 0.121), ('included', 0.11), ('hp', 0.11), ('maintained', 0.109), ('production', 0.101), ('petabytes', 0.099), ('secure', 0.094), ('environments', 0.089), ('offering', 0.086), ('robust', 0.085), ('central', 0.084), ('available', 0.082), ('component', 0.077), ('examples', 0.077), ('latest', 0.074), ('developed', 0.072), ('serve', 0.071), ('largest', 0.069), ('goal', 0.067), ('security', 0.066), ('general', 0.063), ('include', 0.062), ('version', 0.06), ('linux', 0.059), ('today', 0.056), ('systems', 0.055), ('nodes', 0.053), ('move', 0.052), ('core', 0.051), ('found', 0.051), ('designed', 0.051), ('provide', 0.047), ('development', 0.045), ('management', 0.044), ('small', 0.041), ('always', 0.041)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 13 high scalability-2007-07-15-Lustre cluster file system
Introduction: Much of the focus of high performance computing (HPC) has centered on CPU performance. However, as computing requirements grow, HPC clusters are demanding higher rates of aggregate data throughput. Today's clusters feature larger numbers of nodes with increased compute speeds. The higher clock rates and operations per clock cycle create increased demand for local data on each node. In addition, InfiniBand and other high-speed, low-latency interconnects increase the data throughput available to each node. Traditional shared file systems such as NFS have not been able to scale to meet this growing demand for data throughput on HPC clusters. Scalable cluster file systems that can provide parallel data access to hundreds of nodes and petabytes of storage are needed to provide the high data throughput required by large HPC applications, including manufacturing, electronic design, and research. This paper describes an implementation of the Sun Lustre file system as a scalable storage
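The scaling argument above reduces to simple arithmetic: aggregate throughput grows with the number of storage servers until the interconnect fabric becomes the bottleneck. A back-of-the-envelope sketch with illustrative figures (not Sun benchmark numbers):

```python
def aggregate_throughput(server_count, per_server_mb_s, fabric_limit_mb_s):
    """Aggregate read throughput is capped by the slower of two limits:
    the sum of per-server rates, and the interconnect's total capacity."""
    return min(server_count * per_server_mb_s, fabric_limit_mb_s)

# 100 storage servers at an assumed 400 MB/s each, on a fabric that
# tops out at 30 GB/s: the fabric, not the servers, is the cap.
print(aggregate_throughput(100, 400, 30_000))
# With only 10 servers, the servers themselves are the cap.
print(aggregate_throughput(10, 400, 30_000))
```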
Introduction: Todd had originally posted an entry on collectl here at Collectl - Performance Data Collector. Collectl collects real-time data from a large number of subsystems like buddyinfo, cpu, disk, inodes, infiniband, lustre, memory, network, nfs, processes, quadrics, slabs, sockets and tcp, all using one tool and in one consistent format. Since then a lot has happened. It's now part of both the Fedora and Debian distros, not to mention several others. There has also been a pretty good summary written up by Joe Brockmeier. It's also pretty well documented (I like to think) on sourceforge. There have also been a few blog postings by Martin Bach on his blog. Anyhow, a while back I released a new version of collectl-utils and gave a complete face-lift to one of the utilities, colmux, which is a collectl multiplexor. This tool has the ability to run collectl on multiple systems, which in turn send all their output back to colmux. Colmux then sorts the output on a user-specified column
4 0.15688354 237 high scalability-2008-02-03-Product: Collectl - Performance Data Collector
Introduction: From their website: There are a number of times when you find yourself needing performance data. These can include benchmarking, monitoring a system's general health, or trying to determine what your system was doing at some time in the past. Sometimes you just want to know what the system is doing right now. Depending on what you're doing, you often end up using different tools, each designed for that specific situation. Features include: You are able to run with non-integral sampling intervals. Collectl uses very little CPU. In fact, it has been measured to use <0.1% when run as a daemon using the default sampling interval of 60 seconds for process and slab data and 10 seconds for everything else. Brief, verbose, and plot formats are supported. You can report aggregated performance numbers on many devices such as CPUs, disks, interconnects such as InfiniBand or Quadrics, networks, or even Lustre file systems. Collectl will align its sampling on integral sec
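The non-integral sampling intervals mentioned above can be illustrated with a drift-free sampler loop; this is a toy sketch, not collectl's actual (Perl) implementation:

```python
import time

def sample_loop(interval_s, samples, read_metric):
    """Take `samples` readings every `interval_s` seconds, scheduling each
    tick from the start time so rounding errors don't accumulate (drift-free).
    Non-integral intervals like 0.1 s work the same as integral ones."""
    start = time.monotonic()
    readings = []
    for i in range(samples):
        readings.append(read_metric())
        next_tick = start + (i + 1) * interval_s
        delay = next_tick - time.monotonic()
        if delay > 0:
            time.sleep(delay)
    return readings

# Toy metric: sample a counter 3 times at a non-integral 0.1 s interval.
counter = iter(range(100))
print(sample_loop(0.1, 3, lambda: next(counter)))
```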
5 0.15371439 420 high scalability-2008-10-15-Tokyo Tech Tsubame Grid Storage Implementation
Introduction: This Sun BluePrint article describes the storage architecture of the Tokyo Institute of Technology TSUBAME grid. The Tokyo Institute of Technology is one of the world's leading technical institutes, and recently created the fastest supercomputer in Asia and one of the largest supercomputers outside of the United States. By deploying Sun Fire x64 servers and data servers in a grid architecture, Tokyo Tech built a cost-effective and flexible supercomputer consisting of hundreds of systems, thousands of processors, terabytes of memory, and a petabyte of storage that supports users running common off-the-shelf applications. This is the second of a three-article series. It describes the steps to install and configure the Lustre file system within the storage architecture.
6 0.12025353 254 high scalability-2008-02-19-Hadoop Getting Closer to 1.0 Release
7 0.11520797 889 high scalability-2010-08-30-Pomegranate - Storing Billions and Billions of Tiny Little Files
8 0.11327661 101 high scalability-2007-09-27-Product: Ganglia Monitoring System
9 0.098795086 1075 high scalability-2011-07-07-Myth: Google Uses Server Farms So You Should Too - Resurrection of the Big-Ass Machines
10 0.095782928 263 high scalability-2008-02-27-Product: System Imager - Automate Deployment and Installs
11 0.084816858 50 high scalability-2007-07-31-BerkeleyDB & other distributed high performance key-value databases
12 0.081122756 601 high scalability-2009-05-17-Product: Hadoop
13 0.080583476 414 high scalability-2008-10-15-Hadoop - A Primer
14 0.080086142 1279 high scalability-2012-07-09-Data Replication in NoSQL Databases
15 0.077832446 1642 high scalability-2014-05-02-Stuff The Internet Says On Scalability For May 2nd, 2014
16 0.075421102 161 high scalability-2007-11-20-Product: SmartFrog a Distributed Configuration and Deployment Framework
17 0.07533212 243 high scalability-2008-02-07-clusteradmin.blogspot.com - blog about building and administering clusters
18 0.073814407 1155 high scalability-2011-12-12-Netflix: Developing, Deploying, and Supporting Software According to the Way of the Cloud
20 0.072632387 1042 high scalability-2011-05-17-Facebook: An Example Canonical Architecture for Scaling Billions of Messages
topicId topicWeight
[(0, 0.097), (1, 0.027), (2, 0.018), (3, 0.007), (4, -0.013), (5, 0.021), (6, 0.073), (7, -0.056), (8, -0.011), (9, 0.096), (10, 0.012), (11, 0.006), (12, 0.076), (13, -0.019), (14, 0.039), (15, 0.045), (16, -0.025), (17, 0.011), (18, -0.058), (19, 0.029), (20, 0.063), (21, -0.007), (22, 0.005), (23, 0.022), (24, -0.047), (25, 0.018), (26, 0.036), (27, -0.023), (28, -0.072), (29, -0.013), (30, -0.031), (31, -0.001), (32, 0.008), (33, -0.009), (34, 0.003), (35, 0.056), (36, -0.024), (37, -0.094), (38, -0.016), (39, 0.02), (40, -0.032), (41, -0.052), (42, -0.025), (43, 0.059), (44, -0.051), (45, 0.148), (46, -0.049), (47, -0.018), (48, -0.002), (49, 0.069)]
simIndex simValue blogId blogTitle
same-blog 1 0.97618699 13 high scalability-2007-07-15-Lustre cluster file system
2 0.71617705 272 high scalability-2008-03-08-Product: FAI - Fully Automatic Installation
Introduction: From their website: FAI is an automated installation tool to install or deploy Debian GNU/Linux and other distributions on a bunch of different hosts or a cluster. It's more flexible than other tools like Kickstart for Red Hat, AutoYaST and ALICE for SuSE, or JumpStart for Sun Solaris. FAI can also be used for configuration management of a running system. You can take one or more virgin PCs, turn on the power, and after a few minutes Linux is installed, configured, and running on all your machines, without any interaction necessary. FAI is a scalable method for installing and updating all your computers unattended with little effort involved. It's a centralized management system for your Linux deployment. FAI's target group is system administrators who have to install Linux onto one or even hundreds of computers. It's not only a tool for doing a cluster installation but a general purpose installation tool. It can be used for installing a Beowulf cluster, a rendering farm,
3 0.68335235 114 high scalability-2007-10-07-Product: Wackamole
Introduction: Wackamole is an application that helps make a cluster highly available. It manages a set of virtual IPs that should be available to the outside world at all times. Wackamole ensures that a single machine within a cluster is listening on each virtual IP address that Wackamole manages. If it discovers that particular machines within the cluster are not alive, it will almost immediately ensure that other machines acquire these public IPs. At no time will more than one machine listen on any virtual IP. Wackamole also works toward achieving a balanced distribution of the IPs across the machines within the cluster it manages. There is no other software like Wackamole. Wackamole is quite unique in that it operates in a completely peer-to-peer mode within the cluster. Other products that provide the same high-availability guarantees use a "VIP" method. Wackamole is an application that runs as root in a cluster to make it highly available. It uses the membership notifications prov
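The invariant described above (each virtual IP held by exactly one live machine, spread evenly across the cluster) can be sketched as a deterministic assignment that every peer computes identically from the shared membership view. This is a hypothetical illustration, not Wackamole's actual Spread-based protocol:

```python
def assign_vips(vips, live_nodes):
    """Assign each virtual IP to exactly one live node, round-robin over
    the sorted membership, so every peer independently computes the same
    answer. Mirrors the invariant Wackamole enforces, not its protocol."""
    nodes = sorted(live_nodes)
    return {vip: nodes[i % len(nodes)] for i, vip in enumerate(sorted(vips))}

vips = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
print(assign_vips(vips, ["nodeA", "nodeB"]))
# When nodeB dies, the next membership notification reassigns its VIPs
# to the survivors -- still exactly one owner per VIP:
print(assign_vips(vips, ["nodeA"]))
```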
5 0.6714257 263 high scalability-2008-02-27-Product: System Imager - Automate Deployment and Installs
Introduction: From their website: SystemImager is software that makes the installation of Linux to masses of similar machines relatively easy. It makes software distribution, configuration, and operating system updates easy, and can also be used for content distribution. SystemImager makes it easy to do automated installs (clones), software distribution, content or data distribution, configuration changes, and operating system updates to your network of Linux machines. You can even update from one Linux release version to another! It can also be used to ensure safe production deployments. By saving your current production image before updating to your new production image, you have a highly reliable contingency mechanism. If the new production environment is found to be flawed, simply roll back to the last production image with a simple update command! Some typical environments include: Internet server farms, database server farms, high performance clusters, computer labs, and corporate
6 0.66486651 278 high scalability-2008-03-16-Product: GlusterFS
7 0.62539548 237 high scalability-2008-02-03-Product: Collectl - Performance Data Collector
8 0.62280202 25 high scalability-2007-07-25-Paper: Designing Disaster Tolerant High Availability Clusters
10 0.57517731 103 high scalability-2007-09-28-Kosmos File System (KFS) is a New High End Google File System Option
11 0.56538022 326 high scalability-2008-05-25-Product: Condor - Compute Intensive Workload Management
12 0.55981261 1442 high scalability-2013-04-17-Tachyon - Fault Tolerant Distributed File System with 300 Times Higher Throughput than HDFS
13 0.55946469 889 high scalability-2010-08-30-Pomegranate - Storing Billions and Billions of Tiny Little Files
14 0.55682707 101 high scalability-2007-09-27-Product: Ganglia Monitoring System
16 0.55051225 488 high scalability-2009-01-08-file synchronization solutions
17 0.54548383 254 high scalability-2008-02-19-Hadoop Getting Closer to 1.0 Release
18 0.54464263 112 high scalability-2007-10-04-You Can Now Store All Your Stuff on Your Own Google Like File System
19 0.54113179 433 high scalability-2008-10-29-CTL - Distributed Control Dispatching Framework
20 0.54044747 20 high scalability-2007-07-16-Paper: The Clustered Storage Revolution
topicId topicWeight
[(1, 0.12), (2, 0.208), (30, 0.03), (75, 0.271), (79, 0.136), (94, 0.097)]
simIndex simValue blogId blogTitle
1 0.92032146 1320 high scalability-2012-09-11-How big is a Petabyte, Exabyte, Zettabyte, or a Yottabyte?
Introduction: This is an intuitive look at large data sizes by Julian Bunn in Globally Interconnected Object Databases.
Bytes (8 bits): 0.1 bytes: A binary decision. 1 byte: A single character. 10 bytes: A single word. 100 bytes: A telegram OR a punched card.
Kilobyte (1,000 bytes): 1 Kilobyte: A very short story. 2 Kilobytes: A typewritten page. 10 Kilobytes: An encyclopaedic page OR a deck of punched cards. 50 Kilobytes: A compressed document image page. 100 Kilobytes: A low-resolution photograph. 200 Kilobytes: A box of punched cards. 500 Kilobytes: A very heavy box of punched cards.
Megabyte (1,000,000 bytes): 1 Megabyte: A small novel OR a 3.5 inch floppy disk. 2 Megabytes: A high resolution photograph. 5 Megabytes: The complete works of Shakespeare OR 30 seconds of TV-quality video. 10 Megabytes: A minute of high-fidelity sound OR a digital chest X-ray. 20 Megabytes: A box of floppy disks. 50 Megabytes: A digital mammogram. 100 Megabyte
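Each prefix step in the list above is a factor of 1,000, so these sizes are easy to compute directly; a quick sketch using the decimal (SI) prefixes the list uses:

```python
# Decimal (SI) data-size prefixes, as used in the list above: each step
# is a factor of 1000. (Binary prefixes like KiB use 1024 instead.)
PREFIXES = ["byte", "kilobyte", "megabyte", "gigabyte",
            "terabyte", "petabyte", "exabyte", "zettabyte", "yottabyte"]

def size_in_bytes(count, prefix):
    return count * 1000 ** PREFIXES.index(prefix)

print(size_in_bytes(1, "petabyte"))  # 1_000_000_000_000_000 bytes
# A petabyte is a million gigabytes:
print(size_in_bytes(1, "petabyte") // size_in_bytes(1, "gigabyte"))  # 1000000
```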
same-blog 2 0.8585645 13 high scalability-2007-07-15-Lustre cluster file system
3 0.83116275 864 high scalability-2010-07-24-4 New Podcasts for Scalable Summertime Reading
Introduction: It's trendy today to say "I don't read blogs anymore, I just let the random chance of my social network guide me to new and interesting content." #fail. While someone says this I imagine them flicking their hair back in a "I can't be bothered with true understanding" disdain. And where does random chance get its content? From people like these. So: support your local blog! If you would like to be a part of random chance, here are a few new podcasts/blogs/vidcasts that you may not know about and that I've found interesting: DevOps Cafe. With this new video series, where John and Damon visit high performing companies and record an insider's tour of the tools and processes those companies are using to solve their DevOps problems, DevOps is a profession that finally seems to be realizing its own value. In the first episode John Paul Ramirez takes the crew on a tour of Shopzilla's application lifecycle metrics and dashboard. The second episode features John Allspaw, VP of
4 0.77617317 1173 high scalability-2012-01-12-Peregrine - A Map Reduce Framework for Iterative and Pipelined Jobs
Introduction: The Peregrine falcon is a bird of prey, famous for its high speed diving attacks, feeding primarily on much slower Hadoops. Wait, sorry, it is Kevin Burton of Spinn3r's new Peregrine project -- a new FAST modern map reduce framework optimized for iterative and pipelined map reduce jobs -- that feeds on Hadoops. If you don't know Kevin, he does a lot of excellent technical work that he's kind enough to share on his blog. Only he hasn't been blogging much lately; he's been heads down working on Peregrine. Now that Peregrine has been released, here's a short email interview with Kevin on why you might want to take up falconry, the ancient sport of MapReduce. What does Spinn3r do that makes Peregrine important to you? Ideally it was designed to execute pagerank, but many iterative applications that we deploy and WANT to deploy (k-means) would be horribly inefficient under Hadoop as it doesn't have any support for merging and joining IO between tasks. It also doesn't support
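The iterative pattern Peregrine targets, where one pass's reduce output feeds the next pass's map without a detour through the distributed file system, can be modeled in miniature. This toy in-memory sketch is not Peregrine's API; it just shows the chained map/reduce structure using a small pagerank-style job:

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """One in-memory map/reduce pass: map each record to (key, value)
    pairs, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}

# Toy iterative job in the pagerank family: each node splits its rank
# among its outgoing links; the reduce output feeds the next map directly,
# with no intermediate serialization between iterations.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}

def mapper(item):
    node, rank = item
    return [(dst, rank / len(links[node])) for dst in links[node]]

def reducer(key, values):
    return sum(values)

ranks = {"a": 1.0, "b": 1.0, "c": 1.0}
for _ in range(20):
    ranks = map_reduce(ranks.items(), mapper, reducer)
print({n: round(r, 2) for n, r in sorted(ranks.items())})
```

Total rank is conserved across iterations, and the values settle toward the stationary distribution of the link graph.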
5 0.77132303 791 high scalability-2010-03-09-Sponsored Post: Job Openings - Squarespace
Introduction: Squarespace Looking for Full-time Scaling Expert. Interested in helping a cutting-edge, high-growth startup scale? Squarespace, which was profiled here last year in Squarespace Architecture - A Grid Handles Hundreds of Millions of Requests a Month and also hosts this blog, is currently in the market for a crack scalability engineer to help build out its cloud infrastructure. Squarespace is very excited about finding a full-time scaling expert. Interested applicants should go to http://www.squarespace.com/jobs-software-engineer for more information. If you would like to advertise your critical, hard-to-fill job openings on HighScalability, please contact us and we'll get it set up for you.
6 0.76378495 1552 high scalability-2013-11-22-Stuff The Internet Says On Scalability For November 22th, 2013
7 0.75000226 1544 high scalability-2013-11-07-Paper: Tempest: Scalable Time-Critical Web Services Platform
8 0.72831047 1649 high scalability-2014-05-16-Stuff The Internet Says On Scalability For May 16th, 2014
9 0.72090948 312 high scalability-2008-04-30-Rather small site architecture.
10 0.71916789 781 high scalability-2010-02-23-Sponsored Post: Job Openings - Squarespace
11 0.71576929 1309 high scalability-2012-08-22-Cloud Deployment: It’s All About Cloud Automation
12 0.71081704 303 high scalability-2008-04-18-Scaling Mania at MySQL Conference 2008
13 0.71068627 1516 high scalability-2013-09-13-Stuff The Internet Says On Scalability For September 13, 2013
14 0.70882607 229 high scalability-2008-01-29-Building scalable storage into application - Instead of MogileFS OpenAFS etc.
15 0.70517159 964 high scalability-2010-12-28-Netflix: Continually Test by Failing Servers with Chaos Monkey
16 0.70372903 15 high scalability-2007-07-16-Blog: MySQL Performance Blog - Everything about MySQL Performance.
17 0.70161557 266 high scalability-2008-03-04-Manage Downtime Risk by Connecting Multiple Data Centers into a Secure Virtual LAN
18 0.7012912 863 high scalability-2010-07-22-How can we spark the movement of research out of the Ivory Tower and into production?
19 0.70091128 1612 high scalability-2014-03-14-Stuff The Internet Says On Scalability For March 14th, 2014
20 0.70042193 645 high scalability-2009-06-30-Hot New Trend: Linking Clouds Through Cheap IP VPNs Instead of Private Lines