366 hunch net-2009-08-03-Carbon in Computer Science Research


Introduction: Al Gore’s film and gradually more assertive and thorough science have managed to mostly shift the debate on climate change from “Is it happening?” to “What should be done?” In that context, it’s worthwhile to think a bit about what can be done within computer science research. There are two things we can think about: Doing Research At a cartoon level, computer science research consists of some combination of commuting to and from work, writing programs, running them on computers, writing papers, and presenting them at conferences. A typical computer has a power usage on the order of 100 Watts, which works out to 2.4 kilowatt-hours/day. Looking up David MacKay’s reference on power usage per person, it becomes clear that this is a relatively minor part of the lifestyle, although it could become substantial if many more computers are required. Much larger costs are associated with commuting (which is in common with many people) and attending conferences. Since local commuti
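The 100 Watt figure converts to daily energy with simple arithmetic. The sketch below reproduces the 2.4 kilowatt-hours/day number from the post, assuming the machine draws constant power around the clock; the function name and the `hours_per_day` parameter are illustrative, not from the original.

```python
def daily_energy_kwh(power_watts: float, hours_per_day: float = 24.0) -> float:
    """Energy per day in kilowatt-hours for a device drawing constant power.

    Assumption: the draw is constant for `hours_per_day` hours each day.
    """
    watt_hours = power_watts * hours_per_day  # energy in watt-hours
    return watt_hours / 1000.0                # convert Wh to kWh


if __name__ == "__main__":
    # The post's "typical computer" at roughly 100 Watts, left on all day.
    print(daily_energy_kwh(100.0))
    # The same machine switched on for only 8 hours a day.
    print(daily_energy_kwh(100.0, hours_per_day=8.0))
```

A machine that is switched off outside working hours would use proportionally less, which is why the second call passes a shorter day.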


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 There are two things we can think about: Doing Research At a cartoon level, computer science research consists of some combination of commuting to and from work, writing programs, running them on computers, writing papers, and presenting them at conferences. [sent-4, score-0.924]

2 A typical computer has a power usage on the order of 100 Watts, which works out to 2.4 kilowatt-hours/day. [sent-5, score-0.446]

3 Looking up David MacKay’s reference on power usage per person, it becomes clear that this is a relatively minor part of the lifestyle, although it could become substantial if many more computers are required. [sent-7, score-0.355]

4 Much larger costs are associated with commuting (which is in common with many people) and attending conferences. [sent-8, score-0.419]

5 Since local commuting is common across many people, and there are known approaches (typically public transportation) for more efficient commuting, I expect researchers can piggyback on improvements in public transportation to reduce commuting costs. [sent-9, score-1.52]

6 In fact, the situation for researchers may be better in general, as the nature of the job may make commuting avoidable, at least on some days. [sent-10, score-0.419]

7 Presenting at conferences is the remaining problem area, essentially due to travel by airplane to and from a conference. [sent-11, score-0.499]

8 Travel by airplane has an energy cost similar to travel by car over the same distance, but we typically take airplanes for very long distances. [sent-12, score-0.813]

9 Unlike cars, typical airplane usage requires stored energy in a dense form. [sent-13, score-0.728]

10 For example, there are no serious proposals I’m aware of for battery-powered airplanes, because all existing rechargeable batteries have an energy density around 1/10th that of hydrocarbon fuel (which makes sense given that about 3/4 of the mass for a hydrocarbon fire is oxygen in the air). [sent-14, score-0.445]

11 This suggests airplane transport may be particularly difficult to adapt towards low or zero carbon usage. [sent-15, score-0.53]

12 If these aren’t developed, it seems we should expect fewer conferences, more regional conferences, Europe with its extensive fast train network to be less impacted, and more serious effort towards distributed conferences. [sent-18, score-0.379]

13 For the last, it’s easy to imagine with existing technology having simultaneous regional conferences which are mutually videoconferenced, and we aren’t far from being able to handle a fully interactive videobroadcast amongst an indefinitely large number of participants. [sent-19, score-0.529]

14 As a corollary of fewer conferences, other interactive mechanisms (for example research blogs) seem likely to grow. [sent-20, score-0.286]

15 Research Topics The keyword for research topics is efficiency, and it is not a trivial concern on a global scale. [sent-21, score-0.448]

16 In computer science, there have been a few algorithms (such as quicksort and hashing) developed which substantially and broadly improved real-world efficiency, but the real driver of efficiency so far is hardware development, which has phenomenally improved efficiency for several decades. [sent-22, score-1.202]

17 Many of the efficiency improvements are sure to remain hardware based, but software is becoming an essential component. [sent-23, score-0.479]

18 One basic observation about efficient algorithms is that for problems admitting an efficient parallel solution (counting is a great example), the parallel algorithm is generally more efficient, because energy use is typically superlinear in clock speed. [sent-24, score-1.011]

19 As an extreme example, the human brain, which is deeply optimized by evolution for energy efficiency, typically runs at 100Hz or 100KHz. [sent-25, score-0.6]

20 Although efficiency suggests parallel algorithms, this should not be done blindly. [sent-26, score-0.617]
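Sentence 18 above argues that a parallel algorithm is generally more energy efficient because energy use is superlinear in clock speed. A toy model makes the reasoning concrete. It rests on assumptions that are not in the post: per-core power scales as `c * f**alpha` with an illustrative exponent `alpha = 3`, the work splits perfectly across cores, and static power and communication costs are ignored. The names `energy`, `alpha`, and `c` are hypothetical.

```python
def energy(work_cycles: float, clock_hz: float, cores: int = 1,
           alpha: float = 3.0, c: float = 1.0) -> float:
    """Energy (arbitrary units) to finish `work_cycles` cycles of work.

    Assumptions (illustrative only):
      * each core draws power c * clock_hz**alpha, with alpha > 1 modelling
        the "superlinear in clock speed" behaviour;
      * the work divides evenly across cores with no overhead.
    """
    per_core_power = c * clock_hz ** alpha
    runtime = work_cycles / (cores * clock_hz)  # seconds until all work is done
    return cores * per_core_power * runtime


if __name__ == "__main__":
    work = 1e9
    serial = energy(work, clock_hz=2e9, cores=1)
    # Four cores at a quarter of the clock speed finish in the same wall time.
    parallel = energy(work, clock_hz=0.5e9, cores=4)
    print("parallel / serial energy:", parallel / serial)
```

Under these assumptions the ratio works out to `cores ** (1 - alpha)`, so the advantage vanishes when `alpha = 1` (power linear in clock speed) and shrinks once parallelization overhead is counted, which fits sentence 20's warning that this should not be done blindly.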


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('commuting', 0.419), ('efficiency', 0.307), ('airplane', 0.248), ('energy', 0.186), ('efficient', 0.179), ('usage', 0.172), ('parallel', 0.149), ('airplanes', 0.14), ('hydrocarbon', 0.14), ('transportation', 0.14), ('travel', 0.132), ('conferences', 0.119), ('computer', 0.11), ('regional', 0.108), ('typically', 0.107), ('science', 0.105), ('power', 0.104), ('hardware', 0.103), ('suggests', 0.098), ('far', 0.098), ('approaches', 0.097), ('presenting', 0.086), ('interactive', 0.08), ('computers', 0.079), ('fewer', 0.079), ('developed', 0.075), ('topics', 0.073), ('expect', 0.071), ('improved', 0.07), ('improvements', 0.069), ('research', 0.068), ('programming', 0.068), ('writing', 0.068), ('done', 0.063), ('public', 0.063), ('lifestyle', 0.062), ('dense', 0.062), ('mutually', 0.062), ('electricity', 0.062), ('carbon', 0.062), ('hydrogen', 0.062), ('quicksort', 0.062), ('indefinitely', 0.062), ('transport', 0.062), ('mahout', 0.062), ('superlinear', 0.062), ('serious', 0.061), ('typical', 0.06), ('towards', 0.06), ('example', 0.059)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 366 hunch net-2009-08-03-Carbon in Computer Science Research

2 0.1259231 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

Introduction: Muthu invited me to the workshop on algorithms in the field, with the goal of providing a sense of where near-term research should go. When the time came though, I bargained for a post instead, which provides a chance for many other people to comment. There are several things I didn’t fully understand when I went to Yahoo! about 5 years ago. I’d like to repeat them as people in academia may not yet understand them intuitively. Almost all the big impact algorithms operate in pseudo-linear or better time. Think about caching, hashing, sorting, filtering, etc… and you have a sense of what some of the most heavily used algorithms are. This matters quite a bit to Machine Learning research, because people often work with superlinear time algorithms and languages. Two very common examples of this are graphical models, where inference is often a superlinear operation: think about the n^2 dependence on the number of states in a Hidden Markov Model and Kernelized Support Vecto

3 0.11797839 295 hunch net-2008-04-12-It Doesn’t Stop

Introduction: I’ve enjoyed the Terminator movies and show. Neglecting the whacky aspects (time travel and associated paradoxes), there is an enduring topic of discussion: how do people deal with intelligent machines (and vice versa)? In Terminator-land, the primary method for dealing with intelligent machines is to prevent them from being made. This approach works pretty badly, because a new angle on building an intelligent machine keeps coming up. This is partly a ploy for writers to avoid writing themselves out of a job, but there is a fundamental truth to it as well: preventing progress in research is hard. The United States has been experimenting with trying to stop research on stem cells. It hasn’t worked very well: the net effect has been retarding research programs a bit, and exporting some research to other countries. Another less recent example was encryption technology, for which the United States generally did not encourage early public research and even discouraged as a mu

4 0.11360949 282 hunch net-2008-01-06-Research Political Issues

Introduction: I’ve avoided discussing politics here, although not for lack of interest. The problem with discussing politics is that it’s customary for people to say much based upon little information. Nevertheless, politics can have a substantial impact on science (and we might hope for the vice-versa). It’s primary election time in the United States, so the topic is timely, although the issues are not. There are several policy decisions which substantially affect development of science and technology in the US. Education The US has great contrasts in education. The top universities are very good places, yet the grade school education system produces mediocre results. For me, the contrast between a public education and Caltech was bracing. For many others attending Caltech, it clearly was not. Upgrading the k-12 education system in the US is a long-standing chronic problem which I know relatively little about. My own experience is that a basic attitude of “no child unrealized” i

5 0.11270771 346 hunch net-2009-03-18-Parallel ML primitives

Introduction: Previously, we discussed parallel machine learning a bit. As parallel ML is rather difficult, I’d like to describe my thinking at the moment, and ask for advice from the rest of the world. This is particularly relevant right now, as I’m attending a workshop tomorrow on parallel ML. Parallelizing slow algorithms seems uncompelling. Parallelizing many algorithms also seems uncompelling, because the effort required to parallelize is substantial. This leaves the question: Which one fast algorithm is the best to parallelize? What is a substantially different second? One compellingly fast simple algorithm is online gradient descent on a linear representation. This is the core of Leon’s sgd code and Vowpal Wabbit. Antoine Bordes showed a variant was competitive in the large scale learning challenge. It’s also a decades old primitive which has been reused in many algorithms, and continues to be reused. It also applies to online learning rather than just online optimiz

6 0.1124887 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

7 0.10683773 404 hunch net-2010-08-20-The Workshop on Cores, Clusters, and Clouds

8 0.10669318 228 hunch net-2007-01-15-The Machine Learning Department

9 0.10059794 426 hunch net-2011-03-19-The Ideal Large Scale Learning Class

10 0.10045073 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning

11 0.099509269 451 hunch net-2011-12-13-Vowpal Wabbit version 6.1 & the NIPS tutorial

12 0.097807012 378 hunch net-2009-11-15-The Other Online Learning

13 0.094605431 120 hunch net-2005-10-10-Predictive Search is Coming

14 0.094199874 296 hunch net-2008-04-21-The Science 2.0 article

15 0.090221554 229 hunch net-2007-01-26-Parallel Machine Learning Problems

16 0.089946263 344 hunch net-2009-02-22-Effective Research Funding

17 0.088299595 134 hunch net-2005-12-01-The Webscience Future

18 0.087746292 293 hunch net-2008-03-23-Interactive Machine Learning

19 0.087727301 193 hunch net-2006-07-09-The Stock Prediction Machine Learning Problem

20 0.087652594 64 hunch net-2005-04-28-Science Fiction and Research


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.219), (1, -0.012), (2, -0.118), (3, 0.081), (4, -0.01), (5, 0.014), (6, -0.036), (7, 0.031), (8, -0.036), (9, 0.123), (10, -0.005), (11, -0.058), (12, 0.015), (13, 0.002), (14, -0.031), (15, -0.071), (16, 0.04), (17, 0.003), (18, 0.031), (19, -0.028), (20, 0.041), (21, -0.018), (22, -0.06), (23, 0.137), (24, -0.034), (25, -0.041), (26, 0.057), (27, 0.035), (28, 0.041), (29, 0.025), (30, -0.01), (31, 0.038), (32, 0.004), (33, 0.017), (34, 0.084), (35, 0.02), (36, -0.052), (37, -0.015), (38, -0.047), (39, 0.036), (40, 0.058), (41, -0.034), (42, 0.021), (43, 0.071), (44, -0.076), (45, -0.054), (46, -0.015), (47, -0.017), (48, 0.005), (49, -0.013)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96784413 366 hunch net-2009-08-03-Carbon in Computer Science Research

2 0.66693145 229 hunch net-2007-01-26-Parallel Machine Learning Problems

Introduction: Parallel machine learning is a subject rarely addressed at machine learning conferences. Nevertheless, it seems likely to increase in importance because: Data set sizes appear to be growing substantially faster than computation. Essentially, this happens because more and more sensors of various sorts are being hooked up to the internet. Serial speedups of processors seem relatively stalled. The new trend is to make processors more powerful by making them multicore. Both AMD and Intel are making dual core designs standard, with plans for more parallelism in the future. IBM’s Cell processor has (essentially) 9 cores. Modern graphics chips can have an order of magnitude more separate execution units. The meaning of ‘core’ varies a bit from processor to processor, but the overall trend seems quite clear. So, how do we parallelize machine learning algorithms? The simplest and most common technique is to simply run the same learning algorithm with di

3 0.66691279 282 hunch net-2008-01-06-Research Political Issues

Introduction: I’ve avoided discussing politics here, although not for lack of interest. The problem with discussing politics is that it’s customary for people to say much based upon little information. Nevertheless, politics can have a substantial impact on science (and we might hope for the vice-versa). It’s primary election time in the United States, so the topic is timely, although the issues are not. There are several policy decisions which substantially affect development of science and technology in the US. Education The US has great contrasts in education. The top universities are very good places, yet the grade school education system produces mediocre results. For me, the contrast between a public education and Caltech was bracing. For many others attending Caltech, it clearly was not. Upgrading the k-12 education system in the US is a long-standing chronic problem which I know relatively little about. My own experience is that a basic attitude of “no child unrealized” i

4 0.65707809 106 hunch net-2005-09-04-Science in the Government

Introduction: I found the article on “Political Science” at the New York Times interesting. Essentially the article is about allegations that the US government has been systematically distorting scientific views. With a petition by some 7000+ scientists alleging such behavior this is clearly a significant concern. One thing not mentioned explicitly in this discussion is that there are fundamental cultural differences between academic research and the rest of the world. In academic research, careful, clear thought is valued. This value is achieved by both formal and informal mechanisms. One example of a formal mechanism is peer review. In contrast, in the land of politics, the basic value is agreement. It is only with some amount of agreement that a new law can be passed or other actions can be taken. Since Science (with a capital ‘S’) has accomplished many things, it can be a significant tool in persuading people. This makes it compelling for a politician to use science as a mec

5 0.65181971 128 hunch net-2005-11-05-The design of a computing cluster

Introduction: This is about the design of a computing cluster from the viewpoint of applied machine learning using current technology. We just built a small one at TTI so this is some evidence of what is feasible and thoughts about the design choices. Architecture There are several architectural choices. AMD Athlon64 based system. This seems to have the cheapest bang/buck. Maximum RAM is typically 2-3GB. AMD Opteron based system. Opterons provide the additional capability to buy an SMP motherboard with two chips, and the motherboards often support 16GB of RAM. The RAM is also the more expensive error correcting type. Intel PIV or Xeon based system. The PIV and Xeon based systems are the Intel analog of the above 2. Due to architectural design reasons, these chips tend to run a bit hotter and be a bit more expensive. Dual core chips. Both Intel and AMD have chips that actually have 2 processors embedded in them. In the end, we decided to go with option (2). Roughly speaking,

6 0.65091234 193 hunch net-2006-07-09-The Stock Prediction Machine Learning Problem

7 0.6370495 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning

8 0.63508016 228 hunch net-2007-01-15-The Machine Learning Department

9 0.63434029 295 hunch net-2008-04-12-It Doesn’t Stop

10 0.6307112 64 hunch net-2005-04-28-Science Fiction and Research

11 0.62287062 262 hunch net-2007-09-16-Optimizing Machine Learning Programs

12 0.61265874 346 hunch net-2009-03-18-Parallel ML primitives

13 0.60864955 222 hunch net-2006-12-05-Recruitment Conferences

14 0.59274709 300 hunch net-2008-04-30-Concerns about the Large Scale Learning Challenge

15 0.57887626 314 hunch net-2008-08-24-Mass Customized Medicine in the Future?

16 0.56818795 250 hunch net-2007-06-23-Machine Learning Jobs are Growing on Trees

17 0.56812096 370 hunch net-2009-09-18-Necessary and Sufficient Research

18 0.56757259 352 hunch net-2009-05-06-Machine Learning to AI

19 0.56663138 255 hunch net-2007-07-13-The View From China

20 0.54838771 112 hunch net-2005-09-14-The Predictionist Viewpoint


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(0, 0.016), (3, 0.025), (10, 0.022), (17, 0.289), (27, 0.157), (38, 0.053), (48, 0.022), (53, 0.044), (55, 0.077), (68, 0.015), (94, 0.114), (95, 0.067), (99, 0.017)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.84969109 377 hunch net-2009-11-09-NYAS ML Symposium this year.

Introduction: The NYAS ML symposium grew again this year to 170 participants, despite the need to outsmart or otherwise tunnel through a crowd. Perhaps the most distinct talk was by Bob Bell on various aspects of the Netflix prize competition. I also enjoyed several student posters including Matt Hoffman’s cool examples of blind source separation for music. I’m somewhat surprised how much the workshop has grown, as it is now comparable in size to a small conference, although in style more similar to a workshop. At some point as an event grows, it becomes owned by the community rather than the organizers, so if anyone has suggestions on improving it, speak up and be heard.

same-blog 2 0.83571237 366 hunch net-2009-08-03-Carbon in Computer Science Research

3 0.75779134 253 hunch net-2007-07-06-Idempotent-capable Predictors

Introduction: One way to distinguish different learning algorithms is by their ability or inability to easily use an input variable as the predicted output. This is desirable for at least two reasons: Modularity If we want to build complex learning systems via reuse of a subsystem, it’s important to have compatible I/O. “Prior” knowledge Machine learning is often applied in situations where we do have some knowledge of what the right solution is, often in the form of an existing system. In such situations, it’s good to start with a learning algorithm that can be at least as good as any existing system. When doing classification, most learning algorithms can do this. For example, a decision tree can split on a feature, and then classify. The real differences come up when we attempt regression. Many of the algorithms we know and commonly use are not idempotent predictors. Logistic regressors cannot be idempotent, because all input features are mapped through a nonlinearity.

4 0.75037622 313 hunch net-2008-08-18-Radford Neal starts a blog

Introduction: here on statistics, ML, CS, and other things he knows well.

5 0.72419548 143 hunch net-2005-12-27-Automated Labeling

Introduction: One of the common trends in machine learning has been an emphasis on the use of unlabeled data. The argument goes something like “there aren’t many labeled web pages out there, but there are a huge number of web pages, so we must find a way to take advantage of them.” There are several standard approaches for doing this: Unsupervised Learning. You use only unlabeled data. In a typical application, you cluster the data and hope that the clusters somehow correspond to what you care about. Semisupervised Learning. You use both unlabeled and labeled data to build a predictor. The unlabeled data influences the learned predictor in some way. Active Learning. You have unlabeled data and access to a labeling oracle. You interactively choose which examples to label so as to optimize prediction accuracy. It seems there is a fourth approach worth serious investigation: automated labeling. The approach goes as follows: Identify some subset of observed values to predict

6 0.60311294 419 hunch net-2010-12-04-Vowpal Wabbit, version 5.0, and the second heresy

7 0.6005637 286 hunch net-2008-01-25-Turing’s Club for Machine Learning

8 0.60034251 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making

9 0.59517986 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

10 0.59494311 95 hunch net-2005-07-14-What Learning Theory might do

11 0.59429026 136 hunch net-2005-12-07-Is the Google way the way for machine learning?

12 0.59207624 423 hunch net-2011-02-02-User preferences for search engines

13 0.59152156 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models

14 0.59042251 445 hunch net-2011-09-28-Somebody’s Eating Your Lunch

15 0.58951229 371 hunch net-2009-09-21-Netflix finishes (and starts)

16 0.58788693 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning

17 0.58595198 343 hunch net-2009-02-18-Decision by Vetocracy

18 0.58558547 229 hunch net-2007-01-26-Parallel Machine Learning Problems

19 0.58475667 360 hunch net-2009-06-15-In Active Learning, the question changes

20 0.58420962 75 hunch net-2005-05-28-Running A Machine Learning Summer School