hunch_net-2007-250 knowledge-graph by maker-knowledge-mining

250 hunch net-2007-06-23-Machine Learning Jobs are Growing on Trees


meta info for this blog

Source: html

Introduction: The consensus of several discussions at ICML is that the number of jobs for people knowing machine learning well substantially exceeds supply. This is my experience as well. Demand comes from many places, but I’ve seen particularly strong demand from trading companies and internet startups. Like all interest bursts, this one will probably pass because of economic recession or other distractions. Nevertheless, the general outlook for machine learning in business seems to be good. Machine learning is all about optimization when there is uncertainty and lots of data. The quantity of data available is growing quickly as computer-run processes and sensors become more common, and the quality of the data is dropping since there is little editorial control in its collection. Machine Learning is a difficult subject to master (*), so those who do should remain in demand over the long term. (*) In fact, it would be reasonable to claim that no one has mastered it—there are just some people who know a bit more than others.


Summary: the most important sentences generated by the tfidf model (a minimal scoring sketch follows the list)

sentIndex sentText sentNum sentScore

1 The consensus of several discussions at ICML is that the number of jobs for people knowing machine learning well substantially exceeds supply. [sent-1, score-0.984]

2 Demand comes from many places, but I’ve seen particularly strong demand from trading companies and internet startups. [sent-3, score-1.141]

3 Like all interest bursts, this one will probably pass because of economic recession or other distractions. [sent-4, score-0.655]

4 Nevertheless, the general outlook for machine learning in business seems to be good. [sent-5, score-0.264]

5 Machine learning is all about optimization when there is uncertainty and lots of data. [sent-6, score-0.375]

6 The quantity of data available is growing quickly as computer-run processes and sensors become more common, and the quality of the data is dropping since there is little editorial control in its collection. [sent-7, score-1.779]

7 Machine Learning is a difficult subject to master (*), so those who do should remain in demand over the long term. [sent-8, score-0.928]

8 (*) In fact, it would be reasonable to claim that no one has mastered it—there are just some people who know a bit more than others. [sent-9, score-0.35]
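
The sentScore values above come from some tfidf weighting of each sentence's terms; the page does not say exactly how they were computed. Below is a minimal sketch of one plausible scheme, assuming scikit-learn; `score_sentences` and the example inputs are illustrative, not part of the original pipeline.

```python
# Minimal sketch: rank sentences by summed tfidf weight of their terms.
# Assumes scikit-learn; the real scoring scheme is not documented here.
from sklearn.feature_extraction.text import TfidfVectorizer

def score_sentences(sentences):
    """Score each sentence by the total tfidf weight of its terms."""
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(sentences)  # one row per sentence
    scores = matrix.sum(axis=1).A1                # summed weight per row
    return sorted(zip(scores, sentences), reverse=True)

sentences = [
    "The consensus of several discussions at ICML is that the number of jobs "
    "for people knowing machine learning well substantially exceeds supply.",
    "This is my experience as well.",
]
for score, text in score_sentences(sentences):
    print(f"{score:.3f}  {text}")
```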


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('demand', 0.425), ('editorial', 0.205), ('dropping', 0.205), ('sensors', 0.19), ('recession', 0.19), ('consensus', 0.179), ('mastered', 0.179), ('exceeds', 0.171), ('jobs', 0.159), ('trading', 0.154), ('economic', 0.154), ('business', 0.145), ('knowing', 0.138), ('pass', 0.138), ('uncertainty', 0.138), ('master', 0.138), ('lots', 0.138), ('quantity', 0.135), ('discussions', 0.133), ('companies', 0.13), ('remain', 0.13), ('control', 0.128), ('processes', 0.123), ('places', 0.123), ('quickly', 0.123), ('growing', 0.121), ('machine', 0.119), ('internet', 0.108), ('data', 0.105), ('claim', 0.104), ('optimization', 0.099), ('fact', 0.099), ('subject', 0.096), ('probably', 0.096), ('quality', 0.093), ('available', 0.089), ('comes', 0.089), ('seen', 0.088), ('nevertheless', 0.087), ('become', 0.085), ('substantially', 0.085), ('strong', 0.081), ('experience', 0.081), ('interest', 0.077), ('others', 0.077), ('long', 0.073), ('little', 0.072), ('reasonable', 0.067), ('particularly', 0.066), ('difficult', 0.066)]
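
The (wordName, wordTfidf) pairs above are the post's top-weighted terms. A minimal sketch of how such a topN list can be produced, assuming scikit-learn; the two-document corpus here is a tiny stand-in for the full blog collection.

```python
# Minimal sketch: top-N (word, tfidf) pairs for one document.
# Assumes scikit-learn; corpus and N are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "demand for machine learning jobs exceeds supply at trading companies",
    "machine learning theory and optimization at the conference",
]
vectorizer = TfidfVectorizer(stop_words="english")
weights = vectorizer.fit_transform(corpus)

row = weights[0].toarray().ravel()             # weights for the first post
vocab = vectorizer.get_feature_names_out()
top = sorted(zip(vocab, row), key=lambda p: -p[1])[:5]
print([(word, round(w, 3)) for word, w in top if w > 0])
```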

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 250 hunch net-2007-06-23-Machine Learning Jobs are Growing on Trees

2 0.11185073 352 hunch net-2009-05-06-Machine Learning to AI

Introduction: I recently had fun discussions with both Vikash Mansinghka and Thomas Breuel about approaching AI with machine learning. The general interest in taking a crack at AI with machine learning seems to be rising on many fronts including DARPA. As a matter of history, there was a great deal of interest in AI which died down before I began research. There remain many projects and conferences spawned in this earlier AI wave, as well as a good bit of experience about what did not work, or at least did not work yet. Here are a few examples of failure modes that people seem to run into: Supply/Product confusion. Sometimes we think “Intelligences use X, so I’ll create X and have an Intelligence.” An example of this is the Cyc Project which inspires some people as “intelligences use ontologies, so I’ll create an ontology and a system using it to have an Intelligence.” The flaw here is that Intelligences create ontologies, which they use, and without the ability to create ont

3 0.099904396 437 hunch net-2011-07-10-ICML 2011 and the future

Introduction: Unfortunately, I ended up sick for much of this ICML. I did manage to catch one interesting paper: Richard Socher, Cliff Lin, Andrew Y. Ng, and Christopher D. Manning, Parsing Natural Scenes and Natural Language with Recursive Neural Networks. I invited Richard to share his list of interesting papers, so hopefully we’ll hear from him soon. In the meantime, Paul and Hal have posted some lists. The future: Joelle and I are program chairs for ICML 2012 in Edinburgh, which I previously enjoyed visiting in 2005. This is a huge responsibility that we hope to accomplish well. A part of this (perhaps the most fun part) is imagining how we can make ICML better. A key and critical constraint is choosing things that can be accomplished. So far we have: Colocation. The first thing we looked into was potential colocations. We quickly discovered that many other conferences precommitted their location. For the future, getting a colocation with ACL or SIGI

4 0.092260152 344 hunch net-2009-02-22-Effective Research Funding

Introduction: With a worldwide recession on, my impression is that the carnage in research has not been as severe as might be feared, at least in the United States. I know of two notable negative impacts: It’s quite difficult to get a job this year, as many companies and universities simply aren’t hiring. This is particularly tough on graduating students. Perhaps 10% of IBM research was fired. In contrast, around the time of the dot com bust, ATnT Research and Lucent had one or several 50%-size firings wiping out much of the remainder of Bell Labs, triggering a notable diaspora for the respected machine learning group there. As the recession progresses, we may easily see more firings as companies in particular reach a point where they can no longer support research. There are a couple of positives to the recession as well. Both the implosion of Wall Street (which siphoned off smart people) and the general difficulty of getting a job coming out of an undergraduate education s

5 0.091322988 142 hunch net-2005-12-22-Yes, I am applying

Introduction: Every year about now hundreds of applicants apply for a research/teaching job with the timing governed by the university recruitment schedule. This time, it’s my turn—the hat’s in the ring, I am a contender, etc… What I have heard is that this year is good in both directions—both an increased supply and an increased demand for machine learning expertise. I consider this post a bit of an abuse as it is neither about general research nor machine learning. Please forgive me this once. My hope is that I will learn about new places interested in funding basic research—it’s easy to imagine that I have overlooked possibilities. I am not dogmatic about where I end up in any particular way. Several earlier posts detail what I think of as a good research environment, so I will avoid a repeat. A few more details seem important: Application. There is often a tension between basic research and immediate application. This tension is not as strong as might be expected in my case. As

6 0.08575806 400 hunch net-2010-06-13-The Good News on Exploration and Learning

7 0.084662125 342 hunch net-2009-02-16-KDNuggets

8 0.084335014 335 hunch net-2009-01-08-Predictive Analytics World

9 0.081914909 464 hunch net-2012-05-03-Microsoft Research, New York City

10 0.081171587 109 hunch net-2005-09-08-Online Learning as the Mathematics of Accountability

11 0.078434035 478 hunch net-2013-01-07-NYU Large Scale Machine Learning Class

12 0.077796131 444 hunch net-2011-09-07-KDD and MUCMD 2011

13 0.073226735 378 hunch net-2009-11-15-The Other Online Learning

14 0.070713505 235 hunch net-2007-03-03-All Models of Learning have Flaws

15 0.070261613 452 hunch net-2012-01-04-Why ICML? and the summer conferences

16 0.068283305 225 hunch net-2007-01-02-Retrospective

17 0.068019137 383 hunch net-2009-12-09-Inherent Uncertainty

18 0.067305155 260 hunch net-2007-08-25-The Privacy Problem

19 0.067003325 329 hunch net-2008-11-28-A Bumper Crop of Machine Learning Graduates

20 0.066523589 193 hunch net-2006-07-09-The Stock Prediction Machine Learning Problem


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.162), (1, -0.034), (2, -0.061), (3, 0.037), (4, 0.001), (5, 0.01), (6, -0.032), (7, 0.01), (8, -0.001), (9, -0.02), (10, -0.004), (11, -0.014), (12, -0.0), (13, 0.017), (14, -0.019), (15, 0.025), (16, -0.007), (17, -0.077), (18, 0.001), (19, -0.015), (20, 0.051), (21, -0.033), (22, -0.044), (23, 0.047), (24, -0.066), (25, 0.002), (26, 0.056), (27, -0.017), (28, 0.014), (29, 0.04), (30, -0.039), (31, -0.059), (32, -0.066), (33, 0.046), (34, 0.07), (35, -0.071), (36, -0.017), (37, -0.015), (38, 0.013), (39, -0.078), (40, 0.002), (41, 0.016), (42, 0.051), (43, 0.012), (44, -0.148), (45, -0.032), (46, -0.036), (47, -0.022), (48, -0.087), (49, 0.092)]
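
The (topicId, topicWeight) row above is the post's coordinates in a 50-topic latent semantic space, and the simValue rankings below are presumably cosine similarities computed in that space. A minimal sketch of the idea, assuming scikit-learn's TruncatedSVD; the real run clearly used 50 components, and the same-blog simValue of 0.92 (rather than 1.0) suggests some extra normalization, so this is only the general shape.

```python
# Minimal sketch: LSI-style similarity via truncated SVD of tfidf vectors.
# Assumes scikit-learn; corpus and component count are illustrative.
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "machine learning jobs demand supply trading companies startups",
    "google cluster mapreduce filesystem data machine learning",
    "predictive analytics business machine learning conference",
]
tfidf = TfidfVectorizer().fit_transform(corpus)
svd = TruncatedSVD(n_components=2, random_state=0)  # 50 topics in the real run
topics = svd.fit_transform(tfidf)                   # rows of topic weights

sims = cosine_similarity(topics[0:1], topics).ravel()
ranking = sorted(enumerate(sims), key=lambda p: -p[1])
print(ranking)  # (simIndex, simValue) pairs; the post itself ranks first
```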

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9205156 250 hunch net-2007-06-23-Machine Learning Jobs are Growing on Trees

2 0.60779351 136 hunch net-2005-12-07-Is the Google way the way for machine learning?

Introduction: Urs Hoelzle from Google gave an invited presentation at NIPS. In the presentation, he strongly advocates interacting with data in a particular scalable manner which is something like the following: Make a cluster of machines. Build a unified filesystem. (Google uses GFS, but NFS or other approaches work reasonably well for smaller clusters.) Interact with data via MapReduce. Creating a cluster of machines is, by this point, relatively straightforward. Unified filesystems are a little bit tricky—GFS is capable by design of essentially unlimited speed throughput to disk. NFS can bottleneck because all of the data has to move through one machine. Nevertheless, this may not be a limiting factor for smaller clusters. MapReduce is a programming paradigm. Essentially, it is a combination of a data element transform (map) and an aggregator/selector (reduce). These operations are highly parallelizable and the claim is that they support the forms of data interacti

3 0.59503728 335 hunch net-2009-01-08-Predictive Analytics World

Introduction: Carla Vicens and Eric Siegel contacted me about Predictive Analytics World in San Francisco, February 18 & 19, which I wasn’t familiar with. A quick look at the agenda reveals several people I know working on applications of machine learning in businesses, covering deployed applications topics. It’s interesting to see a business-focused machine learning conference, as it says that we are succeeding as a field. If you are interested in deployed applications, you might attend. Eric and I did a quick interview by email. John > I’ve mostly published and participated in academic machine learning conferences like ICML, COLT, and NIPS. When I look at the set of speakers and subjects for your conference I think “machine learning for business”. Is that your understanding of things? What I’m trying to ask is: what do you view as the primary goal for this conference? Eric > You got it. This is the business event focused on the commercial deployment of technology developed at

4 0.58384657 128 hunch net-2005-11-05-The design of a computing cluster

Introduction: This is about the design of a computing cluster from the viewpoint of applied machine learning using current technology. We just built a small one at TTI so this is some evidence of what is feasible and thoughts about the design choices. Architecture: There are several architectural choices. AMD Athlon64 based system. This seems to have the cheapest bang/buck. Maximum RAM is typically 2-3GB. AMD Opteron based system. Opterons provide the additional capability to buy an SMP motherboard with two chips, and the motherboards often support 16GB of RAM. The RAM is also the more expensive error correcting type. Intel PIV or Xeon based system. The PIV and Xeon based systems are the Intel analog of the above 2. Due to architectural design reasons, these chips tend to run a bit hotter and be a bit more expensive. Dual core chips. Both Intel and AMD have chips that actually have 2 processors embedded in them. In the end, we decided to go with option (2). Roughly speaking,

5 0.54843086 260 hunch net-2007-08-25-The Privacy Problem

Introduction: Machine Learning is rising in importance because data is being collected for all sorts of tasks where it either wasn’t previously collected, or for tasks that did not previously exist. While this is great for Machine Learning, it has a downside—the massive data collection which is so useful can also lead to substantial privacy problems. It’s important to understand that this is a much harder problem than many people appreciate. The AOL data release is a good example. To those doing machine learning, the following strategies might be obvious: Just delete any names or other obviously personally identifiable information. The logic here seems to be “if I can’t easily find the person then no one can”. That doesn’t work as demonstrated by the people who were found circumstantially from the AOL data. … then just hash all the search terms! The logic here is “if I can’t read it, then no one can”. It’s also trivially broken by a dictionary attack—just hash all the strings

6 0.54636061 366 hunch net-2009-08-03-Carbon in Computer Science Research

7 0.54250938 193 hunch net-2006-07-09-The Stock Prediction Machine Learning Problem

8 0.53693438 137 hunch net-2005-12-09-Machine Learning Thoughts

9 0.53267401 475 hunch net-2012-10-26-ML Symposium and Strata-Hadoop World

10 0.51865733 37 hunch net-2005-03-08-Fast Physics for Learning

11 0.50925851 229 hunch net-2007-01-26-Parallel Machine Learning Problems

12 0.50517714 397 hunch net-2010-05-02-What’s the difference between gambling and rewarding good prediction?

13 0.50445497 49 hunch net-2005-03-30-What can Type Theory teach us about Machine Learning?

14 0.50116378 142 hunch net-2005-12-22-Yes, I am applying

15 0.49870503 442 hunch net-2011-08-20-The Large Scale Learning Survey Tutorial

16 0.4963412 464 hunch net-2012-05-03-Microsoft Research, New York City

17 0.49228024 329 hunch net-2008-11-28-A Bumper Crop of Machine Learning Graduates

18 0.49098885 352 hunch net-2009-05-06-Machine Learning to AI

19 0.48594558 406 hunch net-2010-08-22-KDD 2010

20 0.47563162 469 hunch net-2012-07-09-Videolectures


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(27, 0.207), (55, 0.105), (56, 0.451), (94, 0.091), (95, 0.03)]
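
The sparse (topicId, topicWeight) list above is an LDA topic mixture with only the heavier topics kept; the topic ids suggest a model on the order of 100 topics. A minimal sketch of producing such a mixture, assuming scikit-learn's LatentDirichletAllocation; the corpus, the component count, and the 0.01 reporting threshold are illustrative guesses.

```python
# Minimal sketch: sparse LDA topic mixture for one post.
# Assumes scikit-learn; the real model appears to use ~100 topics.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

corpus = [
    "machine learning jobs demand supply trading startups recession",
    "presentation proofs icml tutorial detail precision",
    "privacy data collection hashing machine learning",
]
counts = CountVectorizer(stop_words="english").fit_transform(corpus)
lda = LatentDirichletAllocation(n_components=5, random_state=0).fit(counts)

dist = lda.transform(counts[0:1]).ravel()   # topic mixture for the first post
sparse = [(i, round(w, 3)) for i, w in enumerate(dist) if w > 0.01]
print(sparse)                               # (topicId, topicWeight) pairs
```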

similar blogs list:

simIndex simValue blogId blogTitle

1 0.88752383 187 hunch net-2006-06-25-Presentation of Proofs is Hard.

Introduction: When presenting part of the Reinforcement Learning theory tutorial at ICML 2006, I was forcibly reminded of this. There are several difficulties. When creating the presentation, the correct level of detail is tricky. With too much detail, the proof takes too much time and people may be lost to boredom. With too little detail, the steps of the proof involve too great a jump. This is very difficult to judge. What may be an easy step in the careful thought of a quiet room is not so easy when you are occupied by the process of presentation. What may be easy after having gone over this (and other) proofs is not so easy to follow in the first pass by a viewer. These problems seem only correctable by a process of repeated test-and-revise. When presenting the proof, simply speaking with sufficient precision is substantially harder than in normal conversation (where precision is not so critical). Practice can help here. When presenting the proof, going at the right p

same-blog 2 0.83810627 250 hunch net-2007-06-23-Machine Learning Jobs are Growing on Trees

3 0.80081666 307 hunch net-2008-07-04-More Presentation Preparation

Introduction: We’ve discussed presentation preparation before, but I have one more thing to add: transitioning. For a research presentation, it is substantially helpful for the audience if transitions are clear. A common outline for a research presentation in machine learning is: The problem. Presentations which don’t describe the problem almost immediately lose people, because the context is missing to understand the detail. Prior relevant work. In many cases, a paper builds on some previous bit of work which must be understood in order to understand what the paper does. A common failure mode seems to be spending too much time on prior work. Discuss just the relevant aspects of prior work in the language of your work. Sometimes this is missing when unneeded. What we did. For theory papers in particular, it is often not possible to really cover the details. Prioritizing what you present can be very important. How it worked. Many papers in Machine Learning have some sor

4 0.76296824 460 hunch net-2012-03-24-David Waltz

Introduction: David Waltz has died. He lived a full life. I know him personally as a founder of the Center for Computational Learning Systems and the New York Machine Learning Symposium, both of which have sheltered and promoted the advancement of machine learning. I expect much of the New York area machine learning community will miss him, as well as many others around the world.

5 0.69745988 202 hunch net-2006-08-10-Precision is not accuracy

Introduction: In my experience, there are two different groups of people who believe the same thing: the mathematics encountered in typical machine learning conference papers is often of questionable value. The two groups who agree on this are applied machine learning people who have given up on math, and mature theoreticians who understand the limits of theory. Partly, this is just a statement about where we are with respect to machine learning. In particular, we have no mechanism capable of generating a prescription for how to solve all learning problems. In the absence of such certainty, people try to come up with formalisms that partially describe and motivate how and why they do things. This is natural and healthy—we might hope that it will eventually lead to just such a mechanism. But, part of this is simply an emphasis on complexity over clarity. A very natural and simple theoretical statement is often obscured by complexifications. Common sources of complexification include:

6 0.66431952 356 hunch net-2009-05-24-2009 ICML discussion site

7 0.5245108 379 hunch net-2009-11-23-ICML 2009 Workshops (and Tutorials)

8 0.49913219 249 hunch net-2007-06-21-Presentation Preparation

9 0.48634842 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

10 0.47710118 464 hunch net-2012-05-03-Microsoft Research, New York City

11 0.47262603 416 hunch net-2010-10-29-To Vidoelecture or not

12 0.46863785 51 hunch net-2005-04-01-The Producer-Consumer Model of Research

13 0.46773422 325 hunch net-2008-11-10-ICML Reviewing Criteria

14 0.46686152 228 hunch net-2007-01-15-The Machine Learning Department

15 0.46601301 343 hunch net-2009-02-18-Decision by Vetocracy

16 0.46503744 452 hunch net-2012-01-04-Why ICML? and the summer conferences

17 0.46267515 315 hunch net-2008-09-03-Bidding Problems

18 0.46263394 259 hunch net-2007-08-19-Choice of Metrics

19 0.46226624 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

20 0.46145287 360 hunch net-2009-06-15-In Active Learning, the question changes