hunch_net hunch_net-2009 hunch_net-2009-349 knowledge-graph by maker-knowledge-mining

349 hunch net-2009-04-21-Interesting Presentations at Snowbird


meta info for this blog

Source: html

Introduction: Here are a few of the presentations that interested me at the Snowbird learning workshop (which, amusingly, was in Florida with AIStat). Thomas Breuel described machine learning problems within OCR and an open source OCR software/research platform with modular learning components, as well as a dataset of size 60 million derived from Google's scanned books. Kristen Grauman and Fei-Fei Li discussed using active learning with different cost labels and large datasets for image ontology. Both of them used Mechanical Turk as a labeling system, which looks likely to become routine, at least for vision problems. Russ Tedrake discussed using machine learning for control, with a basic claim that it was the way to go for problems involving a medium Reynolds number, such as in bird flight, where simulation is extremely intense. Yann LeCun presented a poster on an FPGA for convolutional neural networks yielding a factor of 100 speedup in processing. In addition to the graphics processor approach Rajat has worked on, this seems like an effective approach to deal with the need to compute many dot products.


Summary: the most important sentences generated by the tfidf model (a sketch of one plausible scoring rule follows the list)

sentIndex sentText sentNum sentScore

1 Here are a few of the presentations that interested me at the Snowbird learning workshop (which, amusingly, was in Florida with AIStat). [sent-1, score-0.245]

2 Thomas Breuel described machine learning problems within OCR and an open source OCR software/research platform with modular learning components, as well as a dataset of size 60 million derived from Google's scanned books. [sent-2, score-0.692]

3 Kristen Grauman and Fei-Fei Li discussed using active learning with different cost labels and large datasets for image ontology. [sent-3, score-0.6]

4 Both of them used Mechanical Turk as a labeling system, which looks likely to become routine, at least for vision problems. [sent-4, score-0.314]

5 Russ Tedrake discussed using machine learning for control, with a basic claim that it was the way to go for problems involving a medium Reynolds number, such as in bird flight, where simulation is extremely intense. [sent-5, score-0.744]

6 Yann LeCun presented a poster on an FPGA for convolutional neural networks yielding a factor of 100 speedup in processing. [sent-6, score-0.775]

7 In addition to the graphics processor approach Rajat has worked on, this seems like an effective approach to deal with the need to compute many dot products. [sent-7, score-0.827]
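
The sentence scores above come from the mining pipeline's tfidf model, but the exact scoring rule is not documented on this page. A minimal sketch of one plausible rule, ranking each sentence by the mean tfidf weight of its non-stopword terms, is shown below; scikit-learn, the naive sentence splitter, and the placeholder post text are all assumptions, not the original pipeline.

    # Hypothetical sketch, not the maker-knowledge-mining code: score each
    # sentence of the post by the mean tfidf weight of its non-stopword terms.
    from sklearn.feature_extraction.text import TfidfVectorizer

    post_text = ("Here are a few of the presentations that interested me. "
                 "Thomas Breuel described machine learning problems within OCR.")
    sentences = [s.strip() for s in post_text.split(". ") if s.strip()]  # naive splitter
    X = TfidfVectorizer(stop_words="english").fit_transform(sentences)   # one row per sentence
    scores = X.sum(axis=1).A1 / (X.getnnz(axis=1) + 1e-9)                # mean tfidf weight
    for rank, (i, sc) in enumerate(sorted(enumerate(scores), key=lambda t: -t[1]), 1):
        print(rank, round(sc, 3), sentences[i][:60])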


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('ocr', 0.373), ('florida', 0.166), ('breuel', 0.166), ('products', 0.166), ('discussed', 0.161), ('speedup', 0.153), ('ontology', 0.153), ('flight', 0.153), ('graphics', 0.153), ('modular', 0.153), ('simulation', 0.153), ('rajat', 0.145), ('processor', 0.145), ('dot', 0.145), ('turk', 0.145), ('snowbird', 0.138), ('mechanical', 0.138), ('aistat', 0.133), ('routine', 0.133), ('thomas', 0.133), ('medium', 0.128), ('derived', 0.128), ('convolutional', 0.124), ('components', 0.124), ('involving', 0.124), ('described', 0.12), ('li', 0.12), ('lecun', 0.114), ('labeling', 0.114), ('yielding', 0.114), ('image', 0.109), ('poster', 0.109), ('presentations', 0.107), ('yann', 0.105), ('control', 0.103), ('compute', 0.1), ('looks', 0.1), ('vision', 0.1), ('approach', 0.099), ('presented', 0.098), ('google', 0.094), ('extremely', 0.094), ('labels', 0.092), ('factor', 0.091), ('addition', 0.086), ('neural', 0.086), ('datasets', 0.085), ('dataset', 0.084), ('claim', 0.084), ('within', 0.083)]
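
The (wordName, wordTfidf) pairs above and the simIndex/simValue rankings below are standard outputs of a tfidf bag-of-words representation with cosine similarity between posts. A minimal sketch of how such numbers could be reproduced follows; scikit-learn, the placeholder corpus, and the cutoff of 50 top words are assumptions, since the original pipeline's settings are not given.

    # Hypothetical sketch: per-post tfidf word weights and cosine similarity
    # to the other posts; 'posts' is a placeholder list of blog-post texts.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    posts = ["text of post 349 ...", "text of post 431 ...", "text of post 277 ..."]
    vec = TfidfVectorizer(stop_words="english")
    X = vec.fit_transform(posts)                      # rows: posts, columns: words
    terms = vec.get_feature_names_out()

    row = X[0].toarray().ravel()                      # weights for the first post
    top = row.argsort()[::-1][:50]                    # topN-words, as in the table above
    print([(terms[i], round(row[i], 3)) for i in top if row[i] > 0])

    sims = cosine_similarity(X[0], X).ravel()         # simValue-style scores vs. all posts
    print(sims.argsort()[::-1])                       # most similar posts first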

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 349 hunch net-2009-04-21-Interesting Presentations at Snowbird

Introduction: Here are a few of the presentations that interested me at the Snowbird learning workshop (which, amusingly, was in Florida with AIStat). Thomas Breuel described machine learning problems within OCR and an open source OCR software/research platform with modular learning components, as well as a dataset of size 60 million derived from Google's scanned books. Kristen Grauman and Fei-Fei Li discussed using active learning with different cost labels and large datasets for image ontology. Both of them used Mechanical Turk as a labeling system, which looks likely to become routine, at least for vision problems. Russ Tedrake discussed using machine learning for control, with a basic claim that it was the way to go for problems involving a medium Reynolds number, such as in bird flight, where simulation is extremely intense. Yann LeCun presented a poster on an FPGA for convolutional neural networks yielding a factor of 100 speedup in processing. In addition to the graphics processor approach Rajat has worked on, this seems like an effective approach to deal with the need to compute many dot products.

2 0.18611445 431 hunch net-2011-04-18-A paper not at Snowbird

Introduction: Unfortunately, a scheduling failure meant I missed all of AIStat and most of the learning workshop, otherwise known as Snowbird, when it's at Snowbird. At Snowbird, the talk on Sum-Product networks by Hoifung Poon stood out to me (Pedro Domingos is a coauthor). The basic point was that by appropriately constructing networks based on sums and products, the normalization problem in probabilistic models is eliminated, yielding a highly tractable yet flexible representation+learning algorithm. As an algorithm, this is noticeably cleaner than deep belief networks, with a claim to being an order of magnitude faster and working better on an image completion task. Snowbird doesn't have real papers—just the abstract above. I look forward to seeing the paper. (added: Rodrigo points out the deep learning workshop draft.)

3 0.11796457 277 hunch net-2007-12-12-Workshop Summary—Principles of Learning Problem Design

Introduction: This is a summary of the workshop on Learning Problem Design which Alina and I ran at NIPS this year. The first question many people have is “What is learning problem design?” This workshop is about admitting that solving learning problems does not start with labeled data, but rather somewhere before. When humans are hired to produce labels, this is usually not a serious problem because you can tell them precisely what semantics you want the labels to have, and we can fix some set of features in advance. However, when other methods are used this becomes more problematic. This focus is important for Machine Learning because there are very large quantities of data which are not labeled by a hired human. The title of the workshop was a bit ambitious, because a workshop is not long enough to synthesize a diversity of approaches into a coherent set of principles. For me, the posters at the end of the workshop were quite helpful in getting approaches to gel. Here are some an

4 0.11258009 300 hunch net-2008-04-30-Concerns about the Large Scale Learning Challenge

Introduction: The large scale learning challenge for ICML interests me a great deal, although I have concerns about the way it is structured. From the instructions page, several issues come up: Large Definition My personal definition of dataset size is: small A dataset is small if a human could look at the dataset and plausibly find a good solution. medium A dataset is medium-sized if it fits in the RAM of a reasonably priced computer. large A large dataset does not fit in the RAM of a reasonably priced computer. By this definition, all of the datasets are medium-sized. This might sound like a pissing match over dataset size, but I believe it is more than that. The fundamental reason for these definitions is that they correspond to transitions in the sorts of approaches which are feasible. From small to medium, the ability to use a human as the learning algorithm degrades. From medium to large, it becomes essential to have learning algorithms that don't require ran

5 0.10373066 20 hunch net-2005-02-15-ESPgame and image labeling

Introduction: Luis von Ahn has been running the espgame for a while now. The espgame provides a picture to two randomly paired people across the web, and asks them to agree on a label. It hasn't managed to label the web yet, but it has produced a large dataset of (image, label) pairs. I organized the dataset so you could explore the implied bipartite graph (requires much bandwidth). Relative to other image datasets, this one is quite large—67,000 images, 358,000 labels (average of 5/image with variation from 1 to 19), and 22,000 unique labels (one every 3 images). The dataset is also very 'natural', consisting of images spidered from the internet. The multiple label characteristic is intriguing because 'learning to learn' and metalearning techniques may be applicable. The 'natural' quality means that this dataset varies greatly in difficulty from easy (predicting "red") to hard (predicting "funny") and potentially more rewarding to tackle. The open problem here is, of course, to make

6 0.1004534 426 hunch net-2011-03-19-The Ideal Large Scale Learning Class

7 0.096523419 229 hunch net-2007-01-26-Parallel Machine Learning Problems

8 0.085845955 16 hunch net-2005-02-09-Intuitions from applied learning

9 0.085518755 438 hunch net-2011-07-11-Interesting Neural Network Papers at ICML 2011

10 0.084958151 201 hunch net-2006-08-07-The Call of the Deep

11 0.08465457 281 hunch net-2007-12-21-Vowpal Wabbit Code Release

12 0.084471256 432 hunch net-2011-04-20-The End of the Beginning of Active Learning

13 0.075616181 159 hunch net-2006-02-27-The Peekaboom Dataset

14 0.074403673 360 hunch net-2009-06-15-In Active Learning, the question changes

15 0.069747992 152 hunch net-2006-01-30-Should the Input Representation be a Vector?

16 0.068164662 311 hunch net-2008-07-26-Compositional Machine Learning Algorithm Design

17 0.067520186 347 hunch net-2009-03-26-Machine Learning is too easy

18 0.067244872 105 hunch net-2005-08-23-(Dis)similarities between academia and open source programmers

19 0.066943564 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning

20 0.065909527 143 hunch net-2005-12-27-Automated Labeling


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.142), (1, 0.026), (2, -0.081), (3, -0.016), (4, 0.087), (5, 0.038), (6, -0.089), (7, -0.008), (8, 0.0), (9, -0.006), (10, -0.08), (11, -0.028), (12, -0.051), (13, -0.01), (14, -0.07), (15, 0.053), (16, 0.001), (17, 0.055), (18, -0.132), (19, 0.067), (20, -0.011), (21, -0.006), (22, -0.003), (23, 0.047), (24, 0.062), (25, 0.051), (26, -0.051), (27, 0.038), (28, -0.052), (29, -0.008), (30, 0.06), (31, 0.083), (32, 0.014), (33, 0.054), (34, 0.068), (35, -0.057), (36, -0.061), (37, 0.006), (38, -0.019), (39, 0.021), (40, -0.0), (41, -0.061), (42, -0.02), (43, -0.016), (44, 0.13), (45, -0.123), (46, -0.033), (47, -0.008), (48, 0.015), (49, 0.096)]
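
The 50 (topicId, topicWeight) values above look like an LSI embedding of the post, presumably a low-rank factorization of the tfidf matrix. The sketch below is written under that assumption; TruncatedSVD stands in for whatever LSI implementation the pipeline actually used, the component count of 50 is inferred from the table, and the tiny placeholder corpus forces a smaller rank so the example runs.

    # Hypothetical sketch: LSI-style topic weights via truncated SVD of tfidf vectors.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    posts = ["text of post 349 ...", "text of post 431 ...", "text of post 438 ..."]
    X = TfidfVectorizer(stop_words="english").fit_transform(posts)
    lsi = TruncatedSVD(n_components=2, random_state=0)  # the table uses 50; 2 keeps this toy corpus runnable
    Z = lsi.fit_transform(X)                            # one row of topic weights per post
    print(list(enumerate(Z[0].round(3))))               # (topicId, topicWeight) for the first post
    print(cosine_similarity(Z[:1], Z).ravel())          # LSI-space simValues against all posts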

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.91958702 349 hunch net-2009-04-21-Interesting Presentations at Snowbird

Introduction: Here are a few of the presentations that interested me at the Snowbird learning workshop (which, amusingly, was in Florida with AIStat). Thomas Breuel described machine learning problems within OCR and an open source OCR software/research platform with modular learning components, as well as a dataset of size 60 million derived from Google's scanned books. Kristen Grauman and Fei-Fei Li discussed using active learning with different cost labels and large datasets for image ontology. Both of them used Mechanical Turk as a labeling system, which looks likely to become routine, at least for vision problems. Russ Tedrake discussed using machine learning for control, with a basic claim that it was the way to go for problems involving a medium Reynolds number, such as in bird flight, where simulation is extremely intense. Yann LeCun presented a poster on an FPGA for convolutional neural networks yielding a factor of 100 speedup in processing. In addition to the graphics processor approach Rajat has worked on, this seems like an effective approach to deal with the need to compute many dot products.

2 0.6471135 431 hunch net-2011-04-18-A paper not at Snowbird

Introduction: Unfortunately, a scheduling failure meant I missed all of AIStat and most of the learning workshop, otherwise known as Snowbird, when it's at Snowbird. At Snowbird, the talk on Sum-Product networks by Hoifung Poon stood out to me (Pedro Domingos is a coauthor). The basic point was that by appropriately constructing networks based on sums and products, the normalization problem in probabilistic models is eliminated, yielding a highly tractable yet flexible representation+learning algorithm. As an algorithm, this is noticeably cleaner than deep belief networks, with a claim to being an order of magnitude faster and working better on an image completion task. Snowbird doesn't have real papers—just the abstract above. I look forward to seeing the paper. (added: Rodrigo points out the deep learning workshop draft.)

3 0.54681516 438 hunch net-2011-07-11-Interesting Neural Network Papers at ICML 2011

Introduction: Maybe it's too early to call, but with four separate Neural Network sessions at this year's ICML, it looks like Neural Networks are making a comeback. Here are my highlights of these sessions. In general, my feeling is that these papers both demystify deep learning and show its broader applicability. The first observation I made is that the once disreputable "Neural" nomenclature is being used again in lieu of "deep learning". Maybe it's because Adam Coates et al. showed that single layer networks can work surprisingly well. An Analysis of Single-Layer Networks in Unsupervised Feature Learning, Adam Coates, Honglak Lee, Andrew Y. Ng (AISTATS 2011) The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization, Adam Coates, Andrew Y. Ng (ICML 2011) Another surprising result out of Andrew Ng's group comes from Andrew Saxe et al. who show that certain convolutional pooling architectures can obtain close to state-of-the-art pe

4 0.51995587 114 hunch net-2005-09-20-Workshop Proposal: Atomic Learning

Introduction: This is a proposal for a workshop. It may or may not happen depending on the level of interest. If you are interested, feel free to indicate so (by email or comments). Description: Assume(*) that any system for solving large difficult learning problems must decompose into repeated use of basic elements (i.e. atoms). There are many basic questions which remain: What are the viable basic elements? What makes a basic element viable? What are the viable principles for the composition of these basic elements? What are the viable principles for learning in such systems? What problems can this approach handle? Hal Daume adds: Can composition of atoms be (semi-) automatically constructed[?] When atoms are constructed through reductions, is there some notion of the "naturalness" of the created learning problems? Other than Markov fields/graphical models/Bayes nets, is there a good language for representing atoms and their compositions? The answer to these a

5 0.51317114 16 hunch net-2005-02-09-Intuitions from applied learning

Introduction: Since learning is far from an exact science, it's good to pay attention to basic intuitions of applied learning. Here are a few I've collected. Integration In Bayesian learning, the posterior is computed by an integral, and the optimal thing to do is to predict according to this integral. This phenomenon seems to be far more general. Bagging, Boosting, SVMs, and Neural Networks all take advantage of this idea to some extent. The phenomenon is more general: you can average over many different classification predictors to improve performance. Sources: Zoubin, Caruana Differentiation Different pieces of an average should differentiate to achieve good performance by different methods. This is known as the 'symmetry breaking' problem for neural networks, and it's why weights are initialized randomly. Boosting explicitly attempts to achieve good differentiation by creating new, different, learning problems. Sources: Yann LeCun, Phil Long Deep Representation Ha

6 0.50397146 201 hunch net-2006-08-07-The Call of the Deep

7 0.5012548 20 hunch net-2005-02-15-ESPgame and image labeling

8 0.48785543 300 hunch net-2008-04-30-Concerns about the Large Scale Learning Challenge

9 0.47721463 399 hunch net-2010-05-20-Google Predict

10 0.46976531 54 hunch net-2005-04-08-Fast SVMs

11 0.46617684 277 hunch net-2007-12-12-Workshop Summary—Principles of Learning Problem Design

12 0.45892447 418 hunch net-2010-12-02-Traffic Prediction Problem

13 0.45303148 152 hunch net-2006-01-30-Should the Input Representation be a Vector?

14 0.42856735 281 hunch net-2007-12-21-Vowpal Wabbit Code Release

15 0.4261665 426 hunch net-2011-03-19-The Ideal Large Scale Learning Class

16 0.4051199 229 hunch net-2007-01-26-Parallel Machine Learning Problems

17 0.40244907 420 hunch net-2010-12-26-NIPS 2010

18 0.39599317 224 hunch net-2006-12-12-Interesting Papers at NIPS 2006

19 0.39386722 128 hunch net-2005-11-05-The design of a computing cluster

20 0.39071909 382 hunch net-2009-12-09-Future Publication Models @ NIPS


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(27, 0.152), (38, 0.044), (53, 0.062), (55, 0.098), (58, 0.441), (94, 0.069), (95, 0.032)]
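
The sparse (topicId, topicWeight) list above is consistent with an LDA topic model where only the post's non-negligible topics are reported. The sketch below is written under that assumption; the count-vector preprocessing, the topic count, the 0.03 threshold, and the placeholder corpus are guesses rather than documented settings of the original pipeline.

    # Hypothetical sketch: LDA document-topic weights, keeping only dominant topics.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    posts = ["text of post 349 ...", "text of post 470 ...", "text of post 342 ..."]
    counts = CountVectorizer(stop_words="english").fit_transform(posts)
    lda = LatentDirichletAllocation(n_components=5, random_state=0)  # the table suggests ~100 topics
    theta = lda.fit_transform(counts)            # per-post topic distribution; rows sum to 1
    doc = theta[0]                               # topic weights for the first post
    print([(t, round(w, 3)) for t, w in enumerate(doc) if w > 0.03])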

similar blogs list:

simIndex simValue blogId blogTitle

1 0.92673463 470 hunch net-2012-07-17-MUCMD and BayLearn

Introduction: The workshop on the Meaningful Use of Complex Medical Data is happening again, August 9-12 in LA, near UAI on Catalina Island August 15-17. I enjoyed my visit last year, and expect this year to be interesting also. The first Bay Area Machine Learning Symposium is August 30 at Google. Abstracts are due July 30.

same-blog 2 0.84462488 349 hunch net-2009-04-21-Interesting Presentations at Snowbird

Introduction: Here are a few of the presentations that interested me at the Snowbird learning workshop (which, amusingly, was in Florida with AIStat). Thomas Breuel described machine learning problems within OCR and an open source OCR software/research platform with modular learning components, as well as a dataset of size 60 million derived from Google's scanned books. Kristen Grauman and Fei-Fei Li discussed using active learning with different cost labels and large datasets for image ontology. Both of them used Mechanical Turk as a labeling system, which looks likely to become routine, at least for vision problems. Russ Tedrake discussed using machine learning for control, with a basic claim that it was the way to go for problems involving a medium Reynolds number, such as in bird flight, where simulation is extremely intense. Yann LeCun presented a poster on an FPGA for convolutional neural networks yielding a factor of 100 speedup in processing. In addition to the graphics processor approach Rajat has worked on, this seems like an effective approach to deal with the need to compute many dot products.

3 0.71717811 342 hunch net-2009-02-16-KDNuggets

Introduction: Eric Zaetsch points out KDNuggets which is a well-developed mailing list/news site with a KDD flavor. This might particularly interest people looking for industrial jobs in machine learning, as the mailing list has many such.

4 0.64291835 149 hunch net-2006-01-18-Is Multitask Learning Black-Boxable?

Introduction: Multitask learning is learning to predict multiple outputs given the same input. Mathematically, we might think of this as trying to learn a function f: X -> {0,1}^n. Structured learning is similar at this level of abstraction. Many people have worked on solving multitask learning (for example Rich Caruana) using methods which share an internal representation. In other words, the computation and learning of the i-th prediction is shared with the computation and learning of the j-th prediction. Another way to ask this question is: can we avoid sharing the internal representation? For example, it might be feasible to solve multitask learning by some process feeding the i-th prediction f(x)_i into the j-th predictor f(x, f(x)_i)_j. If the answer is "no", then it implies we cannot take binary classification as a basic primitive in the process of solving prediction problems. If the answer is "yes", then we can reuse binary classification algorithms to

5 0.42010745 437 hunch net-2011-07-10-ICML 2011 and the future

Introduction: Unfortunately, I ended up sick for much of this ICML. I did manage to catch one interesting paper: Richard Socher, Cliff Lin, Andrew Y. Ng, and Christopher D. Manning, Parsing Natural Scenes and Natural Language with Recursive Neural Networks. I invited Richard to share his list of interesting papers, so hopefully we'll hear from him soon. In the meantime, Paul and Hal have posted some lists. the future Joelle and I are program chairs for ICML 2012 in Edinburgh, which I previously enjoyed visiting in 2005. This is a huge responsibility that we hope to accomplish well. A part of this (perhaps the most fun part) is imagining how we can make ICML better. A key and critical constraint is choosing things that can be accomplished. So far we have: Colocation. The first thing we looked into was potential colocations. We quickly discovered that many other conferences precommitted their location. For the future, getting a colocation with ACL or SIGI

6 0.41470999 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

7 0.4144814 297 hunch net-2008-04-22-Taking the next step

8 0.41263312 95 hunch net-2005-07-14-What Learning Theory might do

9 0.41254675 343 hunch net-2009-02-18-Decision by Vetocracy

10 0.41173714 44 hunch net-2005-03-21-Research Styles in Machine Learning

11 0.41151428 286 hunch net-2008-01-25-Turing’s Club for Machine Learning

12 0.41145825 256 hunch net-2007-07-20-Motivation should be the Responsibility of the Reviewer

13 0.41138086 225 hunch net-2007-01-02-Retrospective

14 0.41092759 423 hunch net-2011-02-02-User preferences for search engines

15 0.4105978 403 hunch net-2010-07-18-ICML & COLT 2010

16 0.41019905 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

17 0.40907809 116 hunch net-2005-09-30-Research in conferences

18 0.40903291 1 hunch net-2005-01-19-Why I decided to run a weblog.

19 0.40859565 207 hunch net-2006-09-12-Incentive Compatible Reviewing

20 0.40837947 452 hunch net-2012-01-04-Why ICML? and the summer conferences