hunch_net hunch_net-2006 knowledge-graph by maker-knowledge-mining

hunch_net 2006 knowledge graph


blogs list:

1 hunch net-2006-12-12-Interesting Papers at NIPS 2006

Introduction: Here are some papers that I found surprisingly interesting. Yoshua Bengio, Pascal Lamblin, Dan Popovici, Hugo Larochelle, Greedy Layer-wise Training of Deep Networks. Empirically investigates some of the design choices behind deep belief networks. Long Zhu, Yuanhao Chen, Alan Yuille, Unsupervised Learning of a Probabilistic Grammar for Object Detection and Parsing. An unsupervised method for detecting objects using simple feature filters that works remarkably well on the (supervised) Caltech-101 dataset. Shai Ben-David, John Blitzer, Koby Crammer, and Fernando Pereira, Analysis of Representations for Domain Adaptation. This is the first analysis I’ve seen of learning from samples drawn differently from the evaluation distribution that depends on reasonable, measurable quantities. All of these papers turn out to have a common theme: the power of unlabeled data to do generically useful things.

2 hunch net-2006-12-06-The Spam Problem

Introduction: The New York Times has an article on the growth of spam. Interesting facts include: 9/10 of all email is spam, spam source identification is nearly useless due to botnet spam senders, and image-based spam (emails which consist of an image only) is on the rise. Estimates of the cost of spam are almost certainly far too low, because they do not account for the cost in time lost by people. The image-based spam which is currently penetrating many filters should be catchable with a more sophisticated application of machine learning technology. For the spam I see, the rendered images come in only a few formats, which would be easy to recognize via a support vector machine (with RBF kernel), neural network, or even nearest-neighbor architecture. The mechanics of setting this up to run efficiently is the only real challenge. This is the next step in the spam war. The response to this system is to make the image-based spam even more random. We should (essentially) expect to see
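As a rough sketch of the classifier this excerpt suggests, here is an RBF-kernel support vector machine applied to flattened pixel vectors. This assumes scikit-learn, and the images and labels are synthetic placeholders; a real filter would first rasterize the incoming emails.

```python
# Minimal sketch of the RBF-kernel SVM approach suggested above.
# Assumes rendered spam/ham emails have been rasterized to fixed-size
# grayscale arrays; the data below is synthetic placeholder data.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Placeholder: 200 "rendered emails" as 32x32 grayscale images.
images = rng.random((200, 32, 32))
labels = rng.integers(0, 2, size=200)  # 1 = spam, 0 = ham

X = images.reshape(len(images), -1)  # flatten pixels into feature vectors

clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X[:150], labels[:150])            # train on the first 150 emails
print(clf.score(X[150:], labels[150:]))   # held-out accuracy
```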

3 hunch net-2006-12-05-Recruitment Conferences

Introduction: One of the subsidiary roles of conferences is recruitment. NIPS is optimally placed in time for this because it falls right before the major recruitment season. I personally found job hunting embarrassing, and was relatively inept at it. I expect this is true of many people, because it is not something done often. The basic rule is: make the plausible hirers aware of your interest. Any corporate sponsor is a plausible hirer, regardless of whether or not there is a booth. CRA and the ACM job center are other reasonable sources. There are substantial differences between the different possibilities. Putting some effort into understanding the distinctions is a good idea, although you should always remember where the other person is coming from.

4 hunch net-2006-12-04-Structural Problems in NIPS Decision Making

Introduction: This is a very difficult post to write, because it is about a perennially touchy subject. Nevertheless, it is an important one which needs to be thought about carefully. There are a few things which should be understood: The system is changing and responsive. We-the-authors are we-the-reviewers, we-the-PC, and even we-the-NIPS-board. NIPS has implemented ‘secondary program chairs’, ‘author response’, and ‘double blind reviewing’ in the last few years to help with the decision process, and more changes may happen in the future. Agreement creates a perception of correctness. When any PC meets and makes a group decision about a paper, there is a strong tendency for the reinforcement inherent in a group decision to create the perception of correctness. For the many people who have been on the NIPS PC it’s reasonable to entertain a healthy skepticism in the face of this reinforcing certainty. This post is about structural problems. What problems arise because of the structure

5 hunch net-2006-11-27-Continuizing Solutions

Introduction: This post is about a general technique for problem solving which I’ve never seen taught (in full generality), but which I’ve found very useful. Many problems in computer science turn out to be discretely difficult. The best known examples are NP-hard problems, but I mean ‘discretely difficult’ in a much more general way, which I only know how to capture by examples. ERM: In empirical risk minimization, you choose a minimum error rate classifier from a set of classifiers. This is NP-hard for common sets, but it can be much harder, depending on the set. Experts: In the online learning with experts setting, you try to predict well so as to compete with a set of (adversarial) experts. Here the alternating quantifiers of you and an adversary playing out a game can yield a dynamic programming problem that grows exponentially. Policy Iteration: The problem with policy iteration is that you learn a new policy with respect to an old policy, which implies that sim
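A minimal sketch of the continuizing idea applied to the ERM example: the discrete 0-1 loss (hard to minimize directly) is replaced by a smooth convex surrogate, the logistic loss, which plain gradient descent can handle. The data and step size below are placeholder assumptions.

```python
# Continuizing ERM: swap the discrete 0-1 loss for a smooth convex
# surrogate (logistic loss) and minimize that by gradient descent.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))           # placeholder features
y = np.sign(X @ rng.normal(size=5))     # labels in {-1, +1}

w = np.zeros(5)
for _ in range(500):
    margins = y * (X @ w)
    # gradient of the mean logistic loss log(1 + exp(-margin))
    grad = -(X.T @ (y / (1 + np.exp(margins)))) / len(y)
    w -= 0.1 * grad

# The discrete quantity we actually care about:
zero_one_error = np.mean(np.sign(X @ w) != y)
print(zero_one_error)
```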

6 hunch net-2006-11-22-Explicit Randomization in Learning algorithms

Introduction: There are a number of learning algorithms which explicitly incorporate randomness into their execution. This includes, amongst others: Neural Networks. Neural networks use randomization to assign initial weights. Boltzmann Machines/Deep Belief Networks. Boltzmann machines are something like a stochastic version of multinode logistic regression. The use of randomness is more essential in Boltzmann machines, because the predicted value at test time also uses randomness. Bagging. Bagging is a process where a learning algorithm is run several different times on several different datasets, creating a final predictor which makes a majority vote. Policy descent. Several algorithms in reinforcement learning such as Conservative Policy Iteration use random bits to create stochastic policies. Experts algorithms. Randomized weighted majority uses random bits as a part of the prediction process to achieve better theoretical guarantees. A basic question is: “Should there
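A minimal sketch of the bagging item above, assuming scikit-learn for the (placeholder) base learner: the only randomness is the bootstrap resampling, and the final predictor is a majority vote.

```python
# Bagging as described above: train the same learner on several
# bootstrap resamples, then take a majority vote over predictions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                 # placeholder data
y = (X[:, 0] + X[:, 1] > 0).astype(int)

models = []
for _ in range(25):
    idx = rng.integers(0, len(X), size=len(X))  # bootstrap sample (with replacement)
    models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

votes = np.stack([m.predict(X) for m in models])
majority = (votes.mean(axis=0) > 0.5).astype(int)  # majority vote
print(np.mean(majority == y))
```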

7 hunch net-2006-11-20-Context and the calculation misperception

Introduction: This post is really for people not in machine learning (or related fields). It is about a common misperception which affects people who have not thought about the process of trying to predict something. Hopefully, by precisely stating it, we can remove it. Suppose we have a set of events, each described by a vector of features.

0 1 0 1 1
1 0 1 0 1
1 1 0 1 0
0 0 1 1 1
1 1 0 0 1
1 0 0 0 1
0 1 1 1 0

Suppose we want to predict the value of the first feature given the others. One approach is to bin the data by one feature. For the above example, we might partition the data according to feature 2, then observe that when feature 2 is 0 the label (feature 1) is mostly 1. On the other hand, when feature 2 is 1, the label (feature 1) is mostly 0. Using this simple rule we get an observed error rate of 3/7. There are two issues here. The first is that this is really a training
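A short check of the arithmetic, assuming the seven-row layout of the table above (each row is one event, the first column is the label):

```python
# Recomputing the binning rule's training error on the table above.
rows = [
    (0, 1, 0, 1, 1),
    (1, 0, 1, 0, 1),
    (1, 1, 0, 1, 0),
    (0, 0, 1, 1, 1),
    (1, 1, 0, 0, 1),
    (1, 0, 0, 0, 1),
    (0, 1, 1, 1, 0),
]
# Rule from the text: predict feature 1 = 1 when feature 2 is 0,
# and feature 1 = 0 when feature 2 is 1.
errors = sum((1 if r[1] == 0 else 0) != r[0] for r in rows)
print(errors, "/", len(rows))  # -> 3 / 7
```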

8 hunch net-2006-11-06-Data Linkage Problems

Introduction: Data linkage is a problem which seems to come up in various applied machine learning problems. I have heard it mentioned in various data mining contexts, but it seems relatively less studied for systemic reasons. A very simple version of the data linkage problem is a cross hospital patient record merge. Suppose a patient (John Doe) is admitted to a hospital (General Health), treated, and released. Later, John Doe is admitted to a second hospital (Health General), treated, and released. Given a large number of records of this sort, it becomes very tempting to try and predict the outcomes of treatments. This is reasonably straightforward as a machine learning problem if there is a shared unique identifier for John Doe used by General Health and Health General along with time stamps. We can merge the records and create examples of the form “Given symptoms and treatment, did the patient come back to a hospital within the next year?” These examples could be fed into a learning algo
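A minimal sketch of the straightforward case this excerpt describes, with hypothetical field names and records: a shared unique identifier plus timestamps makes the merge a simple join, and each visit can then be labeled by whether the patient returned within a year.

```python
# Sketch of the easy case described above: records share a unique
# patient id and a timestamp, so the merge is a simple sort + join.
# Field names and data are hypothetical.
from datetime import date, timedelta

general_health = [
    {"patient_id": 17, "date": date(2005, 3, 1), "treatment": "A"},
]
health_general = [
    {"patient_id": 17, "date": date(2005, 9, 12), "treatment": "B"},
]

visits = sorted(general_health + health_general,
                key=lambda r: (r["patient_id"], r["date"]))

# Label each visit: did the same patient return within the next year?
examples = []
for cur, nxt in zip(visits, visits[1:]):
    came_back = (nxt["patient_id"] == cur["patient_id"]
                 and nxt["date"] - cur["date"] <= timedelta(days=365))
    examples.append((cur["treatment"], came_back))

print(examples)  # -> [('A', True)]
```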

9 hunch net-2006-11-02-2006 NIPS workshops

Introduction: I expect the NIPS 2006 workshops to be quite interesting, and recommend going for anyone interested in machine learning research. (Most or all of the workshop webpages can be found two links deep.)

10 hunch net-2006-10-22-Exemplar programming

Introduction: There are many different abstractions for problem definition and solution. Here are a few examples: Functional programming: a set of functions are defined. The composed execution of these functions yields the solution. Linear programming: a set of constraints and a linear objective function are defined. An LP solver finds the constrained optimum. Quadratic programming: like linear programming, but the language is a little more flexible (and the solution slower). Convex programming: like quadratic programming, but the language is more flexible (and the solutions even slower). Dynamic programming: a recursive definition of the problem is defined and then solved efficiently via caching tricks. SAT programming: a problem is specified as a satisfiability problem involving a conjunction of disjunctions of boolean variables. A general engine attempts to find a good satisfying assignment. For example, Kautz’s blackbox planner. These abstractions have different tradeoffs betw
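As a tiny instance of the linear programming abstraction above (assuming SciPy as the LP solver; the objective and constraints are placeholders):

```python
# A tiny instance of the linear-programming abstraction above:
# minimize a linear objective subject to linear constraints.
from scipy.optimize import linprog

# minimize  -x - 2y   subject to  x + y <= 4,  x <= 3,  x, y >= 0
result = linprog(c=[-1, -2],
                 A_ub=[[1, 1], [1, 0]],
                 b_ub=[4, 3],
                 bounds=[(0, None), (0, None)])
print(result.x)  # constrained optimum at x = 0, y = 4
```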

11 hunch net-2006-10-13-David Pennock starts Oddhead

Introduction: his blog on information markets and other research topics .

12 hunch net-2006-10-08-Incompatibilities between classical confidence intervals and learning.

Introduction: Classical confidence intervals satisfy a theorem of the form: for some data sources D, Pr_{S ~ D}(f(D) > g(S)) > 1 - d, where f is some function of the distribution (such as the mean) and g is some function of the observed sample S. The constraints on D can vary between “independent and identically distributed (IID) samples from a gaussian with an unknown mean” and “IID samples from an arbitrary distribution D”. There are even some confidence intervals which do not require IID samples. Classical confidence intervals often confuse people. They do not say “with high probability, for my observed sample, the bound holds”. Instead, they tell you that if you reason according to the confidence interval in the future (and the constraints on D are satisfied), then you are not often wrong. Restated, they tell you something about what a safe procedure is in a stochastic world, where d is the safety parameter. There are a number of results in theoretical machine learn
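One standard g(S) of exactly this form, for IID samples bounded in [0, 1], is the Hoeffding lower confidence bound: with probability at least 1 - d over the draw of the sample, the true mean f(D) is at least the empirical mean minus sqrt(ln(1/d) / 2n). A minimal sketch:

```python
# A concrete g(S) of the form above: a Hoeffding lower confidence
# bound for the mean of IID samples bounded in [0, 1].
import math

def hoeffding_lower_bound(sample, d):
    """With probability >= 1 - d over the draw of the sample,
    the true mean f(D) is at least this value g(S)."""
    n = len(sample)
    empirical_mean = sum(sample) / n
    return empirical_mean - math.sqrt(math.log(1 / d) / (2 * n))

print(hoeffding_lower_bound([0.2, 0.9, 0.6, 0.8, 0.5], d=0.05))
```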

13 hunch net-2006-10-04-Health of Conferences Wiki

Introduction: Aaron Hertzmann points out the health of conferences wiki, which has a great deal of information about how many different conferences function.

14 hunch net-2006-10-02-$1M Netflix prediction contest

Introduction: Netflix is running a contest to improve recommender prediction systems. A 10% improvement over their current system yields a $1M prize. Failing that, the best smaller improvement yields a $50K prize. This contest looks quite real, and the $50K prize money is almost certainly achievable with a bit of thought. The contest also comes with a dataset which is apparently 2 orders of magnitude larger than any other public recommendation system dataset.

15 hunch net-2006-09-28-Programming Languages for Machine Learning Implementations

Introduction: Machine learning algorithms have a much better chance of being widely adopted if they are implemented in some easy-to-use code. There are several important concerns associated with machine learning which stress programming languages on the ease-of-use vs. speed frontier. Speed: The rate at which data sources are growing seems to be outstripping the rate at which computational power is growing, so it is important that we be able to eke out every bit of computational power. Garbage collected languages (Java, OCaml, Perl, and Python) often have several issues here. Garbage collection often implies that floating point numbers are “boxed”: every float is represented by a pointer to a float. Boxing can cause an order of magnitude slowdown because an extra nonlocalized memory reference is made, and accesses to main memory take many CPU cycles. Garbage collection often implies that considerably more memory is used than is necessary. This has a variable effect. I
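A rough way to observe the boxing cost described above, assuming numpy as a stand-in for an unboxed representation (timings are machine-dependent):

```python
# Rough illustration of the boxing cost described above: a Python
# list stores pointers to boxed floats, while a numpy array stores
# raw doubles contiguously.  Timings vary by machine.
import time
import numpy as np

n = 5_000_000
boxed = [float(i) for i in range(n)]      # list of boxed floats
unboxed = np.arange(n, dtype=np.float64)  # contiguous raw doubles

t0 = time.perf_counter(); s1 = sum(boxed);    t1 = time.perf_counter()
t2 = time.perf_counter(); s2 = unboxed.sum(); t3 = time.perf_counter()

print(f"boxed sum:   {t1 - t0:.3f}s")
print(f"unboxed sum: {t3 - t2:.3f}s")  # typically much faster
```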

16 hunch net-2006-09-19-Luis von Ahn is awarded a MacArthur fellowship.

Introduction: For his work on the subject of human computation, including ESPGame, Peekaboom, and Phetch. The new MacArthur fellows.

17 hunch net-2006-09-18-What is missing for online collaborative research?

Introduction: The internet has recently made the research process much smoother: papers are easy to obtain, citations are easy to follow, and unpublished “tutorials” are often available. Yet, new research fields can look very complicated to outsiders or newcomers. Every paper is like a small piece of an unfinished jigsaw puzzle: to understand just one publication, a researcher without experience in the field will typically have to follow several layers of citations, and many of the papers he encounters have a great deal of repeated information. Furthermore, from one publication to the next, notation and terminology may not be consistent, which can further confuse the reader. But the internet is now proving to be an extremely useful medium for collaboration and knowledge aggregation. Online forums allow users to ask and answer questions and to share ideas. The recent phenomenon of Wikipedia provides a proof-of-concept for the “anyone can edit” system. Can such models be used to facilitate research a

18 hunch net-2006-09-12-Incentive Compatible Reviewing

Introduction: Reviewing is a fairly formal process which is integral to the way academia is run. Given this integral nature, the quality of reviewing is often frustrating. I’ve seen plenty of examples of false statements, misbeliefs, reading what isn’t written, etc…, and I’m sure many other people have as well. Recently, mechanisms like double blind review and author feedback have been introduced to try to make the process more fair and accurate in many machine learning (and related) conferences. My personal experience is that these mechanisms help, especially the author feedback. Nevertheless, some problems remain. The game theory take on reviewing is that the incentive for truthful reviewing isn’t there. Since reviewers are also authors, there are sometimes perverse incentives created and acted upon. (Incidentally, these incentives can be both positive and negative.) Setting up a truthful reviewing system is tricky because there is no final reference truth available in any acce

19 hunch net-2006-09-09-How to solve an NP hard problem in quadratic time

Introduction: This title is a lie, but it is a special lie which has a bit of truth. If n players each play each other, you have a tournament. How do you order the players from weakest to strongest? The standard first attempt is “find the ordering which agrees with the tournament on as many player pairs as possible”. This is called the “minimum feedback arcset” problem in the CS theory literature and it is a well-known NP-hard problem. A basic guarantee holds for the solution to this problem: if there is some “true” intrinsic ordering, and the outcome of the tournament disagrees k times (due to noise for instance), then the output ordering will disagree with the original ordering on at most 2k edges (and no solution can be better). One standard approach to tractably solving an NP-hard problem is to find another algorithm with an approximation guarantee. For example, Don Coppersmith, Lisa Fleischer, and Atri Rudra proved that ordering players according to the number of wins is
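A minimal sketch of the ordering-by-wins heuristic mentioned at the end, on a placeholder 4-player tournament; note that just reading the n-by-n outcome matrix already takes quadratic time, which is presumably the time bound the title alludes to.

```python
# Sketch of the approximation described above: rank players by the
# number of games they won.  beats[i][j] == 1 means player i beat
# player j; the tournament below is placeholder data.
beats = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]

wins = [sum(row) for row in beats]                        # O(n^2) scan
order = sorted(range(len(beats)), key=lambda i: wins[i])  # weakest first
print(order)  # players ordered from weakest to strongest by win count
```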

20 hunch net-2006-09-07-Objective and subjective interpretations of probability

Introduction: An amusing tidbit (reproduced without permission) from Herman Chernoff’s delightful monograph, “Sequential analysis and optimal design”: The use of randomization raises a philosophical question which is articulated by the following probably apocryphal anecdote. The metallurgist told his friend the statistician how he planned to test the effect of heat on the strength of a metal bar by sawing the bar into six pieces. The first two would go into the hot oven, the next two into the medium oven, and the last two into the cool oven. The statistician, horrified, explained how he should randomize to avoid the effect of a possible gradient of strength in the metal bar. The method of randomization was applied, and it turned out that the randomized experiment called for putting the first two pieces into the hot oven, the next two into the medium oven, and the last two into the cool oven. “Obviously, we can’t do that,” said the metallurgist. “On the contrary, you have to do that,” said the st

21 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

22 hunch net-2006-08-18-Report of MLSS 2006 Taipei

23 hunch net-2006-08-10-Precision is not accuracy

24 hunch net-2006-08-07-The Call of the Deep

25 hunch net-2006-08-03-AOL’s data drop

26 hunch net-2006-07-26-Two more UAI papers of interest

27 hunch net-2006-07-25-Upcoming conference

28 hunch net-2006-07-17-A Winner

29 hunch net-2006-07-13-Regression vs. Classification as a Primitive

30 hunch net-2006-07-12-Who is having visa problems reaching US conferences?

31 hunch net-2006-07-11-New Models

32 hunch net-2006-07-09-The Stock Prediction Machine Learning Problem

33 hunch net-2006-07-08-Some recent papers

34 hunch net-2006-07-08-MaxEnt contradicts Bayes Rule?

35 hunch net-2006-07-06-Branch Prediction Competition

36 hunch net-2006-07-05-more icml papers

37 hunch net-2006-06-30-ICML papers

38 hunch net-2006-06-25-Presentation of Proofs is Hard.

39 hunch net-2006-06-24-Online convex optimization at COLT

40 hunch net-2006-06-16-Regularization = Robustness

41 hunch net-2006-06-15-IJCAI is out of season

42 hunch net-2006-06-14-Explorations of Exploration

43 hunch net-2006-06-05-Server Shift, Site Tweaks, Suggestions?

44 hunch net-2006-05-23-What is the best regret transform reduction from multiclass to binary?

45 hunch net-2006-05-21-NIPS paper evaluation criteria

46 hunch net-2006-05-16-The value of the orthodox view of Boosting

47 hunch net-2006-05-08-Big machine learning

48 hunch net-2006-05-05-An ICML reject

49 hunch net-2006-05-01-A conversation between Theo and Pat

50 hunch net-2006-04-30-John Langford –> Yahoo Research, NY

51 hunch net-2006-04-27-Conferences, Workshops, and Tutorials

52 hunch net-2006-04-17-Rexa is live

53 hunch net-2006-04-14-JMLR is a success

54 hunch net-2006-04-09-Progress in Machine Translation

55 hunch net-2006-04-06-Bounds greater than 1

56 hunch net-2006-04-05-What is state?

57 hunch net-2006-04-02-Mad (Neuro)science

58 hunch net-2006-03-27-Gradients everywhere

59 hunch net-2006-03-24-NLPers

60 hunch net-2006-03-23-The Approximation Argument

61 hunch net-2006-03-17-Multitask learning is Black-Boxable

62 hunch net-2006-03-12-Online learning or online preservation of learning?

63 hunch net-2006-03-09-Use of Notation

64 hunch net-2006-03-05-“Structural” Learning

65 hunch net-2006-03-02-Why do people count for learning?

66 hunch net-2006-02-27-The Peekaboom Dataset

67 hunch net-2006-02-24-A Fundamentalist Organization of Machine Learning

68 hunch net-2006-02-18-Multiplication of Learned Probabilities is Dangerous

69 hunch net-2006-02-11-Yahoo’s Learning Problems.

70 hunch net-2006-02-07-Pittsburgh Mind Reading Competition

71 hunch net-2006-02-04-Research Budget Changes

72 hunch net-2006-02-02-Introspectionism as a Disease

73 hunch net-2006-01-30-Should the Input Representation be a Vector?

74 hunch net-2006-01-25-1 year

75 hunch net-2006-01-23-On Coding via Mutual Information & Bayes Nets

76 hunch net-2006-01-18-Is Multitask Learning Black-Boxable?

77 hunch net-2006-01-13-Benchmarks for RL

78 hunch net-2006-01-08-Debugging Your Brain

79 hunch net-2006-01-06-MLTV