
hunch_net 2005 knowledge graph




blog list:

1 hunch net-2005-12-29-Deadline Season

Introduction: Many different paper deadlines are coming up soon so I made a little reference table. Out of curiosity, I also computed the interval between submission deadline and conference.

Conference  Location      Date          Deadline                Interval (days)
COLT        Pittsburgh    June 22-25    January 21              152
ICML        Pittsburgh    June 26-28    January 30/February 6   140
UAI         MIT           July 13-16    March 9/March 16        119
AAAI        Boston        July 16-20    February 16/21          145
KDD         Philadelphia  August 23-26  March 3/March 10        166

It looks like the northeastern US is the big winner as far as location this year.
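
The interval column is simple date arithmetic. A minimal sketch of how it can be reproduced, assuming 2006 dates and, where two deadlines are listed, the later one (the dates are read off the table above; there is no code in the original post):

    from datetime import date

    deadlines_and_starts = {
        "COLT": (date(2006, 1, 21), date(2006, 6, 22)),
        "ICML": (date(2006, 2, 6),  date(2006, 6, 26)),
        "UAI":  (date(2006, 3, 16), date(2006, 7, 13)),
        "AAAI": (date(2006, 2, 21), date(2006, 7, 16)),
        "KDD":  (date(2006, 3, 10), date(2006, 8, 23)),
    }
    for conf, (deadline, start) in deadlines_and_starts.items():
        # prints 152, 140, 119, 145, 166 respectively
        print(conf, (start - deadline).days)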

2 hunch net-2005-12-28-Yet more nips thoughts

Introduction: I only managed to make it out to the NIPS workshops this year, so I’ll give my comments on what I saw there. The Learning and Robotics workshop lives again. I hope it continues and gets more high quality papers in the future. The most interesting talk for me was Larry Jackel’s on the LAGR program (see John’s previous post on said program). I got some ideas as to what progress has been made. Larry really explained the types of benchmarks and the tradeoffs that had to be made to make the goals achievable but challenging. Hal Daume gave a very interesting talk about structured prediction using RL techniques, something near and dear to my own heart. He achieved rather impressive results using only a very greedy search. The non-parametric Bayes workshop was great. I enjoyed the entire morning session I spent there, and particularly (the usually desultory) discussion periods. One interesting topic was the Gibbs/Variational inference divide. I won’t try to summarize espe

3 hunch net-2005-12-27-Automated Labeling

Introduction: One of the common trends in machine learning has been an emphasis on the use of unlabeled data. The argument goes something like “there aren’t many labeled web pages out there, but there are a huge number of web pages, so we must find a way to take advantage of them.” There are several standard approaches for doing this: Unsupervised Learning. You use only unlabeled data. In a typical application, you cluster the data and hope that the clusters somehow correspond to what you care about. Semisupervised Learning. You use both unlabeled and labeled data to build a predictor. The unlabeled data influences the learned predictor in some way. Active Learning. You have unlabeled data and access to a labeling oracle. You interactively choose which examples to label so as to optimize prediction accuracy. It seems there is a fourth approach worth serious investigation—automated labeling. The approach goes as follows: Identify some subset of observed values to predict
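
The active learning loop described above (an unlabeled pool, a labeling oracle, interactive choice of what to label) can be sketched as follows. This is a minimal, generic illustration using pool-based uncertainty sampling; `X_pool`, `label_oracle`, the seed-set size, and the query strategy are hypothetical choices for illustration, not details from the post, and the random seed set is assumed to contain both classes of a binary problem.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def active_learn(X_pool, label_oracle, n_queries=50, seed=0):
        rng = np.random.default_rng(seed)
        # Start from a small randomly labeled seed set.
        labeled = list(rng.choice(len(X_pool), size=5, replace=False))
        labels = {i: label_oracle(i) for i in labeled}
        model = LogisticRegression(max_iter=1000)
        for _ in range(n_queries):
            model.fit(X_pool[labeled], [labels[i] for i in labeled])
            # Query the pool example the current model is least certain about.
            margin = np.abs(model.predict_proba(X_pool)[:, 1] - 0.5)
            margin[labeled] = np.inf  # never re-query an already labeled point
            i = int(np.argmin(margin))
            labels[i] = label_oracle(i)
            labeled.append(i)
        return model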

4 hunch net-2005-12-22-Yes, I am applying

Introduction: Every year about now hundreds of applicants apply for a research/teaching job with the timing governed by the university recruitment schedule. This time, it’s my turn—the hat’s in the ring, I am a contender, etc… What I have heard is that this year is good in both directions—both an increased supply and an increased demand for machine learning expertise. I consider this post a bit of an abuse as it is neither about general research nor machine learning. Please forgive me this once. My hope is that I will learn about new places interested in funding basic research—it’s easy to imagine that I have overlooked possibilities. I am not dogmatic about where I end up in any particular way. Several earlier posts detail what I think of as a good research environment, so I will avoid a repeat. A few more details seem important: Application. There is often a tension between basic research and immediate application. This tension is not as strong as might be expected in my case. As

5 hunch net-2005-12-17-Workshops as Franchise Conferences

Introduction: Founding a successful new conference is extraordinarily difficult. As a conference founder, you must manage to attract a significant number of good papers—enough to entice the participants into participating next year and (generally) to grow the conference. For someone choosing to participate in a new conference, there is a very significant decision to make: do you send a paper to some new conference with no guarantee that the conference will work out? Or do you send it to another (possibly less related) conference that you are sure will work? The conference founding problem is a joint agreement problem with a very significant barrier. Workshops are a way around this problem, and workshops attached to conferences are a particularly effective means for this. A workshop at a conference is sure to have people available to speak and attend and is sure to have a large audience available. Presenting work at a workshop is not generally exclusive: it can also be presented at a confe

6 hunch net-2005-12-14-More NIPS Papers II

Introduction: I thought this was a very good NIPS with many excellent papers. The following are a few NIPS papers which I liked and I hope to study more carefully when I get the chance. The list is not exhaustive and in no particular order… Preconditioner Approximations for Probabilistic Graphical Models. Pradeep Ravikumar and John Lafferty. I thought the use of preconditioner methods from solving linear systems in the context of approximate inference was novel and interesting. The results look good and I’d like to understand the limitations. Rodeo: Sparse nonparametric regression in high dimensions. John Lafferty and Larry Wasserman. A very interesting approach to feature selection in nonparametric regression from a frequentist framework. The use of lengthscale variables in each dimension reminds me a lot of ‘Automatic Relevance Determination’ in Gaussian process regression — it would be interesting to compare Rodeo to ARD in GPs. Interpolating between types and tokens by estimating

7 hunch net-2005-12-11-More NIPS Papers

Introduction: Let me add to John’s post with a few of my own favourites from this year’s conference. First, let me say that Sanjoy’s talk, Coarse Sample Complexity Bounds for Active Learning was also one of my favourites, as was the Forgettron paper. I also really enjoyed the last third of Christos’ talk on the complexity of finding Nash equilibria. And, speaking of tagging, I think the U.Mass Citeseer replacement system Rexa from the demo track is very cool. Finally, let me add my recommendations for specific papers: Z. Ghahramani, K. Heller: Bayesian Sets [no preprint] (A very elegant probabilistic information retrieval style model of which objects are “most like” a given subset of objects.) T. Griffiths, Z. Ghahramani: Infinite Latent Feature Models and the Indian Buffet Process [preprint] (A Dirichlet style prior over infinite binary matrices with beautiful exchangeability properties.) K. Weinberger, J. Blitzer, L. Saul: Distance Metric Lea

8 hunch net-2005-12-09-Some NIPS papers

Introduction: Here is a set of papers that I found interesting (and why). A PAC-Bayes approach to the Set Covering Machine improves the set covering machine. The set covering machine approach is a new way to do classification characterized by a very close connection between theory and algorithm. At this point, the approach seems to be competing well with SVMs in about all dimensions: similar computational speed, similar accuracy, stronger learning theory guarantees, more general information source (a kernel has strictly more structure than a metric), and more sparsity. Developing a classification algorithm is not very easy, but the results so far are encouraging. Off-Road Obstacle Avoidance through End-to-End Learning and Learning Depth from Single Monocular Images both effectively showed that depth information can be predicted from camera images (using notably different techniques). This ability is strongly enabling because cameras are cheap, tiny, light, and potentially provide lo

9 hunch net-2005-12-09-Machine Learning Thoughts

Introduction: I added a link to Olivier Bousquet’s machine learning thoughts blog. Several of the posts may be of interest.

10 hunch net-2005-12-07-Is the Google way the way for machine learning?

Introduction: Urs Hoelzle from Google gave an invited presentation at NIPS. In the presentation, he strongly advocates interacting with data in a particular scalable manner which is something like the following: Make a cluster of machines. Build a unified filesystem. (Google uses GFS, but NFS or other approaches work reasonably well for smaller clusters.) Interact with data via MapReduce. Creating a cluster of machines is, by this point, relatively straightforward. Unified filesystems are a little bit tricky—GFS is, by design, capable of essentially unlimited throughput to disk. NFS can bottleneck because all of the data has to move through one machine. Nevertheless, this may not be a limiting factor for smaller clusters. MapReduce is a programming paradigm. Essentially, it is a combination of a data element transform (map) and an aggregator/selector (reduce). These operations are highly parallelizable and the claim is that they support the forms of data interacti
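
The map/reduce pattern described above can be illustrated with a toy, single-machine word count. This is only a sketch of the programming paradigm, not Google's implementation or API; in a real MapReduce the map and reduce phases run in parallel across the cluster.

    from collections import defaultdict

    def map_fn(record):
        # Transform one data element into (key, value) pairs.
        for word in record.split():
            yield word, 1

    def reduce_fn(key, values):
        # Aggregate all values that share a key.
        return key, sum(values)

    def mapreduce(records, map_fn, reduce_fn):
        groups = defaultdict(list)
        for record in records:              # map phase
            for key, value in map_fn(record):
                groups[key].append(value)
        return dict(reduce_fn(k, vs) for k, vs in groups.items())  # reduce phase

    print(mapreduce(["the cat sat", "the dog sat"], map_fn, reduce_fn))
    # {'the': 2, 'cat': 1, 'sat': 2, 'dog': 1}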

11 hunch net-2005-12-04-Watchword: model

Introduction: In everyday use, a model is something which explains the behavior of some system, hopefully at the level where some alteration of the model predicts some alteration of the real-world system. In machine learning “model” has several variant definitions. Everyday. The common definition is sometimes used. Parameterized. Sometimes “model” is short-hand for “parameterized model”. Here, it refers to a model with unspecified free parameters. In the Bayesian learning approach, you typically have a prior over (everyday) models. Predictive. Even further from everyday use is the predictive model. Examples of this are “my model is a decision tree” or “my model is a support vector machine”. Here, there is no real sense in which an SVM explains the underlying process. For example, an SVM tells us nothing in particular about how alterations to the real-world system would create a change. Which definition is being used at any particular time is important information. For examp

12 hunch net-2005-12-01-The Webscience Future

Introduction: The internet has significantly affected the way we do research, but its capabilities have not yet been fully realized. First, let’s acknowledge some known effects. Self-publishing By default, all researchers in machine learning (and more generally computer science and physics) place their papers online for anyone to download. The exact mechanism differs—physicists tend to use a central repository (Arxiv) while computer scientists tend to place the papers on their webpage. Arxiv has been slowly growing in subject breadth so it is now sometimes used by computer scientists. Collaboration Email has enabled working remotely with coauthors. This has allowed collaborations which would not otherwise have been possible and generally speeds research. Now, let’s look at attempts to go further. Blogs (like this one) allow public discussion about topics which are not easily categorized as “a new idea in machine learning” (like this topic). Organization of some subfield

13 hunch net-2005-11-28-A question of quantification

Introduction: This is about methods for phrasing and thinking about the scope of some theorems in learning theory. The basic claim is that there are several different ways of quantifying the scope which sound different yet are essentially the same. For all sequences of examples. This is the standard quantification in online learning analysis. Standard theorems would say something like “for all sequences of predictions by experts, the algorithm A will perform almost as well as the best expert.” For all training sets. This is the standard quantification for boosting analysis such as adaboost or multiclass boosting. Standard theorems have the form “for all training sets the error rate inequalities … hold”. For all distributions over examples. This is the one that we have been using for reductions analysis. Standard theorem statements have the form “For all distributions over examples, the error rate inequalities … hold”. It is not quite true that each of these is equivalent. F
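
As a rough schematic only (my own notation, not the exact theorem statements the post refers to), the three quantifier patterns look like:

    % For all sequences (online learning, experts setting):
    \forall (x_1,y_1),\dots,(x_T,y_T):\quad
      \sum_{t=1}^{T} \ell(A(x_t),y_t) \;\le\; \min_{e \in E} \sum_{t=1}^{T} \ell(e(x_t),y_t) + R(T)

    % For all training sets (boosting-style analysis):
    \forall S:\quad \mathrm{err}_S(h_{\mathrm{boost}}) \;\le\; f(\text{weak-learner performance on } S)

    % For all distributions over examples (reductions-style analysis):
    \forall D:\quad \mathrm{err}_D(h) \;\le\; g(\mathrm{err}_D(\text{induced subproblems}))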

14 hunch net-2005-11-26-The Design of an Optimal Research Environment

Introduction: How do you create an optimal environment for research? Here are some essential ingredients that I see. Stability. University-based research is relatively good at this. On any particular day, researchers face choices in what they will work on. A very common tradeoff is between: easy & small, or difficult & big. For researchers without stability, the ‘easy small’ option wins. This is often “ok”—a series of incremental improvements on the state of the art can add up to something very beneficial. However, it misses one of the big potentials of research: finding entirely new and better ways of doing things. Stability comes in many forms. The prototypical example is tenure at a university—a tenured professor is almost impossible to fire, which means that the professor has the freedom to consider far horizon activities. An iron-clad guarantee of a paycheck is not necessary—industrial research labs have succeeded well with research positions of indefinite duration. AT&T rese

15 hunch net-2005-11-16-The Everything Ensemble Edge

Introduction: Rich Caruana, Alexandru Niculescu, Geoff Crew, and Alex Ksikes have done a lot of empirical testing which shows that using all methods to make a prediction is more powerful than using any single method. This is in rough agreement with the Bayesian way of solving problems, but based upon a different (essentially empirical) motivation. A rough summary is: Take all of {decision trees, boosted decision trees, bagged decision trees, boosted decision stumps, K nearest neighbors, neural networks, SVM} with all reasonable parameter settings. Run the methods on each of 8 problems with a large test set, calibrating margins using either sigmoid fitting or isotonic regression. For each loss of {accuracy, area under the ROC curve, cross entropy, squared error, etc…} evaluate the average performance of the method. A series of conclusions can be drawn from the observations. (Calibrated) boosted decision trees appear to perform best in general, although support v
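
The calibration step mentioned above can be sketched with scikit-learn (my choice of library for illustration, not necessarily what the study used): wrap a boosted-tree classifier with isotonic (or sigmoid) calibration and compare cross-entropy before and after on held-out data.

    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import log_loss
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for one of the study's problems.
    X, y = make_classification(n_samples=4000, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    raw = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
    cal = CalibratedClassifierCV(GradientBoostingClassifier(random_state=0),
                                 method="isotonic", cv=3).fit(X_tr, y_tr)

    print("raw cross-entropy:       ", log_loss(y_te, raw.predict_proba(X_te)))
    print("calibrated cross-entropy:", log_loss(y_te, cal.predict_proba(X_te)))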

16 hunch net-2005-11-16-MLSS 2006

Introduction: There will be two machine learning summer schools in 2006. One is in Canberra, Australia from February 6 to February 17 (Aussie summer). The webpage is fully ‘live’ so you should actively consider it now. The other is in Taipei, Taiwan from July 24 to August 4. This one is still in the planning phase, but that should be settled soon. Attending an MLSS is probably the quickest and easiest way to bootstrap yourself into a reasonable initial understanding of the field of machine learning.

17 hunch net-2005-11-07-Prediction Competitions

Introduction: There are two prediction competitions currently in the air. The Performance Prediction Challenge by Isabelle Guyon. Good entries minimize a weighted 0/1 loss + the difference between a prediction of this loss and the observed truth on 5 datasets. Isabelle tells me all of the problems are “real world” and the test datasets are large enough (17K minimum) that the winner should be well determined by ability rather than luck. This is due March 1. The Predictive Uncertainty Challenge by Gavin Cawley. Good entries minimize log loss on real valued output variables for one synthetic and 3 “real” datasets related to atmospheric prediction. The use of log loss (which can be infinite and hence is never convergent) and smaller test sets of size 1K to 7K examples makes the winner of this contest more luck dependent. Nevertheless, the contest may be of some interest particularly to the branch of learning (typically Bayes learning) which prefers to optimize log loss. May the
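
For reference, log loss over n examples with predictive probabilities (or densities, for real-valued outputs) p_i is, in standard notation:

    \mathrm{logloss} \;=\; -\frac{1}{n}\sum_{i=1}^{n} \log p_i(y_i)

A single prediction assigning probability near zero to the observed outcome drives this toward infinity, which is part of why the post judges the second contest more luck dependent.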

18 hunch net-2005-11-05-The design of a computing cluster

Introduction: This is about the design of a computing cluster from the viewpoint of applied machine learning using current technology. We just built a small one at TTI so this is some evidence of what is feasible and thoughts about the design choices. Architecture There are several architectural choices. AMD Athlon64 based system. This seems to have the cheapest bang/buck. Maximum RAM is typically 2-3GB. AMD Opteron based system. Opterons provide the additional capability to buy an SMP motherboard with two chips, and the motherboards often support 16GB of RAM. The RAM is also the more expensive error correcting type. Intel PIV or Xeon based system. The PIV and Xeon based systems are the Intel analog of the above 2. Due to architectural design reasons, these chips tend to run a bit hotter and be a bit more expensive. Dual core chips. Both Intel and AMD have chips that actually have 2 processors embedded in them. In the end, we decided to go with option (2). Roughly speaking,

19 hunch net-2005-11-02-Progress in Active Learning

Introduction: Several bits of progress have been made since Sanjoy pointed out the significant lack of theoretical understanding of active learning. This is an update on the progress I know of. As a refresher, active learning as meant here is: There is a source of unlabeled data. There is an oracle from which labels can be requested for unlabeled data produced by the source. The goal is to perform well with minimal use of the oracle. Here is what I’ve learned: Sanjoy has developed sufficient and semi-necessary conditions for active learning given the assumptions of IID data and “realizability” (that one of the classifiers is a correct classifier). Nina, Alina, and I developed an algorithm for active learning relying on only the assumption of IID data. A draft is here. Nicolo, Claudio, and Luca showed that it is possible to do active learning in an entirely adversarial setting for linear threshold classifiers here. This was published a year or two ago and I r

20 hunch net-2005-10-26-Fallback Analysis is a Secret to Useful Algorithms

Introduction: The ideal of theoretical algorithm analysis is to construct an algorithm with accompanying optimality theorems proving that it is a useful algorithm. This ideal often fails, particularly for learning algorithms and theory. The general form of a theorem is: “If preconditions, then postconditions.” When we design learning algorithms it is very common to come up with precondition assumptions such as “the data is IID”, “the learning problem is drawn from a known distribution over learning problems”, or “there is a perfect classifier”. All of these example preconditions can be false for real-world problems in ways that are not easily detectable. This means that algorithms derived and justified by these very common forms of analysis may be prone to catastrophic failure in routine (mis)application. We can hope for better. Several different kinds of learning algorithm analysis have been developed, some of which have fewer preconditions. Simply demanding that these forms of analysi

21 hunch net-2005-10-20-Machine Learning in the News

22 hunch net-2005-10-19-Workshop: Atomic Learning

23 hunch net-2005-10-16-Complexity: It’s all in your head

24 hunch net-2005-10-13-Site tweak

25 hunch net-2005-10-12-The unrealized potential of the research lab

26 hunch net-2005-10-10-Predictive Search is Coming

27 hunch net-2005-10-08-We have a winner

28 hunch net-2005-10-07-On-line learning of regular decision rules

29 hunch net-2005-10-03-Not ICML

30 hunch net-2005-09-30-Research in conferences

31 hunch net-2005-09-26-Prediction Bounds as the Mathematics of Science

32 hunch net-2005-09-20-Workshop Proposal: Atomic Learning

33 hunch net-2005-09-19-NIPS Workshops

34 hunch net-2005-09-14-The Predictionist Viewpoint

35 hunch net-2005-09-12-Fast Gradient Descent

36 hunch net-2005-09-10-“Failure” is an option

37 hunch net-2005-09-08-Online Learning as the Mathematics of Accountability

38 hunch net-2005-09-06-A link

39 hunch net-2005-09-05-Site Update

40 hunch net-2005-09-04-Science in the Government

41 hunch net-2005-08-23-(Dis)similarities between academia and open source programmers

42 hunch net-2005-08-22-Do you believe in induction?

43 hunch net-2005-08-18-SVM Adaptability

44 hunch net-2005-08-11-Why Manifold-Based Dimension Reduction Techniques?

45 hunch net-2005-08-08-Apprenticeship Reinforcement Learning for Control

46 hunch net-2005-08-04-Why Reinforcement Learning is Important

47 hunch net-2005-08-01-Peekaboom

48 hunch net-2005-07-27-Not goal metrics

49 hunch net-2005-07-23-Interesting papers at ACL

50 hunch net-2005-07-21-Six Months

51 hunch net-2005-07-14-What Learning Theory might do

52 hunch net-2005-07-13-Text Entailment at AAAI

53 hunch net-2005-07-13-“Sister Conference” presentations

54 hunch net-2005-07-11-AAAI blog

55 hunch net-2005-07-10-Thinking the Unthought

56 hunch net-2005-07-07-The Limits of Learning Theory

57 hunch net-2005-07-04-The Health of COLT

58 hunch net-2005-07-01-The Role of Impromptu Talks

59 hunch net-2005-06-29-Not EM for clustering at COLT

60 hunch net-2005-06-28-The cross validation problem: cash reward

61 hunch net-2005-06-28-A COLT paper

62 hunch net-2005-06-22-Languages of Learning

63 hunch net-2005-06-18-Lower Bounds for Learning Reductions

64 hunch net-2005-06-17-Reopening RL->Classification

65 hunch net-2005-06-13-Wikis for Summer Schools and Workshops

66 hunch net-2005-06-10-Workshops are not Conferences

67 hunch net-2005-06-08-Question: “When is the right time to insert the loss function?”

68 hunch net-2005-06-06-Exact Online Learning for Classification

69 hunch net-2005-05-29-Maximum Margin Mismatch?

70 hunch net-2005-05-29-Bad ideas

71 hunch net-2005-05-28-Running A Machine Learning Summer School

72 hunch net-2005-05-21-What is the right form of modularity in structured prediction?

73 hunch net-2005-05-17-A Short Guide to PhD Graduate Study

74 hunch net-2005-05-16-Regret minimizing vs error limiting reductions

75 hunch net-2005-05-14-NIPS

76 hunch net-2005-05-12-Math on the Web

77 hunch net-2005-05-11-Visa Casualties

78 hunch net-2005-05-10-Learning Reductions are Reductionist

79 hunch net-2005-05-06-Don’t mix the solution into the problem

80 hunch net-2005-05-03-Conference attendance is mandatory

81 hunch net-2005-05-02-Reviewing techniques for conferences

82 hunch net-2005-04-28-Science Fiction and Research

83 hunch net-2005-04-27-DARPA project: LAGR

84 hunch net-2005-04-26-To calibrate or not?

85 hunch net-2005-04-25-Embeddings: what are they good for?

86 hunch net-2005-04-23-Advantages and Disadvantages of Bayesian Learning

87 hunch net-2005-04-22-New Blog: [Lowerbounds,Upperbounds]

88 hunch net-2005-04-21-Dynamic Programming Generalizations and Their Use

89 hunch net-2005-04-16-Which Assumptions are Reasonable?

90 hunch net-2005-04-14-Families of Learning Theory Statements

91 hunch net-2005-04-10-Is the Goal Understanding or Prediction?

92 hunch net-2005-04-08-Fast SVMs

93 hunch net-2005-04-06-Structured Regret Minimization

94 hunch net-2005-04-04-Grounds for Rejection

95 hunch net-2005-04-01-The Producer-Consumer Model of Research

96 hunch net-2005-04-01-Basic computer science research takes a hit

97 hunch net-2005-03-30-What can Type Theory teach us about Machine Learning?

98 hunch net-2005-03-29-Academic Mechanism Design

99 hunch net-2005-03-28-Open Problems for Colt

100 hunch net-2005-03-24-The Role of Workshops

101 hunch net-2005-03-22-Active learning

102 hunch net-2005-03-21-Research Styles in Machine Learning

103 hunch net-2005-03-18-Binomial Weighting

104 hunch net-2005-03-17-Going all the Way, Sometimes

105 hunch net-2005-03-15-The State of Tight Bounds

106 hunch net-2005-03-13-Avoiding Bad Reviewing

107 hunch net-2005-03-10-Breaking Abstractions

108 hunch net-2005-03-09-Bad Reviewing

109 hunch net-2005-03-08-Fast Physics for Learning

110 hunch net-2005-03-05-Funding Research

111 hunch net-2005-03-04-The Big O and Constants in Learning

112 hunch net-2005-03-02-Prior, “Prior” and Bias

113 hunch net-2005-02-28-Regularization

114 hunch net-2005-02-27-Antilearning: When proximity goes bad

115 hunch net-2005-02-26-Problem: Reductions and Relative Ranking Metrics

116 hunch net-2005-02-25-Why Papers?

117 hunch net-2005-02-25-Solution: Reinforcement Learning with Classification

118 hunch net-2005-02-25-Problem: Online Learning

119 hunch net-2005-02-23-Problem: Reinforcement Learning with Classification

120 hunch net-2005-02-21-Problem: Cross Validation

121 hunch net-2005-02-20-At One Month

122 hunch net-2005-02-19-Machine learning reading groups

123 hunch net-2005-02-19-Loss Functions for Discriminative Training of Energy-Based Models

124 hunch net-2005-02-18-What it means to do research.

125 hunch net-2005-02-17-Learning Research Programs

126 hunch net-2005-02-15-ESPgame and image labeling

127 hunch net-2005-02-14-Clever Methods of Overfitting

128 hunch net-2005-02-12-ROC vs. Accuracy vs. AROC

129 hunch net-2005-02-10-Conferences, Dates, Locations

130 hunch net-2005-02-09-Intuitions from applied learning

131 hunch net-2005-02-08-Some Links

132 hunch net-2005-02-07-The State of the Reduction

133 hunch net-2005-02-04-JMLG

134 hunch net-2005-02-03-Learning Theory, by assumption

135 hunch net-2005-02-02-Paper Deadlines

136 hunch net-2005-02-02-Kolmogorov Complexity and Googling

137 hunch net-2005-02-01-Watchword: Loss

138 hunch net-2005-02-01-NIPS: Online Bayes

139 hunch net-2005-01-31-Watchword: Assumption

140 hunch net-2005-01-27-Learning Complete Problems

141 hunch net-2005-01-26-Watchword: Probability

142 hunch net-2005-01-26-Summer Schools

143 hunch net-2005-01-24-The Humanloop Spectrum of Machine Learning

144 hunch net-2005-01-24-Holy grails of machine learning?

145 hunch net-2005-01-19-Why I decided to run a weblog.