hunch_net-2005-97: Interesting papers at ACL (2005-07-23)
A recent discussion indicated that one goal of this blog might be to allow people to post comments about recent papers that they liked. I think this could potentially be very useful, especially for those with diverse interests but only finite time to read through conference proceedings. ACL 2005 recently completed, and here are four papers from that conference that I thought were either good or perhaps of interest to a machine learning audience.

David Chiang, A Hierarchical Phrase-Based Model for Statistical Machine Translation. (Best paper award.) This paper takes the standard phrase-based MT model that is popular in our field (basically, translate a sentence by individually translating phrases and reordering them according to a complicated statistical model) and extends it to take into account hierarchy in phrases, so that you can learn things like “X ‘s Y” -> “Y de X” in Chinese, where X and Y are arbitrary phrases.
This takes a step toward linguistic syntax for MT, which our group is working strongly on, but doesn’t require any linguists to sit down and write out grammars or parse sentences.
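To make the reordering idea concrete, here is a minimal sketch of applying one such hierarchical rule. The phrase table and translations below are hypothetical placeholders, and this is not Chiang’s decoder (which induces rules automatically and decodes with a weighted synchronous CFG); it only illustrates how a rule with phrase variables recursively reorders its sub-phrases.

```python
# Toy illustration of one hierarchical phrase rule: "X 's Y" -> "Y de X".
# X and Y are themselves phrases, translated recursively. Everything here
# (the rule, the table, the "translations") is a made-up placeholder.

def translate(phrase: str, phrase_table: dict) -> str:
    """Translate a phrase, applying the hierarchical rule when it matches."""
    if " 's " in phrase:
        x, y = phrase.split(" 's ", 1)
        # Reorder per the rule: translated Y, then "de", then translated X.
        return f"{translate(y, phrase_table)} de {translate(x, phrase_table)}"
    # Otherwise fall back to a flat phrase-table lookup.
    return phrase_table.get(phrase, phrase)

table = {"the boss": "LAOBAN", "car": "QICHE"}  # hypothetical entries
print(translate("the boss 's car", table))  # -> "QICHE de LAOBAN"
```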
Rie Kubota Ando and Tong Zhang, A High-Performance Semi-Supervised Learning Method for Text Chunking. This is more of a machine learning style paper, where they improve a sequence labeling task by augmenting it with models from related tasks for which data is free. For example, I might train a model that, given a context with a missing word, will predict the word (e.g., “The ____ gave a speech” might want you to insert “president”). By doing so, you can use these other models to give additional useful information to your main task.
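A hedged sketch of where the free data comes from: raw text alone yields as many supervised “predict the masked word” problems as you like. Ando and Zhang’s actual method then learns structure shared across many such auxiliary problems and transfers it to the main labeling task; the snippet below only shows how the auxiliary examples are manufactured.

```python
# Manufacture auxiliary training examples from unlabeled text: for each
# position, mask the word and treat predicting it as a supervised problem.

def auxiliary_examples(sentence):
    """Yield (context-with-blank, target-word) pairs from one sentence."""
    for i, word in enumerate(sentence):
        context = sentence[:i] + ["____"] + sentence[i + 1:]
        yield " ".join(context), word

for ctx, target in auxiliary_examples("The president gave a speech".split()):
    print(f"{ctx!r} -> {target!r}")
# Includes 'The ____ gave a speech' -> 'president', among others.
```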
Noah Smith and Jason Eisner, Contrastive Estimation: Training Log-Linear Models on Unlabeled Data. This paper talks about training sequence labeling models in an unsupervised fashion, basically by contrasting what the model does on the correct string with what the model does on a corrupted version of the string. They get significantly better results than just by using EM in an HMM, and the idea is pretty nice.
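A toy sketch of the objective, assuming corruption means transposing adjacent words (one of the neighborhoods the paper considers): probability mass is pushed from a small neighborhood of corrupted strings onto the observed one, rather than normalizing over all possible strings. The `score` function is an arbitrary stand-in for the model’s unnormalized log score.

```python
import math

# Toy contrastive-estimation objective. The neighborhood is "all single
# adjacent-word transpositions" plus the string itself.

def neighborhood(words):
    """All corruptions of `words` obtained by swapping one adjacent pair."""
    out = []
    for i in range(len(words) - 1):
        w = list(words)
        w[i], w[i + 1] = w[i + 1], w[i]
        out.append(w)
    return out

def contrastive_log_likelihood(words, score):
    """log [ exp(score(x)) / sum over x and its corrupted neighbors ]."""
    candidates = [list(words)] + neighborhood(words)
    log_z = math.log(sum(math.exp(score(w)) for w in candidates))
    return score(list(words)) - log_z

# Dummy score that prefers "a" to precede "b"; maximizing the objective
# would reward orderings the corrupted neighbors fail to match.
score = lambda w: 1.0 if w.index("a") < w.index("b") else 0.0
print(contrastive_log_likelihood(["a", "b", "c"], score))
```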
Patrick Pantel, Inducing Ontological Co-occurrence Vectors. This is a pretty neat idea (though I’m biased; Patrick is a friend) where one attempts to come up with feature vectors that describe nodes in a semantic hierarchy (ontology) that could enable you to figure out where to insert new words that are not in your ontology. The results are pretty good, and the method is fairly simple; I’d imagine that a more complex model/learning framework could improve the model even further.
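To illustrate the intended use (this is not Pantel’s induction algorithm, just the downstream placement step, with a made-up cosine-similarity rule and toy sparse vectors): once each ontology node has a co-occurrence vector, a new word can be attached to the node its own vector most resembles.

```python
import math

# Attach a new word to the ontology node whose feature vector it most
# resembles. Vectors are sparse dicts mapping context features to counts;
# all data here is invented for illustration.

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def attach(word_vec, node_vecs):
    """Return the ontology node most similar to the new word's vector."""
    return max(node_vecs, key=lambda n: cosine(word_vec, node_vecs[n]))

nodes = {"vehicle": {"drive": 3, "wheel": 2}, "fruit": {"eat": 3, "ripe": 2}}
print(attach({"drive": 1, "wheel": 1}, nodes))  # -> "vehicle"
```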