hunch_net-2005-97 knowledge-graph by maker-knowledge-mining

97 hunch net-2005-07-23-Interesting papers at ACL


meta info for this blog

Source: html

Introduction: A recent discussion indicated that one goal of this blog might be to allow people to post comments about recent papers that they liked. I think this could potentially be very useful, especially for those with diverse interests but only finite time to read through conference proceedings. ACL 2005 recently completed, and here are four papers from that conference that I thought were either good or perhaps of interest to a machine learning audience. David Chiang, A Hierarchical Phrase-Based Model for Statistical Machine Translation. (Best paper award.) This paper takes the standard phrase-based MT model that is popular in our field (basically, translate a sentence by individually translating phrases and reordering them according to a complicated statistical model) and extends it to take into account hierarchy in phrases, so that you can learn things like “X ‘s Y” -> “Y de X” in Chinese, where X and Y are arbitrary phrases. This takes a step toward linguistic syntax for MT, which our group is working strongly on, but doesn’t require any linguists to sit down and write out grammars or parse sentences. …
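To make the hierarchical-rule idea concrete, here is a toy sketch of applying one synchronous rule with phrase variables. This is not Chiang's actual system; the rule follows the “X ‘s Y” -> “Y de X” example above, and the phrase translations (ZONGTONG, YANJIANG) are invented placeholders.

```
# Toy sketch of the hierarchical rule "X 's Y" -> "Y de X" from the post:
# translate the sub-phrases independently, then reorder them as the rule
# dictates. Phrase translations are invented placeholders, not real data.

phrase_table = {"the president": "ZONGTONG", "speech": "YANJIANG"}

def translate_possessive(x: str, y: str) -> str:
    """Apply the synchronous rule <X 's Y  ->  Y de X>."""
    return f"{phrase_table[y]} de {phrase_table[x]}"

# "the president 's speech" -> "YANJIANG de ZONGTONG"
print(translate_possessive("the president", "speech"))
```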


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 A recent discussion indicated that one goal of this blog might be to allow people to post comments about recent papers that they liked. [sent-1, score-0.326]

2 I think this could potentially be very useful, especially for those with diverse interests but only finite time to read through conference proceedings. [sent-2, score-0.085]

3 ACL 2005 recently completed, and here are four papers from that conference that I thought were either good or perhaps of interest to a machine learning audience. [sent-3, score-0.092]

4 This takes a step toward linguistic syntax for MT, which our group is working strongly on, but doesn’t require any linguists to sit down and write out grammars or parse sentences. [sent-7, score-0.565]

5 This is more of a machine learning style paper, where they improve a sequence labeling task by augmenting it with models from related tasks for which data is free. [sent-9, score-0.516]

6 I might train a model that, given a context with a missing word, will predict the word (e.g., [sent-12, score-0.383]

7 “The ____ gave a speech” might want you to insert “president”). [sent-13, score-0.153]

8 By doing so, you can use these other models to give additional useful information to your main task. [sent-14, score-0.162]

9 This paper talks about training sequence labeling models in an unsupervised fashion, basically by contrasting what the model does on the correct string with what the model does on a corrupted version of the string. [sent-17, score-1.445]

10 They get significantly better results than just by using EM in an HMM, and the idea is pretty nice. [sent-18, score-0.136]

11 This is a pretty neat idea (though I’m biased — Patrick is a friend) where one attempts to come up with feature vectors that describe nodes in a semantic hierarchy (ontology) that could enable you to figure out where to insert new words that are not in your ontology. [sent-20, score-0.989]

12 The results are pretty good, and the method is fairly simple; I’d imagine that a more complex model/learning framework could improve the model even further. [sent-21, score-0.59]
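The scores above come from some tf-idf weighting of each sentence; the exact pipeline behind this dump is not documented, but a minimal sketch of the usual technique (assuming scikit-learn; the corpus and normalization are guesses) looks like this:

```
# Minimal sketch of tf-idf sentence scoring for extractive summarization.
# Assumes scikit-learn; the original pipeline's corpus and normalization
# are unknown, so this only illustrates the general technique.
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    "A recent discussion indicated that one goal of this blog might be "
    "to allow people to post comments about recent papers that they liked.",
    "ACL 2005 recently completed, and here are four papers from that "
    "conference that I thought were either good or of interest.",
]

# Fit tf-idf over the sentences themselves (a real system might fit
# over a larger background corpus instead -- an assumption here).
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(sentences)

# Score each sentence by the sum of its tf-idf weights; the highest
# scoring sentences form the extractive summary.
scores = tfidf.sum(axis=1).A1  # .A1 flattens the numpy matrix to 1-D
for rank, idx in enumerate(scores.argsort()[::-1], start=1):
    print(f"{rank}. [score={scores[idx]:.3f}] {sentences[idx][:60]}...")
```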


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('model', 0.264), ('mt', 0.223), ('patrick', 0.223), ('phrases', 0.223), ('hierarchy', 0.184), ('vectors', 0.173), ('models', 0.162), ('insert', 0.153), ('labeling', 0.137), ('pretty', 0.136), ('basically', 0.131), ('word', 0.119), ('recent', 0.117), ('statistical', 0.115), ('sequence', 0.112), ('improve', 0.105), ('takes', 0.105), ('smith', 0.099), ('translating', 0.099), ('ando', 0.099), ('chunking', 0.099), ('corrupted', 0.099), ('grammars', 0.099), ('parse', 0.099), ('reordering', 0.099), ('paper', 0.094), ('syntax', 0.092), ('chinese', 0.092), ('completed', 0.092), ('de', 0.092), ('four', 0.092), ('indicated', 0.092), ('jason', 0.092), ('neat', 0.092), ('ontology', 0.092), ('string', 0.092), ('training', 0.09), ('semantic', 0.087), ('sentence', 0.087), ('toward', 0.087), ('could', 0.085), ('sit', 0.083), ('extends', 0.083), ('translate', 0.083), ('hierarchical', 0.083), ('em', 0.083), ('enable', 0.079), ('acl', 0.079), ('popular', 0.079), ('individually', 0.079)]
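These (word, weight) pairs are the post's tf-idf vector, and the lists that follow rank other posts by similarity between such vectors, typically cosine similarity. A hedged sketch with scikit-learn and placeholder post texts:

```
# Sketch of ranking similar posts by cosine similarity of tf-idf vectors.
# Post texts are placeholders; the real corpus is the full hunch.net blog.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

posts = {
    "97-interesting-papers-at-acl": "machine translation phrases model hierarchy",
    "194-new-models": "mathematical model learning algorithms research",
    "135-watchword-model": "model definitions parameterized predictive svm",
}

ids = list(posts)
X = TfidfVectorizer().fit_transform([posts[i] for i in ids])

# Similarity of the first post against all posts (itself included,
# which is why the 'same-blog' row below scores ~1.0).
sims = cosine_similarity(X[0], X).ravel()
for post_id, s in sorted(zip(ids, sims), key=lambda t: -t[1]):
    print(f"{s:.4f}  {post_id}")
```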

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999958 97 hunch net-2005-07-23-Interesting papers at ACL


2 0.16840719 194 hunch net-2006-07-11-New Models

Introduction: How should we, as researchers in machine learning, organize ourselves? The most immediate measurable objective of computer science research is publishing a paper. The most difficult aspect of publishing a paper is having reviewers accept and recommend it for publication. The simplest mechanism for doing this is to show theoretical progress on some standard, well-known, easily understood problem. In doing this, we often fall into a local minimum of the research process. The basic problem in machine learning is that it is very unclear that the mathematical model is the right one for the (or some) real problem. A good mathematical model in machine learning should have one fundamental trait: it should aid the design of effective learning algorithms. To date, our ability to solve interesting learning problems (speech recognition, machine translation, object recognition, etc…) remains limited (although improving), so the “rightness” of our models is in doubt. If our mathematical mod

3 0.12722728 301 hunch net-2008-05-23-Three levels of addressing the Netflix Prize

Introduction: In October 2006, the online movie renter, Netflix, announced the Netflix Prize contest. They published a comprehensive dataset including more than 100 million movie ratings, which were performed by about 480,000 real customers on 17,770 movies. Competitors in the challenge are required to estimate a few million ratings. To win the “grand prize,” they need to deliver a 10% improvement in the prediction error compared with the results of Cinematch, Netflix’s proprietary recommender system. Best current results deliver 9.12% improvement, which is quite close to the 10% goal, yet painfully distant. The Netflix Prize breathed new life and excitement into recommender systems research. The competition allowed the wide research community to access a large scale, real life dataset. Beyond this, the competition changed the rules of the game. Claiming that your nice idea could outperform some mediocre algorithms on some toy dataset is no longer acceptable. Now researcher

4 0.12033739 135 hunch net-2005-12-04-Watchword: model

Introduction: In everyday use a model is a system which explains the behavior of some system, hopefully at the level where some alteration of the model predicts some alteration of the real-world system. In machine learning “model” has several variant definitions. Everyday. The common definition is sometimes used. Parameterized. Sometimes model is a short-hand for “parameterized model”. Here, it refers to a model with unspecified free parameters. In the Bayesian learning approach, you typically have a prior over (everyday) models. Predictive. Even further from everyday use is the predictive model. Examples of this are “my model is a decision tree” or “my model is a support vector machine”. Here, there is no real sense in which an SVM explains the underlying process. For example, an SVM tells us nothing in particular about how alterations to the real-world system would create a change. Which definition is being used at any particular time is important information. For examp

5 0.11292071 189 hunch net-2006-07-05-more icml papers

Introduction: Here are a few other papers I enjoyed from ICML06. Topic Models: Dynamic Topic Models David Blei, John Lafferty A nice model for how topics in LDA type models can evolve over time, using a linear dynamical system on the natural parameters and a very clever structured variational approximation (in which the mean field parameters are pseudo-observations of a virtual LDS). Like all Blei papers, he makes it look easy, but it is extremely impressive. Pachinko Allocation Wei Li, Andrew McCallum A very elegant (but computationally challenging) model which induces correlation amongst topics using a multi-level DAG whose interior nodes are “super-topics” and “sub-topics” and whose leaves are the vocabulary words. Makes the slumbering monster of structure learning stir. Sequence Analysis (I missed these talks since I was chairing another session) Online Decoding of Markov Models with Latency Constraints Mukund Narasimhan, Paul Viola, Michael Shilman An “a

6 0.11035453 95 hunch net-2005-07-14-What Learning Theory might do

7 0.10825863 438 hunch net-2011-07-11-Interesting Neural Network Papers at ICML 2011

8 0.1029337 406 hunch net-2010-08-22-KDD 2010

9 0.10248584 23 hunch net-2005-02-19-Loss Functions for Discriminative Training of Energy-Based Models

10 0.098194636 143 hunch net-2005-12-27-Automated Labeling

11 0.095267132 45 hunch net-2005-03-22-Active learning

12 0.092851125 256 hunch net-2007-07-20-Motivation should be the Responsibility of the Reviewer

13 0.09237285 347 hunch net-2009-03-26-Machine Learning is too easy

14 0.092350967 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

15 0.091118313 385 hunch net-2009-12-27-Interesting things at NIPS 2009

16 0.088594079 8 hunch net-2005-02-01-NIPS: Online Bayes

17 0.085721396 161 hunch net-2006-03-05-“Structural” Learning

18 0.084336884 51 hunch net-2005-04-01-The Producer-Consumer Model of Research

19 0.084009983 235 hunch net-2007-03-03-All Models of Learning have Flaws

20 0.08316198 233 hunch net-2007-02-16-The Forgetting


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.204), (1, 0.012), (2, 0.008), (3, 0.024), (4, 0.075), (5, -0.009), (6, -0.015), (7, -0.078), (8, 0.116), (9, -0.104), (10, 0.033), (11, -0.008), (12, -0.138), (13, 0.012), (14, 0.029), (15, -0.023), (16, 0.008), (17, 0.146), (18, 0.126), (19, -0.123), (20, -0.031), (21, -0.057), (22, -0.027), (23, -0.026), (24, 0.045), (25, 0.045), (26, -0.037), (27, 0.008), (28, -0.041), (29, -0.111), (30, -0.009), (31, 0.131), (32, 0.058), (33, -0.031), (34, 0.015), (35, -0.002), (36, 0.049), (37, -0.046), (38, 0.016), (39, -0.056), (40, -0.043), (41, -0.082), (42, -0.007), (43, 0.066), (44, -0.017), (45, -0.006), (46, 0.013), (47, 0.014), (48, -0.005), (49, -0.081)]
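The (topicId, topicWeight) pairs above are the post's coordinates in a 50-dimensional latent semantic space. LSI weights like these are conventionally obtained by truncated SVD of the tf-idf matrix; a minimal sketch follows (tiny invented corpus, so the demo uses 2 components rather than the 50 matching topic ids 0-49 above):

```
# Sketch of LSI: project tf-idf vectors onto a few latent dimensions via
# truncated SVD, then compare posts in that space. Corpus is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "machine translation phrases model reordering hierarchy",
    "topic models lda svd latent semantic",
    "netflix prize ratings recommender prediction",
]  # placeholder posts

tfidf = TfidfVectorizer().fit_transform(docs)
# The real list above implies ~50 components; 2 suffices for this demo.
lsi = TruncatedSVD(n_components=2, random_state=0)
Z = lsi.fit_transform(tfidf)  # each row: (topicWeight_0, topicWeight_1, ...)

print(Z[0])                         # this post's topic weights
print(cosine_similarity(Z[:1], Z))  # lsi-based similarity row
```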

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97900397 97 hunch net-2005-07-23-Interesting papers at ACL


2 0.79692072 189 hunch net-2006-07-05-more icml papers


3 0.7251963 135 hunch net-2005-12-04-Watchword: model


4 0.6917901 194 hunch net-2006-07-11-New Models


5 0.69135642 280 hunch net-2007-12-20-Cool and Interesting things at NIPS, take three

Introduction: Following up on Hal Daume’s post and John’s post on cool and interesting things seen at NIPS I’ll post my own little list of neat papers here as well. Of course it’s going to be biased towards what I think is interesting. Also, I have to say that I hadn’t been able to see many papers this year at nips due to myself being too busy, so please feel free to contribute the papers that you liked 1. P. Mudigonda, V. Kolmogorov, P. Torr. An Analysis of Convex Relaxations for MAP Estimation. A surprising paper which shows that many of the more sophisticated convex relaxations that had been proposed recently turns out to be subsumed by the simplest LP relaxation. Be careful next time you try a cool new convex relaxation! 2. D. Sontag, T. Jaakkola. New Outer Bounds on the Marginal Polytope. The title says it all. The marginal polytope is the set of local marginal distributions over subsets of variables that are globally consistent in the sense that there is at least one distributio

6 0.68895191 139 hunch net-2005-12-11-More NIPS Papers

7 0.65086329 301 hunch net-2008-05-23-Three levels of addressing the Netflix Prize

8 0.63033861 440 hunch net-2011-08-06-Interesting thing at UAI 2011

9 0.61854577 188 hunch net-2006-06-30-ICML papers

10 0.54268557 95 hunch net-2005-07-14-What Learning Theory might do

11 0.53867632 140 hunch net-2005-12-14-More NIPS Papers II

12 0.5284788 77 hunch net-2005-05-29-Maximum Margin Mismatch?

13 0.52496415 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

14 0.51814967 330 hunch net-2008-12-07-A NIPS paper

15 0.5176425 310 hunch net-2008-07-15-Interesting papers at COLT (and a bit of UAI & workshops)

16 0.51094741 23 hunch net-2005-02-19-Loss Functions for Discriminative Training of Energy-Based Models

17 0.50750929 406 hunch net-2010-08-22-KDD 2010

18 0.50605279 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009

19 0.50380921 398 hunch net-2010-05-10-Aggregation of estimators, sparsity in high dimension and computational feasibility

20 0.49994847 233 hunch net-2007-02-16-The Forgetting


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(9, 0.03), (10, 0.016), (26, 0.412), (27, 0.175), (38, 0.047), (53, 0.066), (55, 0.079), (64, 0.01), (94, 0.018), (95, 0.069)]
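The LDA weights are instead a sparse, non-negative topic distribution for the post (note they sum to roughly 1, unlike the LSI coordinates). A sketch of inferring such a distribution with scikit-learn's LatentDirichletAllocation; the corpus, topic count, and threshold are placeholders:

```
# Sketch of LDA: infer a topic distribution per post from word counts,
# then report the large entries. All data here is made up.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "machine translation phrases model reordering",
    "netflix ratings recommender prediction error",
    "learning theory bounds loss functions",
]  # placeholder posts

counts = CountVectorizer().fit_transform(docs)  # LDA wants raw counts
lda = LatentDirichletAllocation(n_components=5, random_state=0)
theta = lda.fit_transform(counts)  # rows sum to 1: per-post topic weights

# Sparse pairs like "(26, 0.412)" above are just the large entries:
for topic_id, w in enumerate(theta[0]):
    if w > 0.05:
        print(f"({topic_id}, {w:.3f})")
```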

similar blogs list:

simIndex simValue blogId blogTitle

1 0.98078203 171 hunch net-2006-04-09-Progress in Machine Translation

Introduction: I just visited ISI where Daniel Marcu and others are working on machine translation. Apparently, machine translation is rapidly improving. A particularly dramatic year was 2002->2003 when systems switched from word-based translation to phrase-based translation. From a (now famous) slide by Charles Wayne at DARPA (which funds much of the work on machine translation), here is some anecdotal evidence. 2002: insistent Wednesday may recurred her trips to Libya tomorrow for flying. Cairo 6-4 ( AFP ) – An official announced today in the Egyptian lines company for flying Tuesday is a company “insistent for flying” may resumed a consideration of a day Wednesday tomorrow her trips to Libya of Security Council decision trace international the imposed ban comment. And said the official “the institution sent a speech to Ministry of Foreign Affairs of lifting on Libya air, a situation her recieving replying are so a trip will pull to Libya a morning Wednesday.” 2003: E

2 0.96616817 413 hunch net-2010-10-08-An easy proof of the Chernoff-Hoeffding bound

Introduction: Textbooks invariably seem to carry the proof that uses Markov’s inequality, moment-generating functions, and Taylor approximations. Here’s an easier way. For $p, q \in (0,1)$, let $K(p,q)$ be the KL divergence between a coin of bias $p$ and one of bias $q$: $K(p,q) = p \ln \frac{p}{q} + (1-p) \ln \frac{1-p}{1-q}$. Theorem: Suppose you do $n$ independent tosses of a coin of bias $p$. The probability of seeing $qn$ heads or more, for $q > p$, is at most $e^{-nK(q,p)}$. So is the probability of seeing $qn$ heads or less, for $q < p$. Remark: By Pinsker’s inequality, $K(q,p) \geq 2(q-p)^2$. Proof: Let’s do the $q > p$ case; the other is identical. Let $P$ be the distribution over $\{0,1\}^n$ induced by a coin of bias $p$, and likewise $Q$ for a coin of bias $q$. Let $S$ be the set of all sequences of $n$ tosses which contain $qn$ heads or more. We’d like to show that $S$ is unlikely under $P$. Pick any $x \in S$, with say $k \geq qn$ heads. Then $\frac{Q(x)}{P(x)} = (\frac{q}{p})^k (\frac{1-q}{1-p})^{n-k} \geq (\frac{q}{p})^{qn} (\frac{1-q}{1-p})^{(1-q)n} = e^{nK(q,p)}$. Since $Q(x) \geq P(x)\, e^{nK(q,p)}$ for every $x \in S$, we have $1 \geq Q(S) \geq P(S)\, e^{nK(q,p)}$ and we’re done.
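As a quick sanity check of the bound in this excerpt, one can compare the empirical binomial tail against $e^{-nK(q,p)}$; a standard-library sketch with arbitrarily chosen parameters:

```
# Numeric sanity check of P(#heads >= qn) <= exp(-n * K(q, p)) for q > p.
# Parameters are arbitrary; only the standard library is used.
import math
import random

def kl(a: float, b: float) -> float:
    """KL divergence between coins of bias a and b."""
    return a * math.log(a / b) + (1 - a) * math.log((1 - a) / (1 - b))

n, p, q, trials = 100, 0.5, 0.6, 20_000
random.seed(0)
hits = sum(
    sum(random.random() < p for _ in range(n)) >= q * n
    for _ in range(trials)
)
print(f"empirical tail = {hits / trials:.4f}")             # ~0.03
print(f"bound e^(-nK)  = {math.exp(-n * kl(q, p)):.4f}")   # ~0.13
```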

3 0.83041888 17 hunch net-2005-02-10-Conferences, Dates, Locations

Introduction: Conference | Location | Date
COLT | Bertinoro, Italy | June 27-30
AAAI | Pittsburgh, PA, USA | July 9-13
UAI | Edinburgh, Scotland | July 26-29
IJCAI | Edinburgh, Scotland | July 30 - August 5
ICML | Bonn, Germany | August 7-11
KDD | Chicago, IL, USA | August 21-24
The big winner this year is Europe. This is partly a coincidence, and partly due to the general internationalization of science over the last few years. With cuts to basic science in the US and increased hassle for visitors, conferences outside the US become more attractive. Europe and Australia/New Zealand are the immediate winners because they have the science, infrastructure, and English in place. China and India are possible future winners.

same-blog 4 0.80341452 97 hunch net-2005-07-23-Interesting papers at ACL


5 0.79782009 305 hunch net-2008-06-30-ICML has a comment system

Introduction: Mark Reid has stepped up and created a comment system for ICML papers which Greger Linden has tightly integrated. My understanding is that Mark spent quite a bit of time on the details, and there are some cool features like working latex math mode. This is an excellent chance for the ICML community to experiment with making ICML year-round, so I hope it works out. Please do consider experimenting with it.

6 0.73670202 25 hunch net-2005-02-20-At One Month

7 0.61414653 43 hunch net-2005-03-18-Binomial Weighting

8 0.45612857 478 hunch net-2013-01-07-NYU Large Scale Machine Learning Class

9 0.45427841 225 hunch net-2007-01-02-Retrospective

10 0.45411387 194 hunch net-2006-07-11-New Models

11 0.45369023 466 hunch net-2012-06-05-ICML acceptance statistics

12 0.45157343 406 hunch net-2010-08-22-KDD 2010

13 0.45131937 360 hunch net-2009-06-15-In Active Learning, the question changes

14 0.45123503 343 hunch net-2009-02-18-Decision by Vetocracy

15 0.45056507 160 hunch net-2006-03-02-Why do people count for learning?

16 0.4495936 12 hunch net-2005-02-03-Learning Theory, by assumption

17 0.44851345 403 hunch net-2010-07-18-ICML & COLT 2010

18 0.44728589 370 hunch net-2009-09-18-Necessary and Sufficient Research

19 0.44683889 36 hunch net-2005-03-05-Funding Research

20 0.44666913 309 hunch net-2008-07-10-Interesting papers, ICML 2008