hunch_net hunch_net-2006 hunch_net-2006-206 knowledge-graph by maker-knowledge-mining

206 hunch net-2006-09-09-How to solve an NP hard problem in quadratic time


meta info for this blog

Source: html

Introduction: This title is a lie, but it is a special lie which has a bit of truth. If n players each play each other, you have a tournament. How do you order the players from weakest to strongest? The standard first attempt is “find the ordering which agrees with the tournament on as many player pairs as possible”. This is called the “minimum feedback arc set” problem in the CS theory literature and it is a well-known NP-hard problem. A basic guarantee holds for the solution to this problem: if there is some “true” intrinsic ordering, and the outcome of the tournament disagrees k times (due to noise for instance), then the output ordering will disagree with the original ordering on at most 2k edges (and no solution can be better). One standard approach to tractably solving an NP-hard problem is to find another algorithm with an approximation guarantee. For example, Don Coppersmith, Lisa Fleischer and Atri Rudra proved that ordering players according to the number of wins is a 5-approximation to the NP-hard problem.
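The exact problem named above can be written down directly. The sketch below (function and variable names are my own) brute-forces the minimum feedback arc set ordering by trying every permutation of the players, which is only feasible for tiny n and illustrates why the exact formulation is intractable in general.

```python
# Hypothetical sketch: brute-force "minimum feedback arc set" ordering
# for a small round-robin tournament. Exponential in n, so purely
# illustrative of the NP-hard problem the post describes.
from itertools import permutations

def min_feedback_ordering(beats, players):
    """beats[(a, b)] == True means player a beat player b."""
    def disagreements(order):
        # Count pairs where the ordering contradicts the match outcome:
        # a is ranked above b, but b actually beat a.
        return sum(
            1
            for i, a in enumerate(order)
            for b in order[i + 1:]
            if beats.get((b, a))
        )
    return min(permutations(players), key=disagreements)

# Toy 3-player cycle: A beats B, B beats C, C beats A. Every ordering
# must disagree with exactly one match outcome.
beats = {("A", "B"): True, ("B", "C"): True, ("C", "A"): True}
print(min_feedback_ordering(beats, ["A", "B", "C"]))  # ('A', 'B', 'C')
```

With a pure cycle every permutation incurs one disagreement, so the first permutation tried is returned; with noisy but mostly consistent outcomes, the minimizer recovers an ordering close to the “true” one, which is the 2k guarantee the post states.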


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 If n players each play each other, you have a tournament. [sent-2, score-0.178]

2 How do you order the players from weakest to strongest? [sent-3, score-0.254]

3 The standard first attempt is “find the ordering which agrees with the tournament on as many player pairs as possible”. [sent-4, score-0.944]

4 One standard approach to tractably solving an NP-hard problem is to find another algorithm with an approximation guarantee. [sent-7, score-0.243]

5 For example, Don Coppersmith, Lisa Fleischer and Atri Rudra proved that ordering players according to the number of wins is a 5-approximation to the NP-hard problem. [sent-8, score-1.136]
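The quadratic-time alternative of the title amounts to tallying wins from the n(n-1)/2 matches and sorting. A minimal sketch (naming is my own; the tie-breaking rule is an assumption, as the approximation guarantee holds for any tie-breaking):

```python
# Sketch of the 5-approximation: order players by win count.
# Tallying a round robin touches O(n^2) matches, hence "quadratic time".
from collections import Counter

def order_by_wins(matches):
    """matches: iterable of (winner, loser) pairs from a round robin."""
    wins = Counter(winner for winner, _ in matches)
    players = {p for match in matches for p in match}
    # Strongest first; ties broken arbitrarily (here, alphabetically).
    return sorted(players, key=lambda p: (-wins[p], p))

matches = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"),
           ("A", "D"), ("B", "D")]
print(order_by_wins(matches))  # ['A', 'B', 'C', 'D']
```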

6 An even better approach is to realize that the NP hard problem may not be the real problem. [sent-9, score-0.177]

7 The real problem may be finding a good approximation to the “true” intrinsic ordering given noisy tournament information. [sent-10, score-1.019]

8 In a learning setting, the simplest form of ranking problem is “bipartite ranking” where every element has a value of either 0 or 1 and we want to order 0s before 1s. [sent-11, score-0.294]

9 A common way to measure the performance of bipartite ranking is according to “area under the ROC curve” (AUC) = 1 – the fraction of out-of-order pairs. [sent-12, score-0.404]
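The AUC definition above is easy to compute directly. A minimal sketch (function name is my own), assuming the 0/1 labels arrive in the order the ranker outputs them, with 0s ideally first:

```python
# AUC = 1 - (fraction of out-of-order pairs) for bipartite ranking,
# where an out-of-order pair is a 1 placed before a 0.
def auc(ordered_labels):
    n0 = ordered_labels.count(0)
    n1 = ordered_labels.count(1)
    ones_seen = 0
    out_of_order = 0
    for label in ordered_labels:
        if label == 1:
            ones_seen += 1
        else:
            # Every 1 already placed before this 0 is an inverted pair.
            out_of_order += ones_seen
    return 1 - out_of_order / (n0 * n1)

print(auc([0, 0, 1, 1]))  # perfect order -> 1.0
print(auc([1, 0, 0, 1]))  # 2 of 4 pairs inverted -> 0.5
```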

10 Nina, Alina, Greg and I proved that if we learn a comparison function which errs on k dissimilar pairs, then ordering according to the number of wins yields an order within 4k edge reversals of the original ordering. [sent-13, score-1.205]

11 As a reduction statement (*), this shows that an error rate of e for a learned pairwise binary classifier produces an ordering with an expected AUC of 1 – 4e. [sent-14, score-1.05]

12 The same inequality even holds for a (stronger) regret transform. [sent-15, score-0.268]

13 If r = e – e_min is the regret of the binary pairwise classifier, then the AUC regret is bounded by 4r. [sent-16, score-0.731]

14 (Here e_min is the error rate of the best possible classifier, which predicts knowing the most likely outcome. [sent-17, score-0.182]

15 ) The regret result extends to more general measures of ordering than simply AUC. [sent-18, score-0.747]
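The regret transform above is plain arithmetic; a toy numeric illustration (all numbers invented) of how the bound is applied:

```python
# Regret transform: r = e - e_min is the pairwise classifier's regret;
# the post's result bounds the AUC regret by 4r.
e = 0.12      # observed pairwise error rate (invented)
e_min = 0.05  # best achievable error on the noisy pair distribution (invented)
r = e - e_min
auc_regret_bound = 4 * r
print(round(r, 3), round(auc_regret_bound, 3))  # 0.07 0.28
```

Note that because the bound is on regret rather than raw error, it stays meaningful even when noise makes e_min large.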

16 We were unable to find any examples where ordering according to the degree produced more than a 2r AUC regret. [sent-19, score-0.789]

17 The learning lesson is that a good pairwise comparator implies the ability to rank well according to AUC. [sent-23, score-0.654]

18 The general research lesson is that an NP hard problem for an approximate solution is not an intrinsic obstacle. [sent-24, score-0.53]

19 (*) To prove the reduction, you must make sure that you form pairwise examples in the right way. [sent-26, score-0.377]

20 Your source of pairwise ordering examples must be uniform over the dissimilar pairs containing one example with label 1 and one example with label 0. [sent-27, score-1.34]
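The footnote's construction can be sketched directly (naming is my own): drawing a label-0 example and a label-1 example independently and uniformly is uniform over the n0 * n1 dissimilar pairs, which is the condition the reduction proof requires.

```python
# Sketch of the pairwise-example construction for the reduction:
# sample uniformly from dissimilar pairs (one label-1, one label-0).
import random

def sample_dissimilar_pair(examples, rng=random):
    """examples: list of (features, label) with label in {0, 1}."""
    zeros = [x for x, y in examples if y == 0]
    ones = [x for x, y in examples if y == 1]
    # Independent uniform draws => uniform over all len(zeros) * len(ones)
    # dissimilar pairs.
    return rng.choice(zeros), rng.choice(ones)

data = [("a", 0), ("b", 0), ("c", 1), ("d", 1)]
x0, x1 = sample_dissimilar_pair(data)
print(x0 in {"a", "b"} and x1 in {"c", "d"})  # True
```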


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('ordering', 0.513), ('pairwise', 0.321), ('auc', 0.257), ('np', 0.214), ('players', 0.178), ('tournament', 0.178), ('intrinsic', 0.154), ('according', 0.151), ('pairs', 0.14), ('bipartite', 0.128), ('dissimilar', 0.128), ('lesson', 0.128), ('regret', 0.126), ('ranking', 0.125), ('lie', 0.119), ('greg', 0.119), ('wins', 0.112), ('min', 0.093), ('problem', 0.093), ('holds', 0.091), ('classifier', 0.089), ('proved', 0.089), ('hard', 0.084), ('approximation', 0.081), ('original', 0.077), ('order', 0.076), ('guarantee', 0.076), ('solution', 0.071), ('find', 0.069), ('binary', 0.065), ('label', 0.063), ('reduction', 0.062), ('strongest', 0.059), ('disagrees', 0.059), ('errs', 0.059), ('edges', 0.059), ('player', 0.059), ('containing', 0.056), ('lessons', 0.056), ('examples', 0.056), ('true', 0.054), ('measures', 0.054), ('extends', 0.054), ('agrees', 0.054), ('rank', 0.054), ('nina', 0.054), ('curve', 0.051), ('roc', 0.051), ('inequality', 0.051), ('satisfy', 0.051)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 206 hunch net-2006-09-09-How to solve an NP hard problem in quadratic time


2 0.17580363 259 hunch net-2007-08-19-Choice of Metrics

Introduction: How do we judge success in Machine Learning? As Aaron notes , the best way is to use the loss imposed on you by the world. This turns out to be infeasible sometimes for various reasons. The ones I’ve seen are: The learned prediction is used in some complicated process that does not give the feedback necessary to understand the prediction’s impact on the loss. The prediction is used by some other system which expects some semantics to the predicted value. This is similar to the previous example, except that the issue is design modularity rather than engineering modularity. The correct loss function is simply unknown (and perhaps unknowable, except by experimentation). In these situations, it’s unclear what metric for evaluation should be chosen. This post has some design advice for this murkier case. I’m using the word “metric” here to distinguish the fact that we are considering methods for evaluating predictive systems rather than a loss imposed by the real wor

3 0.16671206 72 hunch net-2005-05-16-Regret minimizing vs error limiting reductions

Introduction: This post is about a reductions-related problem that I find mysterious. There are two kinds of reductions analysis currently under consideration. Error limiting reductions. Here, the goal is to bound the error rate of the created classifier in terms of the error rate of the binary classifiers that you reduce to. A very simple example of this is error correcting output codes, where it is possible to prove that for certain codes, the multiclass error rate is at most 4 * the binary classifier error rate. Regret minimizing reductions. Here, the goal is to bound the regret of the created classifier in terms of the regret of the binary classifiers reduced to. The regret is the error rate minus the minimum error rate. When the learning problem is noisy the minimum error rate may not be 0. An analogous result for regret is that for a probabilistic error correcting output code, multiclass regret is at most 4 * (binary regret)^0.5. The use of “regret” is more desi

4 0.15973057 243 hunch net-2007-05-08-Conditional Tournaments for Multiclass to Binary

Introduction: This problem has been cracked (but not quite completely solved) by Alina, Pradeep, and I. The problem is essentially finding a better way to reduce multiclass classification to binary classification. The solution is to use a carefully crafted tournament, the simplest version of which is a single elimination tournament where the “players” are the different classes. An example of the structure is here: For the single elimination tournament, we can prove that: For all multiclass problems D, for all learned binary classifiers c, the regret of an induced multiclass classifier is bounded by the regret of the binary classifier times log_2 k. Restated: reg_multiclass(D, Filter_tree_test(c)) <= reg_binary(Filter_tree_train(D), c) Here: Filter_tree_train(D) is the induced binary classification problem Filter_tree_test(c) is the induced multiclass classifier. reg_multiclass is the multiclass regret (= difference between error rate and minim

5 0.14381787 181 hunch net-2006-05-23-What is the best regret transform reduction from multiclass to binary?

Introduction: This post is about an open problem in learning reductions. Background: A reduction might transform a multiclass prediction problem where there are k possible labels into a binary learning problem where there are only 2 possible labels. On this induced binary problem we might learn a binary classifier with some error rate e. After subtracting the minimum possible (Bayes) error rate b, we get a regret r = e – b. The PECOC (Probabilistic Error Correcting Output Code) reduction has the property that binary regret r implies multiclass regret at most 4r^0.5. The problem: This is not the “rightest” answer. Consider the k=2 case, where we reduce binary to binary. There exists a reduction (the identity) with the property that regret r implies regret r. This is substantially superior to the transform given by the PECOC reduction, which suggests that a better reduction may exist for general k. For example, we can not rule out the possibility that a reduction

6 0.12305955 318 hunch net-2008-09-26-The SODA Program Committee

7 0.11983034 14 hunch net-2005-02-07-The State of the Reduction

8 0.11062269 391 hunch net-2010-03-15-The Efficient Robust Conditional Probability Estimation Problem

9 0.093655832 236 hunch net-2007-03-15-Alternative Machine Learning Reductions Definitions

10 0.092634365 231 hunch net-2007-02-10-Best Practices for Collaboration

11 0.092556469 334 hunch net-2009-01-07-Interesting Papers at SODA 2009

12 0.089814357 18 hunch net-2005-02-12-ROC vs. Accuracy vs. AROC

13 0.087584987 304 hunch net-2008-06-27-Reviewing Horror Stories

14 0.08655867 165 hunch net-2006-03-23-The Approximation Argument

15 0.079877242 227 hunch net-2007-01-10-A Deep Belief Net Learning Problem

16 0.078427292 67 hunch net-2005-05-06-Don’t mix the solution into the problem

17 0.07833869 311 hunch net-2008-07-26-Compositional Machine Learning Algorithm Design

18 0.077894695 102 hunch net-2005-08-11-Why Manifold-Based Dimension Reduction Techniques?

19 0.076105699 406 hunch net-2010-08-22-KDD 2010

20 0.075862363 388 hunch net-2010-01-24-Specializations of the Master Problem


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.167), (1, 0.113), (2, 0.074), (3, -0.044), (4, -0.05), (5, -0.024), (6, 0.143), (7, -0.017), (8, -0.09), (9, -0.044), (10, -0.055), (11, 0.054), (12, 0.019), (13, -0.033), (14, -0.029), (15, 0.011), (16, -0.007), (17, 0.004), (18, 0.012), (19, 0.033), (20, -0.056), (21, 0.017), (22, 0.067), (23, 0.006), (24, 0.041), (25, -0.009), (26, 0.079), (27, -0.049), (28, 0.01), (29, -0.061), (30, -0.033), (31, -0.063), (32, 0.006), (33, 0.065), (34, 0.078), (35, -0.011), (36, 0.01), (37, 0.01), (38, 0.003), (39, 0.01), (40, 0.04), (41, -0.006), (42, 0.051), (43, 0.062), (44, 0.04), (45, -0.003), (46, -0.016), (47, -0.009), (48, 0.056), (49, -0.047)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96330464 206 hunch net-2006-09-09-How to solve an NP hard problem in quadratic time


2 0.80962509 243 hunch net-2007-05-08-Conditional Tournaments for Multiclass to Binary


3 0.72800523 181 hunch net-2006-05-23-What is the best regret transform reduction from multiclass to binary?


4 0.72218508 72 hunch net-2005-05-16-Regret minimizing vs error limiting reductions


5 0.71433145 391 hunch net-2010-03-15-The Efficient Robust Conditional Probability Estimation Problem

Introduction: I’m offering a reward of $1000 for a solution to this problem. This joins the cross validation problem which I’m offering a $500 reward for. I believe both of these problems are hard but plausibly solvable, and plausibly with a solution of substantial practical value. While it’s unlikely these rewards are worth your time on an hourly wage basis, the recognition for solving them definitely should be. The Problem: The problem is finding a general, robust, and efficient mechanism for estimating a conditional probability P(y|x) where robustness and efficiency are measured using techniques from learning reductions. In particular, suppose we have access to a binary regression oracle B which has two interfaces—one for specifying training information and one for testing. Training information is specified as B(x’,y’) where x’ is a feature vector and y’ is a scalar in [0,1] with no value returned. Testing is done according to B(x’) with a value in [0,1] returned.

6 0.68987399 196 hunch net-2006-07-13-Regression vs. Classification as a Primitive

7 0.66417164 14 hunch net-2005-02-07-The State of the Reduction

8 0.62824136 78 hunch net-2005-06-06-Exact Online Learning for Classification

9 0.61417943 102 hunch net-2005-08-11-Why Manifold-Based Dimension Reduction Techniques?

10 0.59483099 18 hunch net-2005-02-12-ROC vs. Accuracy vs. AROC

11 0.58506244 67 hunch net-2005-05-06-Don’t mix the solution into the problem

12 0.5615012 43 hunch net-2005-03-18-Binomial Weighting

13 0.55233186 31 hunch net-2005-02-26-Problem: Reductions and Relative Ranking Metrics

14 0.53504592 218 hunch net-2006-11-20-Context and the calculation misperception

15 0.52117038 334 hunch net-2009-01-07-Interesting Papers at SODA 2009

16 0.51897901 327 hunch net-2008-11-16-Observations on Linearity for Reductions to Regression

17 0.51592833 311 hunch net-2008-07-26-Compositional Machine Learning Algorithm Design

18 0.50846636 157 hunch net-2006-02-18-Multiplication of Learned Probabilities is Dangerous

19 0.50656754 176 hunch net-2006-05-01-A conversation between Theo and Pat

20 0.50396413 247 hunch net-2007-06-14-Interesting Papers at COLT 2007


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(3, 0.047), (27, 0.228), (38, 0.053), (53, 0.025), (55, 0.053), (64, 0.016), (77, 0.407), (94, 0.046), (95, 0.032)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.98106873 436 hunch net-2011-06-22-Ultra LDA

Introduction: Shravan and Alex ‘s LDA code is released . On a single machine, I’m not sure how it currently compares to the online LDA in VW , but the ability to effectively scale across very many machines is surely interesting.

2 0.97442895 405 hunch net-2010-08-21-Rob Schapire at NYC ML Meetup

Introduction: I’ve been wanting to attend the NYC ML Meetup for some time and hope to make it next week on the 25th . Rob Schapire is talking about “Playing Repeated Games”, which in my experience is far more relevant to machine learning than the title might indicate.

same-blog 3 0.92536473 206 hunch net-2006-09-09-How to solve an NP hard problem in quadratic time


4 0.83774209 165 hunch net-2006-03-23-The Approximation Argument

Introduction: An argument is sometimes made that the Bayesian way is the “right” way to do machine learning. This is a serious argument which deserves a serious reply. The approximation argument is a serious reply for which I have not yet seen a reply 2 . The idea for the Bayesian approach is quite simple, elegant, and general. Essentially, you first specify a prior P(D) over possible processes D producing the data, observe the data, then condition on the data according to Bayes law to construct a posterior: P(D|x) = P(x|D)P(D)/P(x) After this, hard decisions are made (such as “turn left” or “turn right”) by choosing the one which minimizes the expected (with respect to the posterior) loss. This basic idea is reused thousands of times with various choices of P(D) and loss functions which is unsurprising given the many nice properties: There is an extremely strong associated guarantee: If the actual distribution generating the data is drawn from P(D) there is no better method.

5 0.80408752 269 hunch net-2007-10-24-Contextual Bandits

Introduction: One of the fundamental underpinnings of the internet is advertising based content. This has become much more effective due to targeted advertising where ads are specifically matched to interests. Everyone is familiar with this, because everyone uses search engines and all search engines try to make money this way. The problem of matching ads to interests is a natural machine learning problem in some ways since there is much information in who clicks on what. A fundamental problem with this information is that it is not supervised—in particular a click-or-not on one ad doesn’t generally tell you if a different ad would have been clicked on. This implies we have a fundamental exploration problem. A standard mathematical setting for this situation is “ k -Armed Bandits”, often with various relevant embellishments. The k -Armed Bandit setting works on a round-by-round basis. On each round: A policy chooses arm a from 1 of k arms (i.e. 1 of k ads). The world reveals t

6 0.76351058 388 hunch net-2010-01-24-Specializations of the Master Problem

7 0.74316412 317 hunch net-2008-09-12-How do we get weak action dependence for learning with partial observations?

8 0.73593193 375 hunch net-2009-10-26-NIPS workshops

9 0.63979828 392 hunch net-2010-03-26-A Variance only Deviation Bound

10 0.62072921 60 hunch net-2005-04-23-Advantages and Disadvantages of Bayesian Learning

11 0.61065394 118 hunch net-2005-10-07-On-line learning of regular decision rules

12 0.60937858 259 hunch net-2007-08-19-Choice of Metrics

13 0.5811829 235 hunch net-2007-03-03-All Models of Learning have Flaws

14 0.57923037 183 hunch net-2006-06-14-Explorations of Exploration

15 0.57668203 100 hunch net-2005-08-04-Why Reinforcement Learning is Important

16 0.57632399 293 hunch net-2008-03-23-Interactive Machine Learning

17 0.57133067 220 hunch net-2006-11-27-Continuizing Solutions

18 0.57002813 311 hunch net-2008-07-26-Compositional Machine Learning Algorithm Design

19 0.55803543 237 hunch net-2007-04-02-Contextual Scaling

20 0.55796051 258 hunch net-2007-08-12-Exponentiated Gradient