nips nips2007 nips2007-136 knowledge-graph by maker-knowledge-mining

136 nips-2007-Multiple-Instance Active Learning


Source: pdf

Author: Burr Settles, Mark Craven, Soumya Ray

Abstract: We present a framework for active learning in the multiple-instance (MI) setting. In an MI learning problem, instances are naturally organized into bags and it is the bags, instead of individual instances, that are labeled for training. MI learners assume that every instance in a bag labeled negative is actually negative, whereas at least one instance in a bag labeled positive is actually positive. We consider the particular case in which an MI learner is allowed to selectively query unlabeled instances from positive bags. This approach is well motivated in domains in which it is inexpensive to acquire bag labels and possible, but expensive, to acquire instance labels. We describe a method for learning from labels at mixed levels of granularity, and introduce two active query selection strategies motivated by the MI setting. Our experiments show that learning from instance labels can significantly improve performance of a basic MI learning algorithm in two multiple-instance domains: content-based image retrieval and text classification. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 In an MI learning problem, instances are naturally organized into bags and it is the bags, instead of individual instances, that are labeled for training. [sent-6, score-0.75]

2 MI learners assume that every instance in a bag labeled negative is actually negative, whereas at least one instance in a bag labeled positive is actually positive. [sent-7, score-1.161]

3 We consider the particular case in which an MI learner is allowed to selectively query unlabeled instances from positive bags. [sent-8, score-0.87]

4 This approach is well motivated in domains in which it is inexpensive to acquire bag labels and possible, but expensive, to acquire instance labels. [sent-9, score-0.59]

5 We describe a method for learning from labels at mixed levels of granularity, and introduce two active query selection strategies motivated by the MI setting. [sent-10, score-0.581]

6 Our experiments show that learning from instance labels can significantly improve performance of a basic MI learning algorithm in two multiple-instance domains: content-based image retrieval and text classification. [sent-11, score-0.356]

7 In the MI setting, instances are grouped into bags. [sent-14, score-0.671]

8 A bag is labeled negative if and only if it contains all negative instances. [sent-17, score-0.465]

9 A bag is labeled positive, however, if at least one of its instances is positive. [sent-18, score-0.639]

10 Note that positive bags may also contain negative instances. [sent-19, score-0.512]
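
This labeling rule is easy to state in code. A minimal sketch follows (the NumPy encoding of instance labels is our own illustration, not the paper's):

```python
import numpy as np

def mi_bag_label(instance_labels: np.ndarray) -> int:
    """Standard MI assumption: a bag is negative iff every instance in it
    is negative; it is positive if at least one instance is positive."""
    return int(np.any(instance_labels == 1))

# A positive bag may still contain mostly negative instances.
assert mi_bag_label(np.array([0, 0, 1, 0])) == 1
assert mi_bag_label(np.array([0, 0, 0, 0])) == 0
```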

11 For the CBIR task, images are represented as bags and instances correspond to segmented regions of the image. [sent-23, score-0.716]

12 A bag representing a given image is labeled positive if the image contains some object of interest. [sent-24, score-0.555]

13 For text classification, documents are represented as bags and instances correspond to short passages (e. [sent-27, score-0.825]

14 Figure 1: Motivating examples for multiple-instance active learning. [sent-36, score-0.288]

15 (a) In content-based image retrieval, images are represented as bags and instances correspond to segmented image regions. [sent-37, score-0.806]

16 An active MI learner may query which segments belong to the object of interest, such as the gold medal shown in this image. [sent-38, score-0.719]

17 (b) In text classification, documents are bags and the instances represent passages of text. [sent-39, score-0.805]

18 In MI active learning, the learner may query specific passages to determine if they are representative of the positive class at hand. [sent-40, score-0.728]

19 The main challenge of multiple-instance learning is that, to induce an accurate model of the target concept, the learner must determine which instances in positive bags are actually positive, even though the ratio of negatives to positives in these bags can be arbitrarily high. [sent-41, score-1.536]

20 For many MI problems, such as the tasks illustrated in Figure 1, it is possible to obtain labels both at the bag level and directly at the instance level. [sent-42, score-0.58]

21 The approach that we consider here is one that involves selectively obtaining the labels of certain instances in the context of MI learning. [sent-45, score-0.391]

22 In particular, we consider obtaining labels for selected instances in positive bags, since the labels for instances in negative bags are known. [sent-46, score-1.181]

23 In active learning [2], the learner is allowed to ask queries about unlabeled instances. [sent-47, score-0.536]

24 In this way, the oracle (or human annotator) is required to label only instances that are assumed to be most valuable for training. [sent-48, score-0.31]

25 In the standard supervised setting, pool-based active learning typically begins with an initial learner trained with a small set of labeled instances. [sent-49, score-0.481]

26 Then the learner can query instances from a large pool of unlabeled instances, re-train, and repeat. [sent-50, score-0.759]

27 We argue that whereas multiple-instance learning reduces the burden of labeling data by getting labels at a coarse level of granularity, we may also benefit from selectively labeling some part of the training data at a finer level of granularity. [sent-52, score-0.437]

28 Hence, we explore the approach of multiple-instance active learning as a way to efficiently overcome the ambiguity of the MI framework while keeping labeling costs low. [sent-53, score-0.341]

29 The first, which is analogous to standard supervised active learning, is simply to allow the learner to query for the labels of unlabeled bags. [sent-55, score-0.782]

30 A second scenario is one in which all bags in the training set are labeled and the learner is allowed to query for the labels of selected instances from positive bags. [sent-56, score-1.379]

31 For example, the learner might query on particular image segments or passages of text in the CBIR and text classification domains, respectively. [sent-57, score-0.716]

32 If an instance-query result is positive, the learner now has direct evidence for the positive class. [sent-58, score-0.493]

33 If the query result is negative, the learner knows to focus its attention on other instances from that bag, also reducing ambiguity. [sent-59, score-0.684]

34 A third scenario involves querying selected positive bags rather than instances, and obtaining labels for any (or all) instances in such bags. [sent-60, score-0.891]

35 For example, the learner might query a positive image in the CBIR domain, and ask the oracle to label as many segments as desired. [sent-61, score-0.654]

36 A final scenario would assume that some bags are labeled and some are not, and the learner would be able to query on (i) unlabeled bags, (ii) unlabeled instances in positive bags, or (iii) some combination thereof. [sent-62, score-1.409]

37 In the present work, we focus on the second formulation above, where the learner queries selected unlabeled instances from labeled, positive bags. [sent-63, score-0.655]

38 First, we describe the algorithms we use to train MI classifiers and select instance queries for active learning. [sent-65, score-0.395]

39 For MI classification, we seek the conditional probability that the label $y_i$ is positive for bag $B_i$ given its $n$ constituent instances: $P(y_i = 1 \mid B_i = \{B_{i1}, B_{i2}, \ldots, B_{in}\})$. [sent-70, score-0.444]

40 If a classifier can provide an equivalent probability $P(y_{ij} = 1 \mid B_{ij})$ for instance $B_{ij}$, we can use a combining function (such as softmax or noisy-or) to combine the posterior probabilities of all the instances in a bag and estimate its posterior probability $P(y_i = 1 \mid B_i)$. [sent-74, score-0.801]

41 If the model finds an instance likely to be positive, the output of the combining function should find its corresponding bag likely to be positive as well. [sent-76, score-0.567]

42 In order to combine these class probabilities for instances into a class probability for a bag, MILR uses the softmax function: $o_i = P(y_i = 1 \mid B_i) = \mathrm{softmax}_\alpha(o_{i1}, \ldots, o_{in})$. [sent-80, score-0.332]
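
Assuming the standard softmax combining function used with MILR, $\mathrm{softmax}_\alpha(x_1, \ldots, x_n) = \sum_j x_j e^{\alpha x_j} / \sum_j e^{\alpha x_j}$, a minimal sketch looks like this (the value of $\alpha$ is an illustrative choice, not the paper's):

```python
import numpy as np

def softmax_combine(o_inst: np.ndarray, alpha: float = 2.5) -> float:
    """Soft approximation to max over instance posteriors o_ij: as alpha
    grows this approaches max(o_inst); alpha = 0 gives the plain mean."""
    w = np.exp(alpha * o_inst)
    return float(np.dot(o_inst, w) / np.sum(w))

# One positive-looking instance dominates, so the bag looks positive.
print(softmax_combine(np.array([0.1, 0.2, 0.9])))  # close to 0.9
```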

43 In the general MI setting we do not know the labels of instances in positive bags. [sent-84, score-0.371]

44 In the present work, we minimize squared error over the bags, $E(\theta) = \frac{1}{2} \sum_i (y_i - o_i)^2$, where $y_i \in \{0, 1\}$ is the known label of bag $B_i$. [sent-86, score-0.864]
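
Putting the pieces together, the bag-level objective can be sketched as below; the logistic instance model $o_{ij} = \sigma(\theta \cdot B_{ij})$ is the natural reading of MILR (multiple-instance logistic regression), but its exact parameterization is our assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bag_objective(theta, bags, bag_labels, alpha=2.5):
    """E(theta) = 1/2 * sum_i (y_i - o_i)^2, where o_i is the
    softmax-combined bag posterior and o_ij = sigmoid(theta . B_ij) is
    the assumed instance model; `bags` is a list of (n_i, d) arrays."""
    err = 0.0
    for B, y in zip(bags, bag_labels):
        o_inst = sigmoid(B @ theta)            # instance posteriors o_ij
        w = np.exp(alpha * o_inst)
        o_bag = np.dot(o_inst, w) / np.sum(w)  # softmax combining function
        err += 0.5 * (y - o_bag) ** 2
    return err
```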

45 Suppose our active MI learner queries instance Bij and the corresponding instance label yij is provided by the oracle. [sent-90, score-0.881]

46 Consider, though, that in MI learning a labeled instance is effectively the same as a labeled bag that contains only that instance. [sent-93, score-0.62]

47 So when the label for instance $B_{ij}$ is known, we transform the training set for each query by adding a new training tuple $\langle \{B_{ij}\}, y_{ij} \rangle$, where $\{B_{ij}\}$ is a new singleton bag containing only a copy of the queried instance, and $y_{ij}$ is the corresponding label. [sent-94, score-1.176]

48 A copy of the query instance Bij also remains in the original bag Bi , enabling the learner to compute the remaining instance gradients as described below. [sent-95, score-1.048]
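
A minimal sketch of this training-set transformation, assuming the list-of-arrays bag representation from the objective sketch above:

```python
def add_instance_query(bags, labels, i, j, y_ij):
    """After the oracle labels instance j of bag i, append the singleton
    bag <{B_ij}, y_ij> while leaving a copy of the queried instance in
    its original bag, as described in the text."""
    singleton = bags[i][j:j + 1].copy()  # shape (1, d): a one-instance bag
    bags.append(singleton)
    labels.append(y_ij)
    return bags, labels
```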

49 Since the objective function will guide the learner toward classifying the singleton query instance $B_{ij}$ in the positive tuple $\langle \{B_{ij}\}, 1 \rangle$ as positive, it will tend to classify the original bag $B_i$ as positive as well. [sent-96, score-1.125]

50 Conversely, if we add the negative tuple $\langle \{B_{ij}\}, 0 \rangle$, the learner will tend to classify the instance as negative in the original bag, which will affect the other instance gradients via the combining function and guide the learner to focus on other potentially positive instances in that bag. [sent-97, score-1.121]

51 It may seem that this effect on the original bag could be achieved by clamping the instance output oij to yij during training, but this has the undesirable property of eliminating the training signal for the bag and the instance. [sent-98, score-1.081]

52 If yij = 1, the combining function output would be extremely high, making bag error nearly zero, thus minimizing the objective function without any actual parameter updates. [sent-99, score-0.505]

53 If yij = 0, the instance would contribute nothing to the combining function, and thus the learner would get no training signal for this instance (though in this case the learner can still focus on other instances in the bag). [sent-100, score-1.108]

54 It is possible to combine clamped instance outputs with our singleton bag approach to overcome this problem, but our experiments indicate that this has no practical advantage over adding singleton bags alone. [sent-101, score-1.041]

55 Also note that simply adding singleton bags will alter the objective function by adding weight, albeit indirectly, to bags that have been queried more often. [sent-102, score-1.041]

56 To control this effect, we uniformly weight each bag and all its queried singleton bags to sum to 1 when computing the value and gradient for the objective function during training. [sent-103, score-0.925]

57 For example, an unqueried bag has weight 1, while a bag with one instance query and its derived singleton bag each have weight 0.5. [sent-104, score-1.41]
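
The re-weighting is then a one-line computation; a sketch:

```python
def query_weight(num_queried_singletons: int) -> float:
    """Uniform weight so a bag and its queried singleton bags sum to 1:
    an unqueried bag gets 1.0; a bag with one query and its derived
    singleton bag each get 0.5, and so on."""
    return 1.0 / (1 + num_queried_singletons)
```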

58 Now we turn our attention to strategies for selecting query instances for labeling. [sent-107, score-0.527]

59 For probabilistic classifiers, this involves applying the classifier to each unlabeled instance and querying those whose class labels are most uncertain. [sent-109, score-0.341]

60 Recall that the learned model estimates $o_{ij} = P(y_{ij} = 1 \mid B_{ij})$, the probability that instance $B_{ij}$ is positive. [sent-110, score-0.298]

61 We argue that when doing active learning in a multiple-instance setting, the selection criterion should take into account not just uncertainty about a given instance’s class label, but also the extent to which the learner can adequately “explain” the bag to which the instance belongs. [sent-116, score-0.93]

62 For example, the instance that the learner finds most uncertain may belong to the same bag as the instance it finds most positive. [sent-117, score-0.823]

63 In this case, the learned model will have a high value of P (yi = 1|Bi ) for the bag because the value computed by the combining function will be dominated by the output of the positive-looking instance. [sent-118, score-0.38]

64 We propose an uncertainty-based query strategy that weights the uncertainty of Bij in terms of how much it contributes to the classification of bag Bi . [sent-119, score-0.661]

65 As such, we define the MI Uncertainty (MIU) of an instance to be the derivative of bag output with respect to instance output (i.e., $\partial o_i / \partial o_{ij}$). [sent-120, score-0.636]
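
For the softmax combining function, this derivative has the closed form $\partial o_i / \partial o_{ij} = e^{\alpha o_{ij}}(1 + \alpha(o_{ij} - o_i)) / \sum_k e^{\alpha o_{ik}}$ (our own derivation, easy to verify against a finite difference; the paper's full MIU score may further weight this by instance uncertainty, per the preceding sentence):

```python
import numpy as np

def miu_contributions(o_inst: np.ndarray, alpha: float = 2.5) -> np.ndarray:
    """d o_i / d o_ij for the softmax_alpha combining function: how much
    each instance output contributes to the bag output. Instances the
    bag posterior barely depends on score low even if they are uncertain."""
    w = np.exp(alpha * o_inst)
    o_bag = np.dot(o_inst, w) / w.sum()
    return (w / w.sum()) * (1.0 + alpha * (o_inst - o_bag))
```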

66 Another query strategy we consider is to identify the instance that would impart the greatest change to the current model if we knew its label. [sent-124, score-0.428]

67 Since we train MILR with gradient descent, this involves querying the instance which, if $\langle \{B_{ij}\}, y_{ij} \rangle$ is added to the training set, would create the greatest change in the gradient of the objective function (i.e., $\nabla E(\theta)$). [sent-125, score-0.443]

68 Now let $\nabla E^{+}_{ij}(\theta)$ be the new gradient obtained by adding the positive tuple $\langle \{B_{ij}\}, 1 \rangle$ to the training set, and likewise let $\nabla E^{-}_{ij}(\theta)$ be the new gradient if a query results in the negative tuple $\langle \{B_{ij}\}, 0 \rangle$ being added. [sent-132, score-0.508]

69 Since we do not know which label the oracle will provide in advance, we instead calculate the expected length of the gradient based on the learner’s current belief oij in each outcome. [sent-133, score-0.281]

70 More precisely, we define the Expected Gradient Length (EGL) to be: $\mathrm{EGL}(B_{ij}) = o_{ij} \, \| \nabla E^{+}_{ij}(\theta) \| + (1 - o_{ij}) \, \| \nabla E^{-}_{ij}(\theta) \|$. [sent-134, score-0.316]
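
A sketch of the EGL score; computing the two candidate gradients means re-evaluating $\nabla E(\theta)$ with the candidate singleton tuple added under each possible label, and the gradient routine itself is assumed given:

```python
import numpy as np

def egl_score(o_ij: float, grad_plus: np.ndarray, grad_minus: np.ndarray) -> float:
    """EGL(B_ij) = o_ij * ||grad E+_ij|| + (1 - o_ij) * ||grad E-_ij||:
    the expected length of the new objective gradient, weighted by the
    model's current belief o_ij about the label the oracle will return."""
    return o_ij * np.linalg.norm(grad_plus) + (1.0 - o_ij) * np.linalg.norm(grad_minus)
```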

71 This strategy can be generalized to query for other properties in non-MI active learning as well. [sent-137, score-0.411]

72 The authors of [11] use a related approach to determine the expected label of candidate query instances when combining active learning with graph-based semi-supervised learning. [sent-139, score-0.72]

73 We modified the collection by manually annotating the instance segments that belong to the labeled object for each image using a graphical interface we developed. [sent-147, score-0.341]

74 This corpus was chosen because it is an established benchmark for text classification, and because the source texts—USENET posts from the early 1990s—are relatively short (in the MI setting, instances are usually paragraphs or short passages [1, 9]). [sent-149, score-0.485]

75 For each of the 20 news categories, we generate artificial bags of approximately 50 posts (instances) each by randomly sampling from the target class (i.e., newsgroup category) at a rate of 3% for positive bags, with remaining instances (and all instances for negative bags) drawn uniformly from the other classes. [sent-150, score-0.462] [sent-152, score-0.58]
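
A sketch of this artificial-bag construction; the bag size (~50) and 3% positive sampling rate come from the text, while forcing at least one target-class instance into each positive bag is our assumption (needed for the bag label to hold):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_bag(target_docs, other_docs, positive, bag_size=50, pos_rate=0.03):
    """Positive bags draw ~3% of their instances from the target newsgroup
    and the rest uniformly from the other classes; negative bags draw only
    from the other classes."""
    n_pos = max(1, rng.binomial(bag_size, pos_rate)) if positive else 0
    pos_idx = rng.choice(len(target_docs), size=n_pos, replace=False)
    neg_idx = rng.choice(len(other_docs), size=bag_size - n_pos, replace=False)
    return [target_docs[k] for k in pos_idx] + [other_docs[k] for k in neg_idx]
```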

77 We construct a data set of 100 bags (50 positives and 50 negatives) for each class. [sent-155, score-0.45]

78 We evaluate our methods by constructing learning curves that plot the area under the ROC curve (AUROC) as a function of instances queried for each data set and selection strategy. [sent-160, score-0.353]

79 The initial point in all experiments is the AUROC for a model trained on labeled bags from the training set without any instance queries. [sent-161, score-0.69]

80 Following previous work on the CBIR problem [8], we average results for SIVAL over 20 independent runs for each image class, where the learner begins with 20 randomly drawn positive bags (from which instances may be queried) and 20 random negative bags. [sent-162, score-1.0]

81 The model is then evaluated on the remainder of the unlabeled bags, and labeled query instances are added to the training set in batches of size q = 2. [sent-163, score-0.692]

82 For 20 Newsgroups, we average results using 10-fold cross-validation for each newsgroup category, using a query batch size of q = 5. [sent-164, score-0.266]
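
Taken together, the protocol is a pool-based MI active-learning loop; a sketch with hypothetical `model.fit` / `model.instance_score` / `oracle` interfaces (none of these names are from the paper):

```python
def mi_active_loop(model, bags, labels, oracle, n_batches=10, q=2):
    """Train, score unlabeled instances in positive bags under a query
    strategy (MIU, EGL, ...), query the top q, fold the answers in as
    singleton bags, and retrain."""
    queried = set()
    n_orig = len(bags)  # only query instances from the original bags
    for _ in range(n_batches):
        model.fit(bags, labels)
        candidates = [(i, j) for i in range(n_orig) if labels[i] == 1
                      for j in range(len(bags[i])) if (i, j) not in queried]
        candidates.sort(key=lambda ij: model.instance_score(bags[ij[0]][ij[1]]),
                        reverse=True)
        for i, j in candidates[:q]:
            queried.add((i, j))
            y_ij = oracle(i, j)                   # instance label from annotator
            bags.append(bags[i][j:j + 1].copy())  # singleton bag <{B_ij}, y_ij>
            labels.append(y_ij)
    return model
```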

83 In Table 1 we summarize all curves by reporting the average improvement made by each query selection strategy over the initial MILR model (before any instance queries) for various points along the learning curve. [sent-167, score-0.487]

84 Table 2 presents a more detailed comparison of the initial model against each query selection method at a fixed point early on in active learning (10 query batches). [sent-168, score-0.712]

85 As Table 1 shows, random selection at 100 queries fails to be competitive with the three active query strategies after half as many queries. [sent-266, score-0.566]

86 On 20 Newsgroups tasks, random selection has a slight negative effect (if any) early on, possibly because it lacks a focused search for positive instances (of which there are only one or two per bag). [sent-267, score-0.377]

87 Finally, MIU appears to be a well-suited query strategy for this formulation of MI active learning. [sent-269, score-0.411]

88 On both data sets, it consistently improves the initial MI learner, usually with statistical significance, and often approaches the asymptotic level of accuracy with fewer labeled instances than the other two active methods. [sent-270, score-0.487]

89 MIU’s gains over these other query strategies are not usually statistically significant, however, and in the long run it is generally matched or slightly surpassed by them. [sent-272, score-0.311]

90 Table 2: Detailed comparison of the initial MI learner against various query strategies after 10 query batches (20 instances for SIVAL, 50 instances for 20 Newsgroups). [sent-274, score-1.276]

91 It is also interesting to note that in an earlier version of our learning algorithm, we did not normalize weights for bags and instance-query singleton bags when learning with labels at mixed granularities. [sent-539, score-1.034]

92 Instead, all such bags were weighted equally and the objective function was slightly altered. [sent-540, score-0.452]

93 In those experiments, MIU's accuracy was roughly equivalent to the figures reported here, although the improvements for all other query strategies (especially random selection) were lower. [sent-541, score-0.289]

94 5 Conclusion We have presented multiple-instance active learning, a novel framework for reducing the labeling burden by obtaining labels at a coarse granularity, and then selectively labeling at finer levels. [sent-542, score-0.45]

95 This approach is useful when bag labels are easily acquired, and instance labels can be obtained but are expensive. [sent-543, score-0.634]

96 In the present work, we explored the case where an MI learner may query unlabeled instances from positively labeled bags in order to reduce the inherent ambiguity of the MI representation, while keeping label costs low. [sent-544, score-1.341]

97 We also described a simple method for learning from labels at both the bag-level and instance-level, and showed that querying instance-level labels through active learning is beneficial in content-based image retrieval and text categorization problems. [sent-545, score-0.5]

98 In addition, we introduced two active query selection strategies motivated by this work, MI Uncertainty and Expected Gradient Length, and demonstrated that they are well-suited to MI active learning. [sent-546, score-0.624]

99 Of particular interest is the setting where, initially, some bags are labeled and others are not, and the learner is allowed to query on (i) unlabeled bags, (ii) unlabeled instances from positively labeled bags, or (iii) some combination thereof. [sent-548, score-1.461]

100 We also plan to investigate other selection methods for different query formats, such as “label any or all positive instances in this bag,” which may be more natural for some MI learning problems. [sent-549, score-0.565]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('bags', 0.433), ('mi', 0.33), ('bag', 0.322), ('bij', 0.255), ('miu', 0.244), ('query', 0.241), ('instances', 0.238), ('egl', 0.229), ('learner', 0.205), ('oij', 0.158), ('active', 0.148), ('instance', 0.14), ('yij', 0.106), ('cbir', 0.1), ('milr', 0.1), ('queries', 0.09), ('passages', 0.087), ('labels', 0.086), ('auroc', 0.086), ('labeled', 0.079), ('uncertainty', 0.076), ('newsgroups', 0.075), ('unlabeled', 0.075), ('sival', 0.072), ('singleton', 0.063), ('softmax', 0.06), ('mouse', 0.057), ('queried', 0.053), ('labeling', 0.053), ('bi', 0.052), ('label', 0.052), ('granularity', 0.05), ('querying', 0.05), ('strategies', 0.048), ('positive', 0.047), ('text', 0.047), ('selectively', 0.046), ('genome', 0.046), ('image', 0.045), ('segments', 0.044), ('batches', 0.043), ('eij', 0.043), ('combining', 0.041), ('tuple', 0.041), ('selection', 0.039), ('retrieval', 0.038), ('classi', 0.038), ('gradient', 0.035), ('oi', 0.034), ('tasks', 0.032), ('negative', 0.032), ('medal', 0.029), ('posts', 0.029), ('rahmani', 0.029), ('spritecan', 0.029), ('translucentbowl', 0.029), ('segmented', 0.029), ('supervised', 0.027), ('greatest', 0.025), ('newsgroup', 0.025), ('ray', 0.025), ('curves', 0.023), ('coarse', 0.023), ('yi', 0.023), ('paragraphs', 0.023), ('negatives', 0.023), ('gains', 0.022), ('strategy', 0.022), ('initial', 0.022), ('cheaply', 0.021), ('acquire', 0.021), ('ner', 0.021), ('dietterich', 0.021), ('obtaining', 0.021), ('early', 0.021), ('short', 0.02), ('oracle', 0.02), ('burden', 0.02), ('adding', 0.02), ('icml', 0.019), ('texts', 0.019), ('wins', 0.019), ('gold', 0.019), ('mixed', 0.019), ('database', 0.019), ('objective', 0.019), ('winning', 0.018), ('positively', 0.018), ('allowed', 0.018), ('logistic', 0.018), ('train', 0.017), ('positives', 0.017), ('object', 0.017), ('output', 0.017), ('images', 0.016), ('training', 0.016), ('scenario', 0.016), ('belong', 0.016), ('document', 0.016), ('length', 0.016)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 136 nips-2007-Multiple-Instance Active Learning

Author: Burr Settles, Mark Craven, Soumya Ray

Abstract: We present a framework for active learning in the multiple-instance (MI) setting. In an MI learning problem, instances are naturally organized into bags and it is the bags, instead of individual instances, that are labeled for training. MI learners assume that every instance in a bag labeled negative is actually negative, whereas at least one instance in a bag labeled positive is actually positive. We consider the particular case in which an MI learner is allowed to selectively query unlabeled instances from positive bags. This approach is well motivated in domains in which it is inexpensive to acquire bag labels and possible, but expensive, to acquire instance labels. We describe a method for learning from labels at mixed levels of granularity, and introduce two active query selection strategies motivated by the MI setting. Our experiments show that learning from instance labels can significantly improve performance of a basic MI learning algorithm in two multiple-instance domains: content-based image retrieval and text classification. 1

2 0.24162343 69 nips-2007-Discriminative Batch Mode Active Learning

Author: Yuhong Guo, Dale Schuurmans

Abstract: Active learning sequentially selects unlabeled instances to label with the goal of reducing the effort needed to learn a good classifier. Most previous studies in active learning have focused on selecting one unlabeled instance to label at a time while retraining in each iteration. Recently a few batch mode active learning approaches have been proposed that select a set of most informative unlabeled instances in each iteration under the guidance of heuristic scores. In this paper, we propose a discriminative batch mode active learning approach that formulates the instance selection task as a continuous optimization problem over auxiliary instance selection variables. The optimization is formulated to maximize the discriminative classification performance of the target classifier, while also taking the unlabeled data into account. Although the objective is not convex, we can manipulate a quasi-Newton method to obtain a good local solution. Our empirical studies on UCI datasets show that the proposed active learning is more effective than current state-of-the-art batch mode active learning algorithms. 1

3 0.17452338 42 nips-2007-CPR for CSPs: A Probabilistic Relaxation of Constraint Propagation

Author: Luis E. Ortiz

Abstract: This paper proposes constraint propagation relaxation (CPR), a probabilistic approach to classical constraint propagation that provides another view on the whole parametric family of survey propagation algorithms SP(ρ). More importantly, the approach elucidates the implicit, but fundamental assumptions underlying SP(ρ), thus shedding some light on its effectiveness and leading to applications beyond k-SAT. 1

4 0.09070278 6 nips-2007-A General Boosting Method and its Application to Learning Ranking Functions for Web Search

Author: Zhaohui Zheng, Hongyuan Zha, Tong Zhang, Olivier Chapelle, Keke Chen, Gordon Sun

Abstract: We present a general boosting method extending functional gradient boosting to optimize complex loss functions that are encountered in many machine learning problems. Our approach is based on optimization of quadratic upper bounds of the loss functions which allows us to present a rigorous convergence analysis of the algorithm. More importantly, this general framework enables us to use a standard regression base learner such as single regression tree for fitting any loss function. We illustrate an application of the proposed method in learning ranking functions for Web search by combining both preference data and labeled data for training. We present experimental results for Web search using data from a commercial search engine that show significant improvements of our proposed methods over some existing methods. 1

5 0.08289215 166 nips-2007-Regularized Boost for Semi-Supervised Learning

Author: Ke Chen, Shihai Wang

Abstract: Semi-supervised inductive learning concerns how to learn a decision rule from a data set containing both labeled and unlabeled data. Several boosting algorithms have been extended to semi-supervised learning with various strategies. To our knowledge, however, none of them takes local smoothness constraints among data into account during ensemble learning. In this paper, we introduce a local smoothness regularizer to semi-supervised boosting algorithms based on the universal optimization framework of margin cost functionals. Our regularizer is applicable to existing semi-supervised boosting algorithms to improve their generalization and speed up their training. Comparative results on synthetic, benchmark and real world tasks demonstrate the effectiveness of our local smoothness regularizer. We discuss relevant issues and relate our regularizer to previous work. 1

6 0.082811914 175 nips-2007-Semi-Supervised Multitask Learning

7 0.080969088 143 nips-2007-Object Recognition by Scene Alignment

8 0.078313388 110 nips-2007-Learning Bounds for Domain Adaptation

9 0.078235917 16 nips-2007-A learning framework for nearest neighbor search

10 0.073901296 15 nips-2007-A general agnostic active learning algorithm

11 0.069858484 137 nips-2007-Multiple-Instance Pruning For Learning Efficient Cascade Detectors

12 0.062853754 19 nips-2007-Active Preference Learning with Discrete Choice Data

13 0.059040595 201 nips-2007-The Value of Labeled and Unlabeled Examples when the Model is Imperfect

14 0.0579895 113 nips-2007-Learning Visual Attributes

15 0.053836208 187 nips-2007-Structured Learning with Approximate Inference

16 0.05064277 186 nips-2007-Statistical Analysis of Semi-Supervised Regression

17 0.049584329 126 nips-2007-McRank: Learning to Rank Using Multiple Classification and Gradient Boosting

18 0.048935521 2 nips-2007-A Bayesian LDA-based model for semi-supervised part-of-speech tagging

19 0.047849316 75 nips-2007-Efficient Bayesian Inference for Dynamically Changing Graphs

20 0.047617219 172 nips-2007-Scene Segmentation with CRFs Learned from Partially Labeled Images


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.144), (1, 0.043), (2, -0.1), (3, 0.049), (4, 0.056), (5, 0.106), (6, 0.111), (7, -0.065), (8, 0.167), (9, 0.008), (10, -0.004), (11, 0.024), (12, -0.077), (13, -0.093), (14, -0.008), (15, 0.109), (16, -0.039), (17, 0.017), (18, 0.135), (19, -0.048), (20, 0.127), (21, -0.009), (22, 0.267), (23, -0.077), (24, 0.011), (25, -0.036), (26, 0.185), (27, -0.013), (28, -0.026), (29, 0.079), (30, -0.154), (31, -0.215), (32, 0.081), (33, -0.053), (34, -0.076), (35, -0.022), (36, 0.081), (37, 0.099), (38, -0.008), (39, -0.004), (40, 0.043), (41, -0.223), (42, 0.035), (43, 0.069), (44, 0.047), (45, 0.034), (46, 0.01), (47, -0.054), (48, 0.042), (49, 0.186)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97261208 136 nips-2007-Multiple-Instance Active Learning

Author: Burr Settles, Mark Craven, Soumya Ray

Abstract: We present a framework for active learning in the multiple-instance (MI) setting. In an MI learning problem, instances are naturally organized into bags and it is the bags, instead of individual instances, that are labeled for training. MI learners assume that every instance in a bag labeled negative is actually negative, whereas at least one instance in a bag labeled positive is actually positive. We consider the particular case in which an MI learner is allowed to selectively query unlabeled instances from positive bags. This approach is well motivated in domains in which it is inexpensive to acquire bag labels and possible, but expensive, to acquire instance labels. We describe a method for learning from labels at mixed levels of granularity, and introduce two active query selection strategies motivated by the MI setting. Our experiments show that learning from instance labels can significantly improve performance of a basic MI learning algorithm in two multiple-instance domains: content-based image retrieval and text classification. 1

2 0.71818554 69 nips-2007-Discriminative Batch Mode Active Learning

Author: Yuhong Guo, Dale Schuurmans

Abstract: Active learning sequentially selects unlabeled instances to label with the goal of reducing the effort needed to learn a good classifier. Most previous studies in active learning have focused on selecting one unlabeled instance to label at a time while retraining in each iteration. Recently a few batch mode active learning approaches have been proposed that select a set of most informative unlabeled instances in each iteration under the guidance of heuristic scores. In this paper, we propose a discriminative batch mode active learning approach that formulates the instance selection task as a continuous optimization problem over auxiliary instance selection variables. The optimization is formulated to maximize the discriminative classification performance of the target classifier, while also taking the unlabeled data into account. Although the objective is not convex, we can manipulate a quasi-Newton method to obtain a good local solution. Our empirical studies on UCI datasets show that the proposed active learning is more effective than current state-of-the-art batch mode active learning algorithms. 1

3 0.54724181 15 nips-2007-A general agnostic active learning algorithm

Author: Sanjoy Dasgupta, Claire Monteleoni, Daniel J. Hsu

Abstract: We present an agnostic active learning algorithm for any hypothesis class of bounded VC dimension under arbitrary data distributions. Most previous work on active learning either makes strong distributional assumptions, or else is computationally prohibitive. Our algorithm extends the simple scheme of Cohn, Atlas, and Ladner [1] to the agnostic setting, using reductions to supervised learning that harness generalization bounds in a simple but subtle manner. We provide a fall-back guarantee that bounds the algorithm’s label complexity by the agnostic PAC sample complexity. Our analysis yields asymptotic label complexity improvements for certain hypothesis classes and distributions. We also demonstrate improvements experimentally. 1

4 0.51858377 42 nips-2007-CPR for CSPs: A Probabilistic Relaxation of Constraint Propagation

Author: Luis E. Ortiz

Abstract: This paper proposes constraint propagation relaxation (CPR), a probabilistic approach to classical constraint propagation that provides another view on the whole parametric family of survey propagation algorithms SP(ρ). More importantly, the approach elucidates the implicit, but fundamental assumptions underlying SP(ρ), thus shedding some light on its effectiveness and leading to applications beyond k-SAT. 1

5 0.43418333 19 nips-2007-Active Preference Learning with Discrete Choice Data

Author: Brochu Eric, Nando D. Freitas, Abhijeet Ghosh

Abstract: We propose an active learning algorithm that learns a continuous valuation model from discrete preferences. The algorithm automatically decides what items are best presented to an individual in order to find the item that they value highly in as few trials as possible, and exploits quirks of human psychology to minimize time and cognitive burden. To do this, our algorithm maximizes the expected improvement at each query without accurately modelling the entire valuation surface, which would be needlessly expensive. The problem is particularly difficult because the space of choices is infinite. We demonstrate the effectiveness of the new algorithm compared to related active learning methods. We also embed the algorithm within a decision making tool for assisting digital artists in rendering materials. The tool finds the best parameters while minimizing the number of queries. 1

6 0.41602618 201 nips-2007-The Value of Labeled and Unlabeled Examples when the Model is Imperfect

7 0.34435904 113 nips-2007-Learning Visual Attributes

8 0.33475044 16 nips-2007-A learning framework for nearest neighbor search

9 0.32769048 166 nips-2007-Regularized Boost for Semi-Supervised Learning

10 0.32293153 139 nips-2007-Nearest-Neighbor-Based Active Learning for Rare Category Detection

11 0.30403209 27 nips-2007-Anytime Induction of Cost-sensitive Trees

12 0.29911703 6 nips-2007-A General Boosting Method and its Application to Learning Ranking Functions for Web Search

13 0.28761581 76 nips-2007-Efficient Convex Relaxation for Transductive Support Vector Machine

14 0.2801176 175 nips-2007-Semi-Supervised Multitask Learning

15 0.27498645 110 nips-2007-Learning Bounds for Domain Adaptation

16 0.26821184 28 nips-2007-Augmented Functional Time Series Representation and Forecasting with Gaussian Processes

17 0.26435587 172 nips-2007-Scene Segmentation with CRFs Learned from Partially Labeled Images

18 0.26348248 114 nips-2007-Learning and using relational theories

19 0.25901061 143 nips-2007-Object Recognition by Scene Alignment

20 0.24838361 137 nips-2007-Multiple-Instance Pruning For Learning Efficient Cascade Detectors


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.024), (11, 0.012), (13, 0.027), (16, 0.029), (18, 0.015), (19, 0.01), (21, 0.113), (31, 0.016), (34, 0.015), (35, 0.033), (47, 0.076), (83, 0.125), (85, 0.011), (87, 0.032), (89, 0.331), (90, 0.038)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.75994408 136 nips-2007-Multiple-Instance Active Learning

Author: Burr Settles, Mark Craven, Soumya Ray

Abstract: We present a framework for active learning in the multiple-instance (MI) setting. In an MI learning problem, instances are naturally organized into bags and it is the bags, instead of individual instances, that are labeled for training. MI learners assume that every instance in a bag labeled negative is actually negative, whereas at least one instance in a bag labeled positive is actually positive. We consider the particular case in which an MI learner is allowed to selectively query unlabeled instances from positive bags. This approach is well motivated in domains in which it is inexpensive to acquire bag labels and possible, but expensive, to acquire instance labels. We describe a method for learning from labels at mixed levels of granularity, and introduce two active query selection strategies motivated by the MI setting. Our experiments show that learning from instance labels can significantly improve performance of a basic MI learning algorithm in two multiple-instance domains: content-based image retrieval and text classification. 1

2 0.49994928 94 nips-2007-Gaussian Process Models for Link Analysis and Transfer Learning

Author: Kai Yu, Wei Chu

Abstract: This paper aims to model relational data on edges of networks. We describe appropriate Gaussian Processes (GPs) for directed, undirected, and bipartite networks. The inter-dependencies of edges can be effectively modeled by adapting the GP hyper-parameters. The framework suggests an intimate connection between link prediction and transfer learning, which were traditionally two separate research topics. We develop an efficient learning algorithm that can handle a large number of observations. The experimental results on several real-world data sets verify superior learning capacity. 1

3 0.49758327 69 nips-2007-Discriminative Batch Mode Active Learning

Author: Yuhong Guo, Dale Schuurmans

Abstract: Active learning sequentially selects unlabeled instances to label with the goal of reducing the effort needed to learn a good classifier. Most previous studies in active learning have focused on selecting one unlabeled instance to label at a time while retraining in each iteration. Recently a few batch mode active learning approaches have been proposed that select a set of most informative unlabeled instances in each iteration under the guidance of heuristic scores. In this paper, we propose a discriminative batch mode active learning approach that formulates the instance selection task as a continuous optimization problem over auxiliary instance selection variables. The optimization is formulated to maximize the discriminative classification performance of the target classifier, while also taking the unlabeled data into account. Although the objective is not convex, we can manipulate a quasi-Newton method to obtain a good local solution. Our empirical studies on UCI datasets show that the proposed active learning is more effective than current state-of-the art batch mode active learning algorithms. 1

4 0.49310496 93 nips-2007-GRIFT: A graphical model for inferring visual classification features from human data

Author: Michael Ross, Andrew Cohen

Abstract: This paper describes a new model for human visual classification that enables the recovery of image features that explain human subjects’ performance on different visual classification tasks. Unlike previous methods, this algorithm does not model their performance with a single linear classifier operating on raw image pixels. Instead, it represents classification as the combination of multiple feature detectors. This approach extracts more information about human visual classification than previous methods and provides a foundation for further exploration. 1

5 0.49163118 172 nips-2007-Scene Segmentation with CRFs Learned from Partially Labeled Images

Author: Bill Triggs, Jakob J. Verbeek

Abstract: Conditional Random Fields (CRFs) are an effective tool for a variety of different data segmentation and labeling tasks including visual scene interpretation, which seeks to partition images into their constituent semantic-level regions and assign appropriate class labels to each region. For accurate labeling it is important to capture the global context of the image as well as local information. We introduce a CRF based scene labeling model that incorporates both local features and features aggregated over the whole image or large sections of it. Secondly, traditional CRF learning requires fully labeled datasets which can be costly and troublesome to produce. We introduce a method for learning CRFs from datasets with many unlabeled nodes by marginalizing out the unknown labels so that the log-likelihood of the known ones can be maximized by gradient ascent. Loopy Belief Propagation is used to approximate the marginals needed for the gradient and log-likelihood calculations and the Bethe free-energy approximation to the log-likelihood is monitored to control the step size. Our experimental results show that effective models can be learned from fragmentary labelings and that incorporating top-down aggregate features significantly improves the segmentations. The resulting segmentations are compared to the state-of-the-art on three different image datasets. 1

6 0.4915106 97 nips-2007-Hidden Common Cause Relations in Relational Learning

7 0.49007607 209 nips-2007-Ultrafast Monte Carlo for Statistical Summations

8 0.48953903 212 nips-2007-Using Deep Belief Nets to Learn Covariance Kernels for Gaussian Processes

9 0.48800898 18 nips-2007-A probabilistic model for generating realistic lip movements from speech

10 0.48557702 88 nips-2007-Fast and Scalable Training of Semi-Supervised CRFs with Application to Activity Recognition

11 0.48528153 175 nips-2007-Semi-Supervised Multitask Learning

12 0.48506039 56 nips-2007-Configuration Estimates Improve Pedestrian Finding

13 0.48462802 19 nips-2007-Active Preference Learning with Discrete Choice Data

14 0.48355624 6 nips-2007-A General Boosting Method and its Application to Learning Ranking Functions for Web Search

15 0.48353884 153 nips-2007-People Tracking with the Laplacian Eigenmaps Latent Variable Model

16 0.48289934 156 nips-2007-Predictive Matrix-Variate t Models

17 0.48239163 186 nips-2007-Statistical Analysis of Semi-Supervised Regression

18 0.48221472 73 nips-2007-Distributed Inference for Latent Dirichlet Allocation

19 0.48217714 138 nips-2007-Near-Maximum Entropy Models for Binary Neural Representations of Natural Images

20 0.48090002 104 nips-2007-Inferring Neural Firing Rates from Spike Trains Using Gaussian Processes