nips nips2006 nips2006-136 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Zhi-hua Zhou, Min-ling Zhang
Abstract: In this paper, we formalize multi-instance multi-label learning, where each training example is associated with not only multiple instances but also multiple class labels. Such a problem can occur in many real-world tasks, e.g. an image usually contains multiple patches each of which can be described by a feature vector, and the image can belong to multiple categories since its semantics can be recognized in different ways. We analyze the relationship between multi-instance multi-label learning and the learning frameworks of traditional supervised learning, multiinstance learning and multi-label learning. Then, we propose the M IML B OOST and M IML S VM algorithms which achieve good performance in an application to scene classification. 1
Reference: text
sentIndex sentText sentNum sentScore
1 cn Abstract In this paper, we formalize multi-instance multi-label learning, where each training example is associated with not only multiple instances but also multiple class labels. [sent-4, score-0.15]
2 an image usually contains multiple patches each of which can be described by a feature vector, and the image can belong to multiple categories since its semantics can be recognized in different ways. [sent-7, score-0.123]
3 We analyze the relationship between multi-instance multi-label learning and the learning frameworks of traditional supervised learning, multiinstance learning and multi-label learning. [sent-8, score-0.23]
4 Then, we propose the M IML B OOST and M IML S VM algorithms which achieve good performance in an application to scene classification. [sent-9, score-0.091]
5 1 Introduction In traditional supervised learning, an object is represented by an instance (or feature vector) and associated with a class label. [sent-10, score-0.172]
6 Formally, let X denote the instance space (or feature space) and Y the set of class labels. [sent-11, score-0.057]
7 Then the task is to learn a function f : X → Y from a given data set {(x1 , y1 ), (x2 , y2 ), · · · , (xm , ym )}, where xi ∈ X is an instance and yi ∈ Y the known label of xi . [sent-12, score-0.232]
8 Although the above formalization is prevailing and successful, there are many real-world problems which do not fit this framework well, where a real-world object may be associated with a number of instances and a number of labels simultaneously. [sent-13, score-0.129]
9 For example, an image usually contains multiple patches each can be represented by an instance, while in image classification such an image can belong to several classes simultaneously, e. [sent-14, score-0.121]
10 an image can belong to mountains as well as Africa. [sent-16, score-0.083]
11 Another example is text categorization, where a document usually contains multiple sections each of which can be represented as an instance, and the document can be regarded as belonging to different categories if it was viewed from different aspects, e. [sent-17, score-0.042]
12 Web mining is a further example, where each of the links can be regarded as an instance while the web page itself can be recognized as news page, sports page, soccer page, etc. [sent-20, score-0.1]
13 In order to deal with such problems, in this paper we formalize multi-instance multi-label learning (abbreviated as M IML). [sent-21, score-0.036]
14 In this learning framework, a training example is described by multiple instances and associated with multiple class labels. [sent-22, score-0.158]
15 Formally, let X denote the instance space and Y the set of class labels. [sent-23, score-0.057]
16 Here ni denotes the number of instances in Xi and li the number of labels in Yi . [sent-25, score-0.149]
17 After analyzing the relationship between M IML and the frameworks of traditional supervised learning, multi-instance learning and multi-label learning, we propose two M IML algorithms, M IML - B OOST and M IML S VM. [sent-26, score-0.154]
18 Application to scene classification shows that, solving some real-world problems in the M IML framework can achieve better performance than solving them in existing frameworks such as multi-instance learning and multi-label learning. [sent-27, score-0.151]
19 2 Multi-Instance Multi-Label Learning We start by investigating the relationship between M IML and the frameworks of traditional supervised learning, multi-instance learning and multi-label learning, and then we develop some solutions. [sent-28, score-0.154]
20 Multi-instance learning [4] studies the problem where a real-world object described by a number of instances is associated with one class label. [sent-29, score-0.143]
21 Formally, the task is to learn a function fM IL : 2X → {−1, +1} from a given data set {(X1 , y1 ), (X2 , y2 ), · · · , (Xm , ym )}, where Xi ⊆ X is a set of (i) (i) (i) (i) instances {x1 , x2 , · · · , xni }, xj ∈ X (j = 1, 2, · · · , ni ), yi ∈ {−1, +1} is the label of Xi . [sent-30, score-0.309]
22 1 Multi-instance learning techniques have been successfully applied to diverse applications including scene classification [3, 7]. [sent-31, score-0.096]
23 Multi-label learning [8] studies the problem where a real-world object described by one instance is associated with a number of class labels. [sent-32, score-0.114]
24 Formally, the task is to learn a function fM LL : X → 2Y from a given data set {(x1 , Y1 ), (x2 , Y2 ), · · · , (xm , Ym )}, where xi ∈ X is an instance and Yi ⊆ Y (i) (i) (i) (i) a set of labels {y1 , y2 , · · · , yli }, yk ∈ Y (k = 1, 2, · · · , li ). [sent-33, score-0.155]
25 2 Multi-label learning techniques have also been successfully applied to scene classification [1]. [sent-34, score-0.096]
26 In fact, the multi- learning frameworks result from the ambiguity in representing real-world objects. [sent-35, score-0.09]
27 Multi-instance learning studies the ambiguity in the input space (or instance space), where an object has many alternative input descriptions, i. [sent-36, score-0.112]
28 instances; multi-label learning studies the ambiguity in the output space (or label space), where an object has many alternative output descriptions, i. [sent-38, score-0.109]
29 We illustrate the differences among these learning frameworks in Figure 1. [sent-41, score-0.06]
30 Thus, we can tackle M IML by identifying its equivalence in the traditional supervised learning framework, using multi-instance learning or multi-label learning as the bridge. [sent-43, score-0.146]
31 1 According to notions used in multi-instance learning, (Xi , yi ) is a labeled bag while Xi an unlabeled bag. [sent-44, score-0.144]
32 Although most works on multi-label learning assume that an instance can be associated with multiple valid labels, there are also works assuming that only one of the labels associated with an instance is correct [6]. [sent-45, score-0.207]
33 2 Solution 1: Using multi-instance learning as the bridge: We can transform a M IML learning task, i. [sent-47, score-0.069]
34 to learn a function fM IM L : 2X → 2Y , into a multi-instance learning task, i. [sent-49, score-0.044]
35 We can transform this multi-instance learning task further into a traditional supervised learning task, i. [sent-54, score-0.168]
36 to learn a function fSISL : X × Y → {−1, +1}, under (i) a constraint specifying how to derive fM IL (Xi , y) from fSISL (xj , y) (j = 1, · · · , ni ). [sent-56, score-0.077]
37 Here the constraint can be fM IL (Xi , y) = n (i) i sign[ j=1 fSISL (xj , y)] which has been used in transforming multi-instance learning tasks into traditional supervised learning tasks [9]. [sent-58, score-0.143]
38 Solution 2: Using multi-label learning as the bridge: We can also transform a M IML learning task, i. [sent-60, score-0.069]
39 to learn a function fM IM L : 2X → 2Y , into a multi-label learning task, i. [sent-62, score-0.044]
40 For any zi ∈ Z, fM LL (zi ) = fM IM L (Xi ) if zi = φ(Xi ), φ : 2X → Z. [sent-65, score-0.058]
41 We can transform this multi-label learning task further into a traditional supervised learning task, i. [sent-67, score-0.168]
42 Here the mapping φ can be implemented with constructive clustering which has been used in transforming multi-instance bags into traditional single-instances [11]. [sent-72, score-0.126]
43 Note that [(Xu , yv ), Ψ(Xu , yv )] (v = 1, 2, · · · , |Y|) is a labeled multi-instance (u) (u) bag where (Xu , yv ) is a bag containing nu number of instances, i. [sent-83, score-0.412]
44 {(x1 , yv ), (x2 , yv ), · · · , (u) (xnu , yv )}, and Ψ(Xu , yv ) ∈ {+1, −1} is the label of this bag. [sent-85, score-0.294]
45 Then, from the data set a multi-instance learning function fM IL can be learned, which can accomplish the desired M IML function because fM IM L (X ∗ ) = {y| argy∈Y (sign[fM IL (X ∗ , y)] = +1)}. [sent-92, score-0.038]
46 For convenience, let (B, g) denote the bag [(X, y), Ψ(X, y)]. [sent-94, score-0.11]
47 the label Africa of an image is usually triggered by several patches jointly instead of by only one patch. [sent-99, score-0.067]
48 Table 1: The M IML B OOST algorithm 1 Transform each M IML example (Xu , Yu ) (u = 1, 2, · · · , m) into |Y| number of multiinstance bags {[(Xu , y1 ), Ψ(Xu , y1 )], · · · , [(Xu , y|Y| ), Ψ(Xu , y|Y| )]}. [sent-100, score-0.083]
49 2 Initialize weight of each bag to W (i) = 3 1 m×|Y| (i = 1, 2, · · · , m × |Y|). [sent-102, score-0.11]
50 Repeat for t = 1, 2, · · · , T iterations: (i) 3a Set Wj = W (i) /ni (i = 1, 2, · · · , m × |Y|), assign the bag’s label Ψ(X (i) , y (i) ) (i) to each of its instances (xj , y (i) ) (j = 1, 2, · · · , ni ), and build an instance-level (i) predictor ht [(xj , y (i) )] ∈ {−1, +1}. [sent-103, score-0.163]
51 3b For the ith bag, compute the error rate e(i) ∈ [0, 1] by counting the number of ni misclassified instances within the bag, i. [sent-104, score-0.125]
52 e(i) = (i) j=1 (i) [ t [(x j ,y (i) )]=Ψ(X (i) ,y (i) )] [h ] ni . [sent-106, score-0.055]
53 3c 3d 3e Compute ct = arg minct If ct ≤ 0, go to Step 4. [sent-107, score-0.048]
54 In each boosting round, the aim is to 2 r(g=−1|B) expand F(B) into F(B) + cf (B), i. [sent-114, score-0.052]
55 Assuming all instances in a bag contribute equally and independently to the bag’s label, f (B) = n1 j h(bj ) can be derived, where h(bj ) ∈ {−1, +1} is the prediction of the B instance-level classifier h(·) for the jth instance in bag B, and nB is the number of instances in B. [sent-117, score-0.401]
56 It has been shown by [9] that the best f (B) to be added can be achieved by seeking h(·) which (i) ni 1 maximizes i j=1 [ ni W (i) g (i) h(bj )], given the bag-level weights W = exp(−gF(B)). [sent-118, score-0.11]
57 By assigning each instance the label of its bag and the corresponding weight W (i) /ni , h(·) can be learned by minimizing the weighted instance-level classification error. [sent-119, score-0.189]
58 When f (B) is found, the best multiplier c > 0 can be got by directly optimizing the exponential loss: EB EG|B [exp(−gF(B) + c(−gf (B)))] = = i i W (i) exp[c − g (i) (i) j h(bj ) ni ] W (i) exp[(2e(i) − 1)c] (i) 1 where e(i) = ni j [[(h(bj ) = g (i) )]] (computed in Step 3b). [sent-121, score-0.11]
59 2 Randomly select k elements from Γ to initialize the medoids Mt (t = 1, 2, · · · , k), repeat until all Mt do not change: 2a Γt = {Mt } (t = 1, 2, · · · , k). [sent-130, score-0.045]
60 A∈Γt B∈Γ t 3 Transform (Xu , Yu ) into a multi-label example (zu , Yu ) (u = 1, 2, · · · , m), where zu = (zu1 , zu2 , · · · , zuk ) = (dH (Xu , M1 ), dH (Xu , M2 ), · · · , dH (Xu , Mk )). [sent-133, score-0.064]
61 Xu , is an unlabeled multi-instance bag instead of a single instance, we employ Hausdorff distance [5] to measure the distance. [sent-138, score-0.11]
62 With the help of these medoids, we transform the original multi-instance example Xu into a k-dimensional numerical vector zu , where the ith (i = 1, 2, · · · , k) component of zu is the distance between Xu and Mi , that is, dH (Xu , Mi ). [sent-141, score-0.153]
63 This process reassembles the constructive clustering process used by [11] in transforming multiinstance examples into single-instance examples except that in [11] the clustering is executed at the instance level while here we execute it at the bag level. [sent-143, score-0.219]
64 Then, from the data set a multi-label learning function fM LL can be learned, which can accomplish the desired M IML function because fM IM L (X ∗ ) = fM LL (z ∗ ). [sent-147, score-0.038]
65 That is, the test example is labeled by all the class labels with positive S VM scores, except that when all the S VM scores are negative, the test example is labeled by the class label which is with the top (least negative) score. [sent-151, score-0.094]
66 4 Application to Scene Classification The data set consists of 2,000 natural scene images belonging to the classes desert, mountains, sea, sunset, and trees, as shown in Table 3. [sent-152, score-0.091]
67 Some images were from the C OREL image collection while some were collected from the Internet. [sent-153, score-0.046]
68 Table 3: The image data set (d: desert, m: mountains, s: sea, su: sunset, t: trees) label d m s su t 4. [sent-155, score-0.129]
69 The former is the core of a successful multi-label learning system B OOS T EXTER [8], while the latter has achieved excellent performance in scene classification [1]. [sent-158, score-0.127]
70 For M IML B OOST and M IML S VM, each image is represented as a bag of nine instances generated by the S BN method [7]. [sent-159, score-0.209]
71 Here each instance actually corresponds to an image patch, and better performance can be expected with better image patch generation method. [sent-160, score-0.116]
72 MH and M L S VM, each image is represented as a feature vector obtained by concatenating the instances of M IML B OOST or M IML S VM. [sent-162, score-0.099]
73 Gaussian kernel L IBSVM [2] is used to implement M L S VM, where the cross-training strategy is used to build the classifiers while the T-Criterion is used to label the images [1]. [sent-163, score-0.055]
74 MH and M L S VM make multi-label predictions, here the performance of the compared algorithms are evaluated according to five multi-label evaluation metrics, as shown in Tables 4 to 7, where ‘↓’ indicates ‘the smaller the better’ while ‘↑’ indicates ‘the bigger the better’. [sent-167, score-0.038]
75 Details of these evaluation metrics can be found in [8]. [sent-168, score-0.041]
76 Note that since in each boosting round M IML B OOST performs more operations than A DA B OOST. [sent-170, score-0.052]
77 MH does, for fair comparison, the boosting rounds used by A DA B OOST. [sent-171, score-0.097]
78 Table 4: The performance of M IML B OOST with different boosting rounds boosting rounds 5 10 15 20 25 evaluation metric hamm. [sent-173, score-0.246]
79 MH with different boosting rounds boosting rounds 50 100 150 200 250 4 evaluation metric hamm. [sent-228, score-0.229]
80 023 Table 7: The performance of M L S VM with different γ used in Gaussian kernel Gaussian kernel γ=1 γ=2 γ=3 γ=4 γ=5 evaluation metric hamm. [sent-343, score-0.052]
81 05 significance level reveal that the worst performance of M IML B OOST (with 5 boosting rounds) is even significantly better than the best performance of A DA B OOST. [sent-400, score-0.086]
82 MH (with 250 boosting rounds) on all the evaluation metrics, and is significantly better than the best performance of M L S VM (with γ = 2) in terms of coverage while comparable on the remaining metrics; the worse performance of M IML S VM (with γ = . [sent-401, score-0.146]
83 5) is even comparable to the best performance of M L S VM and is significantly better than the best performance of A DA B OOST. [sent-402, score-0.034]
84 These observations confirm that formalizing the scene classification task as a M IML problem to solve by M IML B OOST or M IML S VM is better than formalizing it as a multi-label learning problem to solve by A DA B OOST. [sent-404, score-0.181]
85 2 Comparison with Multi-Instance Learning Algorithms Since the scene classification task has been successfully tackled by multi-instance learning algorithms [7], we compare the M IML algorithms with established multi-instance learning algorithms D IVERSE D ENSITY [7] and E M - DD [10]. [sent-407, score-0.165]
86 The former is one of the most influential multi-instance learning algorithm and has achieved excellent performance in scene classification [7], while the latter has achieved excellent performance on multi-instance benchmark tests [10]. [sent-408, score-0.158]
87 That is, each image is represented as a bag of nine instances generated by the S BN method [7]. [sent-410, score-0.209]
88 1, with 25 boosting rounds for M IML B OOST while γ = . [sent-413, score-0.097]
89 Note that for M IML B OOST and M IML S VM, the top ranked class is regarded as the single-label prediction. [sent-418, score-0.041]
90 Tenfold cross-validation is performed and ‘mean ± std’ is presented in Table 8, where the best performance on each image class is bolded. [sent-419, score-0.062]
91 We can find from Table 8 that M IML B OOST achieves the best performance on image classes desert and trees while M IML S VM achieves the best performance on the remaining image classes. [sent-421, score-0.145]
92 These observations confirm that formalizing the scene classification task as a M IML problem to solve by M IML B OOST or M IML S VM is better than formalizing it as a multi-instance learning problem to solve by D IVERSE D ENSITY or E M - DD. [sent-425, score-0.181]
93 Table 8: Compare predictive accuracy of M IML B OOST, M IML S VM, D IVERSE D ENSITY and E M - DD Image class desert mountains sea sunset trees overall 5 M IML B OOST . [sent-426, score-0.166]
94 054 Conclusion In this paper, we formalize multi-instance multi-label learning where an example is associated with multiple instances and multiple labels simultaneously. [sent-474, score-0.18]
95 Although there were some works investigating the ambiguity of alternative input descriptions or alternative output descriptions associated with an object, this is the first work studying both these ambiguities simultaneously. [sent-475, score-0.115]
96 We show that an M IML problem can be solved by identifying its equivalence in the traditional supervised learning framework, using multi-instance learning or multi-label learning as the bridge. [sent-476, score-0.146]
97 The proposed algorithms, M IML B OOST and M IML S VM, have achieved good performance in the application to scene classification. [sent-477, score-0.091]
98 Moreover, it remains an open problem that whether M IML can be tackled directly, possibly by exploiting the connections between the instances and the labels. [sent-479, score-0.098]
99 It is also interesting to discover the relationship between the instances and labels. [sent-480, score-0.084]
100 Logistic regression and boosting for labeled bags of instances. [sent-542, score-0.103]
wordName wordTfidf (topN-words)
[('iml', 0.833), ('vm', 0.241), ('oost', 0.22), ('xu', 0.185), ('fm', 0.154), ('bag', 0.11), ('dh', 0.084), ('ensity', 0.074), ('fsisl', 0.074), ('iverse', 0.074), ('scene', 0.074), ('instances', 0.07), ('yv', 0.064), ('zu', 0.064), ('su', 0.062), ('ni', 0.055), ('boosting', 0.052), ('bags', 0.051), ('da', 0.049), ('il', 0.047), ('rounds', 0.045), ('ll', 0.044), ('mt', 0.043), ('argy', 0.042), ('supervised', 0.041), ('instance', 0.041), ('coverage', 0.039), ('traditional', 0.039), ('im', 0.039), ('frameworks', 0.038), ('label', 0.038), ('yu', 0.037), ('dd', 0.037), ('desert', 0.037), ('mountains', 0.037), ('yi', 0.034), ('formalizing', 0.033), ('xm', 0.033), ('medoids', 0.032), ('multiinstance', 0.032), ('sunset', 0.032), ('xni', 0.032), ('gf', 0.031), ('ambiguity', 0.03), ('image', 0.029), ('zi', 0.029), ('classi', 0.028), ('xi', 0.028), ('tackled', 0.028), ('degenerated', 0.028), ('hy', 0.028), ('sea', 0.028), ('descriptions', 0.027), ('transform', 0.025), ('bj', 0.025), ('regarded', 0.025), ('ct', 0.024), ('labels', 0.024), ('ym', 0.022), ('table', 0.022), ('learning', 0.022), ('learn', 0.022), ('loss', 0.021), ('hausdorff', 0.021), ('ibsvm', 0.021), ('nanjing', 0.021), ('yli', 0.021), ('evaluation', 0.021), ('page', 0.02), ('metrics', 0.02), ('transforming', 0.019), ('task', 0.019), ('object', 0.019), ('tenfold', 0.018), ('eb', 0.018), ('performance', 0.017), ('belong', 0.017), ('images', 0.017), ('constructive', 0.017), ('multiple', 0.017), ('xj', 0.017), ('associated', 0.016), ('trees', 0.016), ('std', 0.016), ('accomplish', 0.016), ('class', 0.016), ('tables', 0.015), ('eg', 0.015), ('works', 0.015), ('excellent', 0.014), ('metric', 0.014), ('formalize', 0.014), ('recognized', 0.014), ('relationship', 0.014), ('transformed', 0.014), ('step', 0.014), ('formally', 0.014), ('bn', 0.013), ('dy', 0.013), ('repeat', 0.013)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999994 136 nips-2006-Multi-Instance Multi-Label Learning with Application to Scene Classification
Author: Zhi-hua Zhou, Min-ling Zhang
Abstract: In this paper, we formalize multi-instance multi-label learning, where each training example is associated with not only multiple instances but also multiple class labels. Such a problem can occur in many real-world tasks, e.g. an image usually contains multiple patches each of which can be described by a feature vector, and the image can belong to multiple categories since its semantics can be recognized in different ways. We analyze the relationship between multi-instance multi-label learning and the learning frameworks of traditional supervised learning, multiinstance learning and multi-label learning. Then, we propose the M IML B OOST and M IML S VM algorithms which achieve good performance in an application to scene classification. 1
2 0.16314575 48 nips-2006-Branch and Bound for Semi-Supervised Support Vector Machines
Author: Olivier Chapelle, Vikas Sindhwani, S. S. Keerthi
Abstract: Semi-supervised SVMs (S3 VM) attempt to learn low-density separators by maximizing the margin over labeled and unlabeled examples. The associated optimization problem is non-convex. To examine the full potential of S3 VMs modulo local minima problems in current implementations, we apply branch and bound techniques for obtaining exact, globally optimal solutions. Empirical evidence suggests that the globally optimal solution can return excellent generalization performance in situations where other implementations fail completely. While our current implementation is only applicable to small datasets, we discuss variants that can potentially lead to practically useful algorithms. 1
3 0.12610248 21 nips-2006-AdaBoost is Consistent
Author: Peter L. Bartlett, Mikhail Traskin
Abstract: The risk, or probability of error, of the classifier produced by the AdaBoost algorithm is investigated. In particular, we consider the stopping strategy to be used in AdaBoost to achieve universal consistency. We show that provided AdaBoost is stopped after nν iterations—for sample size n and ν < 1—the sequence of risks of the classifiers it produces approaches the Bayes risk if Bayes risk L∗ > 0. 1
4 0.11663319 47 nips-2006-Boosting Structured Prediction for Imitation Learning
Author: J. A. Bagnell, Joel Chestnutt, David M. Bradley, Nathan D. Ratliff
Abstract: The Maximum Margin Planning (MMP) (Ratliff et al., 2006) algorithm solves imitation learning problems by learning linear mappings from features to cost functions in a planning domain. The learned policy is the result of minimum-cost planning using these cost functions. These mappings are chosen so that example policies (or trajectories) given by a teacher appear to be lower cost (with a lossscaled margin) than any other policy for a given planning domain. We provide a novel approach, M MP B OOST , based on the functional gradient descent view of boosting (Mason et al., 1999; Friedman, 1999a) that extends MMP by “boosting” in new features. This approach uses simple binary classification or regression to improve performance of MMP imitation learning, and naturally extends to the class of structured maximum margin prediction problems. (Taskar et al., 2005) Our technique is applied to navigation and planning problems for outdoor mobile robots and robotic legged locomotion. 1
5 0.055725537 140 nips-2006-Multiple Instance Learning for Computer Aided Diagnosis
Author: Murat Dundar, Balaji Krishnapuram, R. B. Rao, Glenn M. Fung
Abstract: Many computer aided diagnosis (CAD) problems can be best modelled as a multiple-instance learning (MIL) problem with unbalanced data: i.e. , the training data typically consists of a few positive bags, and a very large number of negative instances. Existing MIL algorithms are much too computationally expensive for these datasets. We describe CH, a framework for learning a Convex Hull representation of multiple instances that is significantly faster than existing MIL algorithms. Our CH framework applies to any standard hyperplane-based learning algorithm, and for some algorithms, is guaranteed to find the global optimal solution. Experimental studies on two different CAD applications further demonstrate that the proposed algorithm significantly improves diagnostic accuracy when compared to both MIL and traditional classifiers. Although not designed for standard MIL problems (which have both positive and negative bags and relatively balanced datasets), comparisons against other MIL methods on benchmark problems also indicate that the proposed method is competitive with the state-of-the-art.
6 0.054927528 150 nips-2006-On Transductive Regression
7 0.051728044 7 nips-2006-A Local Learning Approach for Clustering
8 0.048134513 183 nips-2006-Stochastic Relational Models for Discriminative Link Prediction
9 0.045640491 51 nips-2006-Clustering Under Prior Knowledge with Application to Image Segmentation
10 0.042905118 163 nips-2006-Prediction on a Graph with a Perceptron
11 0.039044779 50 nips-2006-Chained Boosting
12 0.036697593 78 nips-2006-Fast Discriminative Visual Codebooks using Randomized Clustering Forests
13 0.036480717 116 nips-2006-Learning from Multiple Sources
14 0.033921499 33 nips-2006-Analysis of Representations for Domain Adaptation
15 0.033148129 171 nips-2006-Sample Complexity of Policy Search with Known Dynamics
16 0.032272443 118 nips-2006-Learning to Model Spatial Dependency: Semi-Supervised Discriminative Random Fields
17 0.031499416 130 nips-2006-Max-margin classification of incomplete data
18 0.030472089 152 nips-2006-Online Classification for Complex Problems Using Simultaneous Projections
19 0.028640768 94 nips-2006-Image Retrieval and Classification Using Local Distance Functions
20 0.028416065 98 nips-2006-Inferring Network Structure from Co-Occurrences
topicId topicWeight
[(0, -0.101), (1, 0.041), (2, -0.016), (3, -0.006), (4, -0.003), (5, 0.091), (6, -0.086), (7, 0.006), (8, 0.027), (9, -0.032), (10, 0.001), (11, -0.058), (12, -0.025), (13, 0.03), (14, 0.016), (15, 0.077), (16, 0.094), (17, -0.006), (18, -0.049), (19, -0.008), (20, -0.048), (21, 0.014), (22, 0.059), (23, 0.204), (24, 0.045), (25, 0.143), (26, -0.044), (27, -0.077), (28, -0.073), (29, 0.076), (30, -0.105), (31, -0.01), (32, 0.102), (33, 0.334), (34, 0.061), (35, -0.092), (36, -0.018), (37, 0.23), (38, 0.04), (39, -0.081), (40, -0.009), (41, -0.023), (42, 0.004), (43, -0.017), (44, -0.142), (45, 0.031), (46, -0.001), (47, 0.024), (48, -0.013), (49, -0.177)]
simIndex simValue paperId paperTitle
same-paper 1 0.92203724 136 nips-2006-Multi-Instance Multi-Label Learning with Application to Scene Classification
Author: Zhi-hua Zhou, Min-ling Zhang
Abstract: In this paper, we formalize multi-instance multi-label learning, where each training example is associated with not only multiple instances but also multiple class labels. Such a problem can occur in many real-world tasks, e.g. an image usually contains multiple patches each of which can be described by a feature vector, and the image can belong to multiple categories since its semantics can be recognized in different ways. We analyze the relationship between multi-instance multi-label learning and the learning frameworks of traditional supervised learning, multiinstance learning and multi-label learning. Then, we propose the M IML B OOST and M IML S VM algorithms which achieve good performance in an application to scene classification. 1
2 0.57253283 21 nips-2006-AdaBoost is Consistent
Author: Peter L. Bartlett, Mikhail Traskin
Abstract: The risk, or probability of error, of the classifier produced by the AdaBoost algorithm is investigated. In particular, we consider the stopping strategy to be used in AdaBoost to achieve universal consistency. We show that provided AdaBoost is stopped after nν iterations—for sample size n and ν < 1—the sequence of risks of the classifiers it produces approaches the Bayes risk if Bayes risk L∗ > 0. 1
3 0.5449068 47 nips-2006-Boosting Structured Prediction for Imitation Learning
Author: J. A. Bagnell, Joel Chestnutt, David M. Bradley, Nathan D. Ratliff
Abstract: The Maximum Margin Planning (MMP) (Ratliff et al., 2006) algorithm solves imitation learning problems by learning linear mappings from features to cost functions in a planning domain. The learned policy is the result of minimum-cost planning using these cost functions. These mappings are chosen so that example policies (or trajectories) given by a teacher appear to be lower cost (with a lossscaled margin) than any other policy for a given planning domain. We provide a novel approach, M MP B OOST , based on the functional gradient descent view of boosting (Mason et al., 1999; Friedman, 1999a) that extends MMP by “boosting” in new features. This approach uses simple binary classification or regression to improve performance of MMP imitation learning, and naturally extends to the class of structured maximum margin prediction problems. (Taskar et al., 2005) Our technique is applied to navigation and planning problems for outdoor mobile robots and robotic legged locomotion. 1
4 0.48737761 48 nips-2006-Branch and Bound for Semi-Supervised Support Vector Machines
Author: Olivier Chapelle, Vikas Sindhwani, S. S. Keerthi
Abstract: Semi-supervised SVMs (S3 VM) attempt to learn low-density separators by maximizing the margin over labeled and unlabeled examples. The associated optimization problem is non-convex. To examine the full potential of S3 VMs modulo local minima problems in current implementations, we apply branch and bound techniques for obtaining exact, globally optimal solutions. Empirical evidence suggests that the globally optimal solution can return excellent generalization performance in situations where other implementations fail completely. While our current implementation is only applicable to small datasets, we discuss variants that can potentially lead to practically useful algorithms. 1
5 0.34784624 150 nips-2006-On Transductive Regression
Author: Corinna Cortes, Mehryar Mohri
Abstract: In many modern large-scale learning applications, the amount of unlabeled data far exceeds that of labeled data. A common instance of this problem is the transductive setting where the unlabeled test points are known to the learning algorithm. This paper presents a study of regression problems in that setting. It presents explicit VC-dimension error bounds for transductive regression that hold for all bounded loss functions and coincide with the tight classification bounds of Vapnik when applied to classification. It also presents a new transductive regression algorithm inspired by our bound that admits a primal and kernelized closedform solution and deals efficiently with large amounts of unlabeled data. The algorithm exploits the position of unlabeled points to locally estimate their labels and then uses a global optimization to ensure robust predictions. Our study also includes the results of experiments with several publicly available regression data sets with up to 20,000 unlabeled examples. The comparison with other transductive regression algorithms shows that it performs well and that it can scale to large data sets.
6 0.30003119 118 nips-2006-Learning to Model Spatial Dependency: Semi-Supervised Discriminative Random Fields
7 0.25573981 104 nips-2006-Large-Scale Sparsified Manifold Regularization
8 0.25239563 140 nips-2006-Multiple Instance Learning for Computer Aided Diagnosis
9 0.22478263 98 nips-2006-Inferring Network Structure from Co-Occurrences
10 0.21258911 93 nips-2006-Hyperparameter Learning for Graph Based Semi-supervised Learning Algorithms
11 0.20298307 50 nips-2006-Chained Boosting
12 0.1972965 41 nips-2006-Bayesian Ensemble Learning
13 0.19697186 30 nips-2006-An Oracle Inequality for Clipped Regularized Risk Minimizers
14 0.18648371 183 nips-2006-Stochastic Relational Models for Discriminative Link Prediction
15 0.18577261 37 nips-2006-Attribute-efficient learning of decision lists and linear threshold functions under unconcentrated distributions
16 0.18152255 78 nips-2006-Fast Discriminative Visual Codebooks using Randomized Clustering Forests
17 0.17770514 172 nips-2006-Scalable Discriminative Learning for Natural Language Parsing and Translation
18 0.17654565 29 nips-2006-An Information Theoretic Framework for Eukaryotic Gradient Sensing
19 0.17107442 43 nips-2006-Bayesian Model Scoring in Markov Random Fields
20 0.17062145 170 nips-2006-Robotic Grasping of Novel Objects
topicId topicWeight
[(1, 0.08), (3, 0.01), (7, 0.062), (9, 0.032), (12, 0.033), (20, 0.011), (22, 0.042), (44, 0.04), (57, 0.081), (64, 0.023), (65, 0.045), (69, 0.033), (97, 0.381)]
simIndex simValue paperId paperTitle
same-paper 1 0.69408453 136 nips-2006-Multi-Instance Multi-Label Learning with Application to Scene Classification
Author: Zhi-hua Zhou, Min-ling Zhang
Abstract: In this paper, we formalize multi-instance multi-label learning, where each training example is associated with not only multiple instances but also multiple class labels. Such a problem can occur in many real-world tasks, e.g. an image usually contains multiple patches each of which can be described by a feature vector, and the image can belong to multiple categories since its semantics can be recognized in different ways. We analyze the relationship between multi-instance multi-label learning and the learning frameworks of traditional supervised learning, multiinstance learning and multi-label learning. Then, we propose the M IML B OOST and M IML S VM algorithms which achieve good performance in an application to scene classification. 1
2 0.53143394 121 nips-2006-Learning to be Bayesian without Supervision
Author: Martin Raphan, Eero P. Simoncelli
Abstract: unkown-abstract
3 0.43637431 8 nips-2006-A Nonparametric Approach to Bottom-Up Visual Saliency
Author: Wolf Kienzle, Felix A. Wichmann, Matthias O. Franz, Bernhard Schölkopf
Abstract: This paper addresses the bottom-up influence of local image information on human eye movements. Most existing computational models use a set of biologically plausible linear filters, e.g., Gabor or Difference-of-Gaussians filters as a front-end, the outputs of which are nonlinearly combined into a real number that indicates visual saliency. Unfortunately, this requires many design parameters such as the number, type, and size of the front-end filters, as well as the choice of nonlinearities, weighting and normalization schemes etc., for which biological plausibility cannot always be justified. As a result, these parameters have to be chosen in a more or less ad hoc way. Here, we propose to learn a visual saliency model directly from human eye movement data. The model is rather simplistic and essentially parameter-free, and therefore contrasts recent developments in the field that usually aim at higher prediction rates at the cost of additional parameters and increasing model complexity. Experimental results show that—despite the lack of any biological prior knowledge—our model performs comparably to existing approaches, and in fact learns image features that resemble findings from several previous studies. In particular, its maximally excitatory stimuli have center-surround structure, similar to receptive fields in the early human visual system. 1
4 0.36659306 110 nips-2006-Learning Dense 3D Correspondence
Author: Florian Steinke, Volker Blanz, Bernhard Schölkopf
Abstract: Establishing correspondence between distinct objects is an important and nontrivial task: correctness of the correspondence hinges on properties which are difficult to capture in an a priori criterion. While previous work has used a priori criteria which in some cases led to very good results, the present paper explores whether it is possible to learn a combination of features that, for a given training set of aligned human heads, characterizes the notion of correct correspondence. By optimizing this criterion, we are then able to compute correspondence and morphs for novel heads. 1
5 0.3663277 160 nips-2006-Part-based Probabilistic Point Matching using Equivalence Constraints
Author: Graham Mcneill, Sethu Vijayakumar
Abstract: Correspondence algorithms typically struggle with shapes that display part-based variation. We present a probabilistic approach that matches shapes using independent part transformations, where the parts themselves are learnt during matching. Ideas from semi-supervised learning are used to bias the algorithm towards finding ‘perceptually valid’ part structures. Shapes are represented by unlabeled point sets of arbitrary size and a background component is used to handle occlusion, local dissimilarity and clutter. Thus, unlike many shape matching techniques, our approach can be applied to shapes extracted from real images. Model parameters are estimated using an EM algorithm that alternates between finding a soft correspondence and computing the optimal part transformations using Procrustes analysis.
6 0.36608785 34 nips-2006-Approximate Correspondences in High Dimensions
7 0.3649258 51 nips-2006-Clustering Under Prior Knowledge with Application to Image Segmentation
8 0.36381108 119 nips-2006-Learning to Rank with Nonsmooth Cost Functions
9 0.36319774 72 nips-2006-Efficient Learning of Sparse Representations with an Energy-Based Model
10 0.36293411 65 nips-2006-Denoising and Dimension Reduction in Feature Space
11 0.36201015 3 nips-2006-A Complexity-Distortion Approach to Joint Pattern Alignment
12 0.36105171 118 nips-2006-Learning to Model Spatial Dependency: Semi-Supervised Discriminative Random Fields
13 0.36068246 75 nips-2006-Efficient sparse coding algorithms
14 0.35970762 195 nips-2006-Training Conditional Random Fields for Maximum Labelwise Accuracy
15 0.3593592 185 nips-2006-Subordinate class recognition using relational object models
16 0.35934639 158 nips-2006-PG-means: learning the number of clusters in data
17 0.35934031 112 nips-2006-Learning Nonparametric Models for Probabilistic Imitation
18 0.35858423 33 nips-2006-Analysis of Representations for Domain Adaptation
19 0.35830894 122 nips-2006-Learning to parse images of articulated bodies
20 0.35824552 32 nips-2006-Analysis of Empirical Bayesian Methods for Neuroelectromagnetic Source Localization