nips nips2004 nips2004-44 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ariadna Quattoni, Michael Collins, Trevor Darrell
Abstract: We present a discriminative part-based approach for the recognition of object classes from unsegmented cluttered scenes. Objects are modeled as flexible constellations of parts conditioned on local observations found by an interest operator. For each object class the probability of a given assignment of parts to local features is modeled by a Conditional Random Field (CRF). We propose an extension of the CRF framework that incorporates hidden variables and combines class conditional CRFs into a unified framework for part-based object recognition. The parameters of the CRF are estimated in a maximum likelihood framework and recognition proceeds by finding the most likely class under our model. The main advantage of the proposed CRF framework is that it allows us to relax the assumption of conditional independence of the observed data (i.e. local features) often used in generative approaches, an assumption that might be too restrictive for a considerable number of object classes.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract: We present a discriminative part-based approach for the recognition of object classes from unsegmented cluttered scenes. [sent-3, score-0.383]
2 Objects are modeled as flexible constellations of parts conditioned on local observations found by an interest operator. [sent-4, score-0.248]
3 For each object class the probability of a given assignment of parts to local features is modeled by a Conditional Random Field (CRF). [sent-5, score-0.607]
4 We propose an extension of the CRF framework that incorporates hidden variables and combines class conditional CRFs into a unified framework for part-based object recognition. [sent-6, score-0.627]
5 The parameters of the CRF are estimated in a maximum likelihood framework and recognition proceeds by finding the most likely class under our model. [sent-7, score-0.177]
6 The main advantage of the proposed CRF framework is that it allows us to relax the assumption of conditional independence of the observed data (i.e. local features) often used in generative approaches, an assumption that might be too restrictive for a considerable number of object classes. [sent-10, score-0.333]
7 1 Introduction The problem that we address in this paper is that of learning object categories from supervised data. [sent-11, score-0.206]
8 Given a training set of n pairs (xi , yi ), where xi is the ith image and yi is the category of the object present in xi , we would like to learn a model that maps images to object categories. [sent-12, score-1.101]
9 The part-based models we consider represent images as sets of patches, or local features, which are detected by an interest operator such as that described in [4]. [sent-14, score-0.204]
10 Thus an image xi can be considered to be a vector {xi,1, …, xi,m} of m patches. [sent-15, score-0.178]
11 Each patch xi,j has a feature-vector representation φ(xi,j ) ∈ Rd ; the feature vector might capture various features of the appearance of a patch, as well as features of its relative location and scale. [sent-19, score-0.523]
12 Moreover, the local patches underlying the local feature vectors may have complex interdependencies: for example, they may correspond to different parts of an object, whose spatial arrangement is important to the classification task. [sent-24, score-0.518]
13 The most widely used approach for part-based object recognition is the generative model proposed in [1]. [sent-25, score-0.325]
14 This generative model assumes conditional independence of the observed data (i.e., local features) given their assignment to parts in the model. [sent-29, score-0.314]
15 This assumption might be too restrictive for a considerable number of object classes made of structured patterns. [sent-30, score-0.276]
16 A second limitation of generative approaches is that they require a model P(xi,j | hi,j) of patches xi,j given underlying variables hi,j (e.g., hi,j may be a hidden variable in the model, or may simply be yi). [sent-31, score-0.336]
18 Accurately specifying such a generative model may be challenging – in particular in cases where patches overlap one another, or where we wish to allow a hidden variable hi,j to depend on several surrounding patches. [sent-34, score-0.384]
19 A more direct approach may be to use a feature vector representation of patches, and to use a discriminative learning approach. [sent-35, score-0.168]
20 Similar observations concerning the limitations of generative models have been made in the context of natural language processing, in particular in sequence-labeling tasks such as part-of-speech tagging [7, 5, 3] and in previous work on conditional random fields (CRFs) for vision [2]. [sent-37, score-0.137]
21 In sequence-labeling problems for NLP each observation xi,j is typically the j’th word for some input sentence, and hi,j is a hidden state, for example representing the part-of-speech of that word. [sent-38, score-0.142]
22 Hidden Markov models (HMMs), a generative approach, require a model of P (xi,j |hi,j ), and this can be a challenging task when features such as word prefixes or suffixes are included in the model, or where hi,j is required to depend directly on words other than xi,j . [sent-39, score-0.208]
23 This has led to research on discriminative models for sequence labeling such as MEMMs [7, 5] and conditional random fields (CRFs) [3]. [sent-40, score-0.237]
24 We propose a new model for object recognition based on Conditional Random Fields. [sent-42, score-0.272]
25 A key difference of our approach from previous work on CRFs is that we make use of hidden variables in the model. [sent-44, score-0.23]
26 In previous work on CRFs (e.g., [2, 3]) each “label” yi is a sequence hi = {hi,1, hi,2, …, hi,m}. [sent-47, score-0.135]
27 In our case the labels yi are unstructured labels from some fixed set of object categories, and the relationship between yi and each observation xi,j is not clearly defined. [sent-52, score-0.527]
28 Instead, we model intermediate part-labels hi,j as hidden variables in the model. [sent-53, score-0.271]
29 The model defines conditional probabilities P(y, h | x), and hence indirectly P(y | x) = Σ_h P(y, h | x), using a CRF. [sent-54, score-0.125]
30 Dependencies between the hidden variables h are modeled by an undirected graph over these variables. [sent-55, score-0.332]
31 The result is a model where inference and parameter estimation can be carried out using standard graphical model algorithms such as belief propagation [6]. [sent-56, score-0.359]
32 1 Conditional Random Fields with Hidden Variables Our task is to learn a mapping from images x to labels y. [sent-58, score-0.169]
33 Each patch xj is represented by a feature vector φ(xj) ∈ R^d. [sent-64, score-0.224]
34 For example, in our experiments each xj corresponds to a patch that is detected by the feature detector in [4]; section 3 gives details of the feature-vector representation φ(xj) for each patch. [sent-65, score-0.39]
35 Our training set consists of labeled images (xi, yi) for i = 1 … n, where each yi ∈ Y, and each xi = {xi,1, xi,2, …, xi,m}. [sent-66, score-0.301]
37 For any image x we also assume a vector of “parts” variables h = {h1, h2, …, hm}. [sent-73, score-0.15]
38 These variables are not observed on training examples, and will therefore form a set of hidden variables in the model. Footnote 1: Note that the number of patches m can vary across images, and did vary in our experiments. [sent-77, score-0.593]
39 For convenience we use notation where m is fixed across different images; in reality it will vary across images but this leads to minor changes to the model. [sent-78, score-0.234]
40 Each hj is a member of H where H is a finite set of possible parts in the model. [sent-80, score-0.765]
41 Intuitively, each hj corresponds to a labeling of xj with some member of H. [sent-81, score-0.681]
42 Given these definitions of image-labels y, images x, and part-labels h, we will define a conditional probabilistic model: P(y | x, θ) = Σ_h P(y, h | x, θ) = Σ_h exp{Ψ(y, h, x; θ)} / Σ_{y',h} exp{Ψ(y', h, x; θ)}   (2). [sent-82, score-0.207]
Given a new test image x, and parameter values θ* induced from a training example, we will take the label for the image to be arg max_{y ∈ Y} P(y | x, θ*). [sent-86, score-0.307]
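To make the conditional model and the classification rule concrete, the following is a minimal brute-force sketch (assuming numpy). The potential psi here is a random stand-in rather than the paper's Eq. 4/6, and exhaustive enumeration over h replaces the belief-propagation inference described later; it only illustrates the definition, not the authors' implementation.

```python
import itertools
import numpy as np

# Hypothetical toy sizes; in the real model the patches come from an image.
H = [0, 1, 2]          # finite set of possible parts
Y = [0, 1]             # object categories
m = 4                  # number of patches in this image
rng = np.random.default_rng(0)

# Stand-in potential: any real-valued Psi(y, h, x; theta) would work here.
psi_table = rng.normal(size=(len(Y),) + (len(H),) * m)

def psi(y, h):
    return psi_table[(y,) + tuple(h)]

def p_y_given_x():
    """P(y | x, theta) = sum_h exp(Psi(y,h,x)) / sum_{y',h} exp(Psi(y',h,x))  (Eq. 2)."""
    scores = {}
    for y in Y:
        scores[y] = sum(np.exp(psi(y, h))
                        for h in itertools.product(range(len(H)), repeat=m))
    Z = sum(scores.values())
    return {y: s / Z for y, s in scores.items()}

probs = p_y_given_x()
y_star = max(probs, key=probs.get)   # classification rule: arg max_y P(y | x, theta*)
print(probs, y_star)
```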
44 Following previous work on CRFs [2, 3], we use the following objective function in training the parameters: L(θ) = Σ_i log P(yi | xi, θ) − ||θ||² / (2σ²)   (3). The first term in Eq. 3 is the log-likelihood of the data. [sent-87, score-0.159]
45 We assume an undirected graph structure, with the hidden variables {h1, …, hm} corresponding to vertices in the graph. [sent-94, score-0.332]
46 We use E to denote the set of edges in the graph, and we will write (j, k) ∈ E to signify that there is an edge in the graph between variables hj and hk . [sent-98, score-0.921]
47 We define Ψ to take the following form: Ψ(y, h, x; θ) = Σ_{j=1..m} Σ_l f¹_l(j, y, hj, x) θ¹_l + Σ_{(j,k)∈E} Σ_l f²_l(j, k, y, hj, hk, x) θ²_l   (4), where f¹_l, f²_l are functions defining the features in the model, and θ¹_l, θ²_l are the components of θ. [sent-100, score-1.293]
48 The f¹ features depend on single hidden-variable values in the model; the f² features can depend on pairs of values. [sent-101, score-0.37]
49 Moreover the features respect the structure of the graph, in that no feature depends on more than two hidden variables hj , hk , and if a feature does depend on variables hj and hk there must be an edge (j, k) in the graph E. [sent-104, score-2.062]
50 Assuming the graph E is a tree and Ψ takes the form in Eq. 4, exact methods exist for inference and parameter estimation in the model. [sent-106, score-0.127]
51 This follows because belief propagation [6] can be used to calculate the following quantities in O(|E||Y|) time: for all y ∈ Y, Z(y | x, θ) = Σ_h exp{Ψ(y, h, x; θ)}; [sent-107, score-0.178]
52 for all y ∈ Y, j ∈ 1…m, a ∈ H, P(hj = a | y, x, θ) = Σ_{h: hj = a} P(h | y, x, θ); and for all y ∈ Y, (j, k) ∈ E, a, b ∈ H, P(hj = a, hk = b | y, x, θ) = Σ_{h: hj = a, hk = b} P(h | y, x, θ). Footnote 2: This will allow exact methods for inference and parameter estimation in the model, for example using belief propagation. [sent-110, score-0.39]
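On tiny problems the three quantities above can be checked by exhaustive enumeration. The sketch below (assuming numpy, with a random stand-in potential at a single fixed y) computes Z(y | x, θ) and the single-node and pairwise marginals directly; belief propagation computes the same quantities efficiently when the graph over h is a tree.

```python
import itertools
import numpy as np

# Toy setup: a random stand-in for Psi(y, h, x; theta) at one fixed y.
H, m = 3, 4
rng = np.random.default_rng(1)
psi_table = rng.normal(size=(H,) * m)

def Z():
    """Z(y | x, theta) = sum_h exp(Psi(y, h, x; theta))."""
    return sum(np.exp(psi_table[h]) for h in itertools.product(range(H), repeat=m))

def node_marginal(j, a):
    """P(h_j = a | y, x, theta): sum over all h with h_j = a of P(h | y, x, theta)."""
    z = Z()
    return sum(np.exp(psi_table[h])
               for h in itertools.product(range(H), repeat=m) if h[j] == a) / z

def edge_marginal(j, k, a, b):
    """P(h_j = a, h_k = b | y, x, theta), the quantity needed for the f^2 gradients."""
    z = Z()
    return sum(np.exp(psi_table[h])
               for h in itertools.product(range(H), repeat=m)
               if h[j] == a and h[k] == b) / z

print(Z(), node_marginal(0, 2), edge_marginal(0, 1, 2, 1))
```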
53 If E contains cycles then approximate methods, such as loopy belief propagation, may be necessary for inference and parameter estimation. [sent-111, score-0.157]
54 The second and third terms are marginal distributions over individual variables hj or pairs of variables hj , hk corresponding to edges in the graph. [sent-115, score-1.421]
55 2 Parameter Estimation Using Belief Propagation This section considers estimation of the parameters θ* = arg max_θ L(θ) from a training sample, where L(θ) is defined in Eq. 3. [sent-118, score-0.131]
56 In our work we used a conjugate-gradient method to optimize L(θ) (note that due to the use of hidden variables, L(θ) has multiple local optima, and our method is therefore not guaranteed to reach the globally optimal point). [sent-120, score-0.195]
Li(θ) = log P(yi | xi, θ) = log ( Σ_h exp{Ψ(yi, h, xi; θ)} / Σ_{y',h} exp{Ψ(y', h, xi; θ)} )   (5). We first consider derivatives with respect to the parameters θ¹_l corresponding to features f¹_l(j, y, hj, x) that depend on single hidden variables. [sent-123, score-0.922]
58 We define Ψ(y, h, x; θ) = Σ_j φ(xj) · θ(hj) + Σ_j θ(y, hj) + Σ_{(j,k)∈E} θ(y, hj, hk)   (6). Here θ(k) ∈ R^d for k ∈ H is a parameter vector corresponding to the k'th part label. [sent-128, score-1.338]
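A short sketch of the potential in Eq. 6 for one image follows; the array shapes, the hand-coded tree E, and the random parameter values are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(3)
m, d, n_parts, n_classes = 5, 4, 3, 2
phi = rng.normal(size=(m, d))                        # phi(x_j) for each patch
E = [(0, 1), (1, 2), (2, 3), (3, 4)]                 # tree over the hidden variables
theta_part = rng.normal(size=(n_parts, d))           # theta(k): patch/part compatibility
theta_class = rng.normal(size=(n_classes, n_parts))  # theta(y, k): part/label compatibility
theta_edge = rng.normal(size=(n_classes, n_parts, n_parts))  # theta(y, k, l): edge/label

def psi(y, h):
    """Psi(y, h, x; theta) = sum_j phi(x_j).theta(h_j) + sum_j theta(y, h_j)
                             + sum_{(j,k) in E} theta(y, h_j, h_k)   (Eq. 6)."""
    unary = sum(phi[j] @ theta_part[h[j]] + theta_class[y, h[j]] for j in range(m))
    pairwise = sum(theta_edge[y, h[j], h[k]] for j, k in E)
    return unary + pairwise

print(psi(0, [0, 1, 2, 1, 0]))
```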
59 The inner-product φ(xj ) · θ(hj ) can be interpreted as a measure of the compatibility between patch xj and part-label hj . [sent-129, score-0.74]
60 Each parameter θ(y, k) ∈ R for k ∈ H, y ∈ Y can be interpreted as a measure of the compatibility between part k and label y. [sent-130, score-0.233]
61 Finally, each parameter θ(y, k, l) ∈ R for y ∈ Y, and k, l ∈ H measures the compatibility between an edge with labels k and l and the label y. [sent-131, score-0.234]
62 Hence belief propagation can be used for inference and parameter estimation in the model. [sent-135, score-0.277]
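As a rough illustration of the training procedure, the following self-contained sketch maximizes the regularized objective of Eq. 3 on a synthetic toy problem with scipy's conjugate-gradient routine. It drops the edge potentials of Eq. 6, enumerates h instead of running belief propagation, and relies on finite-difference gradients, so it only mirrors the overall recipe, not the analytic-gradient implementation described in the paper.

```python
import itertools
import numpy as np
from scipy.optimize import minimize

# Toy problem: 2 classes, 2 parts, m = 2 patches per image, d = 3 features per patch.
rng = np.random.default_rng(2)
Y, H, m, d, n, sigma2 = 2, 2, 2, 3, 20, 1.0
X = rng.normal(size=(n, m, d))              # phi(x_{i,j}) for each patch
labels = rng.integers(0, Y, size=n)         # y_i

def unpack(theta):
    part = theta[:H * d].reshape(H, d)      # theta(h) in R^d per part
    comp = theta[H * d:].reshape(Y, H)      # theta(y, h) compatibilities
    return part, comp

def log_p_y_given_x(theta, x, y):
    part, comp = unpack(theta)
    def score(yy, h):
        return sum(x[j] @ part[h[j]] + comp[yy, h[j]] for j in range(m))
    logits = np.array([[score(yy, h) for h in itertools.product(range(H), repeat=m)]
                       for yy in range(Y)])
    # log P(y | x) = log sum_h e^score(y,h) - log sum_{y',h} e^score(y',h)
    return np.logaddexp.reduce(logits[y]) - np.logaddexp.reduce(logits.ravel())

def neg_objective(theta):
    ll = sum(log_p_y_given_x(theta, X[i], labels[i]) for i in range(n))
    return -(ll - np.dot(theta, theta) / (2 * sigma2))   # negative of Eq. 3

theta0 = np.zeros(H * d + Y * H)
result = minimize(neg_objective, theta0, method='CG')   # finite-difference gradients
print(result.fun)
```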
63 The patches xi,j in each image are obtained using the SIFT detector [4]. [sent-136, score-0.312]
64 Each patch xi,j is then represented by a feature vector φ(xi,j ) that incorporates a combination of SIFT and relative location and scale features. [sent-137, score-0.254]
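The paper does not spell out here exactly how the SIFT descriptor is combined with location and scale, so the following is only one plausible assembly of φ(x_{i,j}), stated as an assumption.

```python
import numpy as np

def patch_feature(sift_descriptor, x, y, scale, image_w, image_h):
    """phi(x_{i,j}): SIFT appearance plus normalised location and log-scale (assumed form)."""
    location = np.array([x / image_w, y / image_h, np.log(scale)])
    return np.concatenate([np.asarray(sift_descriptor, dtype=float), location])

phi = patch_feature(np.zeros(128), x=40, y=60, scale=3.2, image_w=320, image_h=240)
print(phi.shape)   # (131,)
```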
65 The tree E is formed by running a minimum spanning tree algorithm over the parts hi,j , where the cost of an edge in the graph between hi,j and hi,k is taken to be the distance between xi,j and xi,k in the image. [sent-138, score-0.466]
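A sketch of how such a tree E could be built with scipy's minimum spanning tree over pairwise image distances; the random patch positions are placeholders.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import cdist

# Edge costs are image distances between the patches underlying h_j and h_k;
# a minimum spanning tree then selects the edges of E.
rng = np.random.default_rng(4)
positions = rng.uniform(0, 200, size=(6, 2))   # (row, col) of each detected patch
dist = cdist(positions, positions)             # pairwise distances in the image
mst = minimum_spanning_tree(dist)              # sparse matrix of the kept edges
E = list(zip(*mst.nonzero()))                  # edges (j, k) of the tree
print(E)
```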
66 Our choice of E encodes our assumption that parts conditioned on features that are spatially close are more likely to be dependent. [sent-140, score-0.301]
67 We also plan to investigate more complex graph structures that involve cycles, which may require approximate methods such as loopy belief propagation for parameter estimation and inference. [sent-142, score-0.346]
68 3 The first two experiments consisted of training a two-class model (object vs. background). [sent-144, score-0.134]
69 The third experiment consisted of training a multi-class model to distinguish between n classes. [sent-146, score-0.126]
70 The only parameter that was adjusted in the experiments was the scale of the images upon which the interest point detector was run. [sent-147, score-0.335]
71 In particular, we adjusted the scale on the car side data set: the images in this data set were small, and without this adjustment the detector would fail to find a significant number of features. [sent-148, score-0.768]
72 In Figure 2(a) we show how the number of parts in the model affects performance. [sent-153, score-0.236]
73 In the case of the car side data set, the ten-part model shows a significant improvement compared to the five-part model, while for the car rear data set the performance improvement obtained by increasing the number of parts is not as significant. [sent-154, score-1.484]
74 The model exhibits best performance for the Leopard data set, for which the presence of part 1 alone is a clear predictor of the class. [sent-160, score-0.118]
75 This shows again that our model can learn discriminative part distributions for each class. [sent-161, score-0.221]
76 Figure 3 shows results for a multi-view experiment where the task is to distinguish between two different views of a car and background. [sent-162, score-0.439]
77 Notice that, since our algorithm does not currently allow for the recognition of multiple instances of an object, we test it on a partition of the training set in http://l2r. [sent-171, score-0.311]
78 Figure 1: Examples of the most likely assignment of parts to features for the two class experiments (car data set). [sent-176, score-0.417]
79 Data set / Equal Error Rate: Car Side: 94% (5 parts), 99% (10 parts); Car Rear: 91% (5 parts), 91.5% (10 parts). [sent-177, score-0.39]
80 Figure 2: (a) Equal Error Rates for the car side and car rear experiments with different numbers of parts.
81 Figure 1 displays the Viterbi labeling (see footnote 4) for a set of example images showing the most likely assignment of local features to parts in the model. [sent-185, score-0.543]
82 Figure 6 shows the mean and variance of each part’s location for car side images and background images. [sent-186, score-0.76]
83 The mean and variance of each part’s location for the car side images were calculated in the following manner: First we find for every image classified as class a the most likely part assignment under our model. [sent-187, score-1.028]
84 Second, we calculate the mean and variance of positions of all local features that were assigned to the same part. [sent-188, score-0.231]
85 As can be seen in Figure 6, while the mean location of a given part in the background images and the mean location of the same part in the car images are very similar, the parts in the car have a much tighter distribution, which seems to suggest that the model is learning the shape of the object. [sent-190, score-1.639]
86 As shown in Figure 5, the model has also learnt discriminative part distributions for each class; for example, the presence of part 1 seems to be a clear predictor for the car class. [sent-191, score-0.739]
87 In general part assignments seem to rely on a combination of appearance and relative location. [sent-192, score-0.193]
Footnote 4: This is the labeling h* = arg max_h P(h | y, x, θ), where x is an image and y is the label for the image under the model. [sent-194, score-0.268]
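For small m and |H| the Viterbi labeling of footnote 4 can be found by brute force, since the denominator of P(h | y, x, θ) does not depend on h; the sketch below does exactly that with a random stand-in potential, whereas on the tree E the same labeling can be computed efficiently with max-product.

```python
import itertools
import numpy as np

# Maximising Psi(y, h, x; theta) over h at fixed y gives h* = arg max_h P(h | y, x, theta).
rng = np.random.default_rng(5)
H, m = 3, 4
psi_table = rng.normal(size=(H,) * m)   # stand-in for Psi at the fixed label y

h_star = max(itertools.product(range(H), repeat=m), key=lambda h: psi_table[h])
print(h_star)
```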
89 It appears that the model has learnt a vocabulary of very general parts with significant variability in appearance and learns to discriminate between classes by capturing the most likely arrangement of these parts for each class. [sent-201, score-0.739]
90 In some cases the model relies more heavily on relative location than appearance because the appearance information might not be very useful for discriminating between the two classes. [sent-202, score-0.401]
91 One of the reasons for this is that the detector produces a large number of false detections, making the appearance data too noisy for discrimination. [sent-203, score-0.254]
92 The fact that the model is able to cope with this lack of discriminating appearance information illustrates its flexible data-driven nature. [sent-204, score-0.197]
93 This can be a desirable model property of a general object recognition system, because for some object classes appearance is the important discriminant (i. [sent-205, score-0.602]
94 One noticeable difference between our model and similar part-based models is that our model learns large parts composed of small local features. [sent-210, score-0.33]
95 This is a consequence of dependencies between the hidden variables (e.g., through their position in the minimum spanning tree): the potential functions defined on pairs of hidden variables tend to smooth the allocation of parts to patches. [sent-213, score-0.474]
96 Similarly to CRFs and other maximum entropy models our approach allows us to combine arbitrary observation features for training discriminative classifiers with hidden variables. [sent-215, score-0.395]
97 Furthermore, by making some assumptions about the joint distribution of hidden variables one can derive efficient training algorithms based on dynamic programming. [sent-216, score-0.273]
98 The main limitation of our model is that it is dependent on the feature detector picking up discriminative features of the object. [sent-218, score-0.467]
99 Furthermore, our model might learn to discriminate between classes based on the statistics of the feature detector and not the true underlying data, to which it has no access. [sent-219, score-0.353]
100 This is not a desirable property since it assumes the feature detector to be consistent. [sent-220, score-0.203]
wordName wordTfidf (topN-words)
[('hj', 0.518), ('car', 0.397), ('parts', 0.195), ('hk', 0.179), ('crfs', 0.167), ('object', 0.165), ('hidden', 0.142), ('detector', 0.138), ('rear', 0.136), ('yi', 0.135), ('images', 0.123), ('appearance', 0.116), ('xi', 0.116), ('patches', 0.112), ('crf', 0.104), ('discriminative', 0.103), ('patch', 0.098), ('variables', 0.088), ('conditional', 0.084), ('belief', 0.084), ('side', 0.082), ('features', 0.078), ('part', 0.077), ('graph', 0.074), ('assignment', 0.066), ('propagation', 0.066), ('recognition', 0.066), ('feature', 0.065), ('ariadna', 0.063), ('leopards', 0.063), ('llamas', 0.063), ('pigeons', 0.063), ('rhinos', 0.063), ('compatibility', 0.063), ('image', 0.062), ('xj', 0.061), ('location', 0.059), ('tree', 0.058), ('background', 0.056), ('li', 0.056), ('int', 0.053), ('generative', 0.053), ('local', 0.053), ('member', 0.052), ('sift', 0.05), ('labeling', 0.05), ('class', 0.05), ('classes', 0.049), ('spanning', 0.049), ('arg', 0.047), ('label', 0.047), ('hm', 0.046), ('parameter', 0.046), ('labels', 0.046), ('trevor', 0.044), ('learnt', 0.044), ('xes', 0.044), ('elds', 0.043), ('variance', 0.043), ('training', 0.043), ('vary', 0.043), ('limitation', 0.042), ('distinguish', 0.042), ('categories', 0.041), ('calculated', 0.041), ('estimation', 0.041), ('model', 0.041), ('inference', 0.04), ('discriminating', 0.04), ('arrangement', 0.04), ('fields', 0.038), ('partition', 0.037), ('hmms', 0.037), ('depend', 0.036), ('viterbi', 0.036), ('cycles', 0.036), ('shape', 0.035), ('th', 0.035), ('loopy', 0.035), ('across', 0.034), ('uni', 0.034), ('framework', 0.033), ('restrictive', 0.033), ('edge', 0.032), ('incorporates', 0.032), ('derivatives', 0.032), ('discriminate', 0.031), ('exible', 0.03), ('edges', 0.03), ('might', 0.029), ('assigned', 0.029), ('rd', 0.029), ('entropy', 0.029), ('calculate', 0.028), ('detected', 0.028), ('mccallum', 0.028), ('calculation', 0.028), ('adjusted', 0.028), ('undirected', 0.028), ('likely', 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 1.0 44 nips-2004-Conditional Random Fields for Object Recognition
Author: Ariadna Quattoni, Michael Collins, Trevor Darrell
Abstract: We present a discriminative part-based approach for the recognition of object classes from unsegmented cluttered scenes. Objects are modeled as flexible constellations of parts conditioned on local observations found by an interest operator. For each object class the probability of a given assignment of parts to local features is modeled by a Conditional Random Field (CRF). We propose an extension of the CRF framework that incorporates hidden variables and combines class conditional CRFs into a unified framework for part-based object recognition. The parameters of the CRF are estimated in a maximum likelihood framework and recognition proceeds by finding the most likely class under our model. The main advantage of the proposed CRF framework is that it allows us to relax the assumption of conditional independence of the observed data (i.e. local features) often used in generative approaches, an assumption that might be too restrictive for a considerable number of object classes.
2 0.20765005 99 nips-2004-Learning Hyper-Features for Visual Identification
Author: Andras D. Ferencz, Erik G. Learned-miller, Jitendra Malik
Abstract: We address the problem of identifying specific instances of a class (cars) from a set of images all belonging to that class. Although we cannot build a model for any particular instance (as we may be provided with only one “training” example of it), we can use information extracted from observing other members of the class. We pose this task as a learning problem, in which the learner is given image pairs, labeled as matching or not, and must discover which image features are most consistent for matching instances and discriminative for mismatches. We explore a patch based representation, where we model the distributions of similarity measurements defined on the patches. Finally, we describe an algorithm that selects the most salient patches based on a mutual information criterion. This algorithm performs identification well for our challenging dataset of car images, after matching only a few, well chosen patches. 1
3 0.17057188 67 nips-2004-Exponentiated Gradient Algorithms for Large-margin Structured Classification
Author: Peter L. Bartlett, Michael Collins, Ben Taskar, David A. McAllester
Abstract: We consider the problem of structured classification, where the task is to predict a label y from an input x, and y has meaningful internal structure. Our framework includes supervised training of Markov random fields and weighted context-free grammars as special cases. We describe an algorithm that solves the large-margin optimization problem defined in [12], using an exponential-family (Gibbs distribution) representation of structured objects. The algorithm is efficient—even in cases where the number of labels y is exponential in size—provided that certain expectations under Gibbs distributions can be calculated efficiently. The method for structured labels relies on a more general result, specifically the application of exponentiated gradient updates [7, 8] to quadratic programs. 1
4 0.15343969 47 nips-2004-Contextual Models for Object Detection Using Boosted Random Fields
Author: Antonio Torralba, Kevin P. Murphy, William T. Freeman
Abstract: We seek to both detect and segment objects in images. To exploit both local image data as well as contextual information, we introduce Boosted Random Fields (BRFs), which uses Boosting to learn the graph structure and local evidence of a conditional random field (CRF). The graph structure is learned by assembling graph fragments in an additive model. The connections between individual pixels are not very informative, but by using dense graphs, we can pool information from large regions of the image; dense models also support efficient inference. We show how contextual information from other objects can improve detection performance, both in terms of accuracy and speed, by using a computational cascade. We apply our system to detect stuff and things in office and street scenes. 1
5 0.15176944 40 nips-2004-Common-Frame Model for Object Recognition
Author: Pierre Moreels, Pietro Perona
Abstract: A generative probabilistic model for objects in images is presented. An object consists of a constellation of features. Feature appearance and pose are modeled probabilistically. Scene images are generated by drawing a set of objects from a given database, with random clutter sprinkled on the remaining image surface. Occlusion is allowed. We study the case where features from the same object share a common reference frame. Moreover, parameters for shape and appearance densities are shared across features. This is to be contrasted with previous work on probabilistic ‘constellation’ models where features depend on each other, and each feature and model have different pose and appearance statistics [1, 2]. These two differences allow us to build models containing hundreds of features, as well as to train each model from a single example. Our model may also be thought of as a probabilistic revisitation of Lowe’s model [3, 4]. We propose an efficient entropy-minimization inference algorithm that constructs the best interpretation of a scene as a collection of objects and clutter. We test our ideas with experiments on two image databases. We compare with Lowe’s algorithm and demonstrate better performance, in particular in presence of large amounts of background clutter.
6 0.14040166 162 nips-2004-Semi-Markov Conditional Random Fields for Information Extraction
7 0.14008252 66 nips-2004-Exponential Family Harmoniums with an Application to Information Retrieval
8 0.13924062 178 nips-2004-Support Vector Classification with Input Data Uncertainty
9 0.1316395 16 nips-2004-Adaptive Discriminative Generative Model and Its Applications
10 0.10988314 134 nips-2004-Object Classification from a Single Example Utilizing Class Relevance Metrics
11 0.10466582 73 nips-2004-Generative Affine Localisation and Tracking
12 0.099237919 83 nips-2004-Incremental Learning for Visual Tracking
13 0.098429821 139 nips-2004-Optimal Aggregation of Classifiers and Boosting Maps in Functional Magnetic Resonance Imaging
14 0.094497323 187 nips-2004-The Entire Regularization Path for the Support Vector Machine
15 0.093613401 174 nips-2004-Spike Sorting: Bayesian Clustering of Non-Stationary Data
16 0.093404151 13 nips-2004-A Three Tiered Approach for Articulated Object Action Modeling and Recognition
17 0.090083145 9 nips-2004-A Method for Inferring Label Sampling Mechanisms in Semi-Supervised Learning
18 0.088368088 89 nips-2004-Joint MRI Bias Removal Using Entropy Minimization Across Images
19 0.087573737 11 nips-2004-A Second Order Cone programming Formulation for Classifying Missing Data
20 0.084801875 192 nips-2004-The power of feature clustering: An application to object detection
topicId topicWeight
[(0, -0.274), (1, 0.081), (2, -0.06), (3, -0.131), (4, 0.244), (5, 0.079), (6, 0.094), (7, -0.032), (8, -0.143), (9, 0.044), (10, 0.107), (11, 0.063), (12, 0.028), (13, 0.17), (14, 0.036), (15, 0.081), (16, 0.061), (17, 0.136), (18, 0.047), (19, -0.021), (20, 0.012), (21, 0.017), (22, -0.016), (23, -0.018), (24, -0.108), (25, 0.093), (26, 0.039), (27, -0.08), (28, -0.096), (29, -0.161), (30, -0.071), (31, -0.061), (32, -0.049), (33, -0.056), (34, 0.038), (35, 0.028), (36, 0.102), (37, -0.07), (38, -0.093), (39, 0.018), (40, -0.117), (41, -0.015), (42, -0.005), (43, 0.056), (44, 0.051), (45, -0.07), (46, -0.034), (47, -0.037), (48, -0.005), (49, -0.035)]
simIndex simValue paperId paperTitle
same-paper 1 0.95415139 44 nips-2004-Conditional Random Fields for Object Recognition
Author: Ariadna Quattoni, Michael Collins, Trevor Darrell
Abstract: We present a discriminative part-based approach for the recognition of object classes from unsegmented cluttered scenes. Objects are modeled as flexible constellations of parts conditioned on local observations found by an interest operator. For each object class the probability of a given assignment of parts to local features is modeled by a Conditional Random Field (CRF). We propose an extension of the CRF framework that incorporates hidden variables and combines class conditional CRFs into a unified framework for part-based object recognition. The parameters of the CRF are estimated in a maximum likelihood framework and recognition proceeds by finding the most likely class under our model. The main advantage of the proposed CRF framework is that it allows us to relax the assumption of conditional independence of the observed data (i.e. local features) often used in generative approaches, an assumption that might be too restrictive for a considerable number of object classes.
2 0.68969101 99 nips-2004-Learning Hyper-Features for Visual Identification
Author: Andras D. Ferencz, Erik G. Learned-miller, Jitendra Malik
Abstract: We address the problem of identifying specific instances of a class (cars) from a set of images all belonging to that class. Although we cannot build a model for any particular instance (as we may be provided with only one “training” example of it), we can use information extracted from observing other members of the class. We pose this task as a learning problem, in which the learner is given image pairs, labeled as matching or not, and must discover which image features are most consistent for matching instances and discriminative for mismatches. We explore a patch based representation, where we model the distributions of similarity measurements defined on the patches. Finally, we describe an algorithm that selects the most salient patches based on a mutual information criterion. This algorithm performs identification well for our challenging dataset of car images, after matching only a few, well chosen patches. 1
3 0.67162424 40 nips-2004-Common-Frame Model for Object Recognition
Author: Pierre Moreels, Pietro Perona
Abstract: A generative probabilistic model for objects in images is presented. An object consists of a constellation of features. Feature appearance and pose are modeled probabilistically. Scene images are generated by drawing a set of objects from a given database, with random clutter sprinkled on the remaining image surface. Occlusion is allowed. We study the case where features from the same object share a common reference frame. Moreover, parameters for shape and appearance densities are shared across features. This is to be contrasted with previous work on probabilistic ‘constellation’ models where features depend on each other, and each feature and model have different pose and appearance statistics [1, 2]. These two differences allow us to build models containing hundreds of features, as well as to train each model from a single example. Our model may also be thought of as a probabilistic revisitation of Lowe’s model [3, 4]. We propose an efficient entropy-minimization inference algorithm that constructs the best interpretation of a scene as a collection of objects and clutter. We test our ideas with experiments on two image databases. We compare with Lowe’s algorithm and demonstrate better performance, in particular in presence of large amounts of background clutter.
4 0.62912834 162 nips-2004-Semi-Markov Conditional Random Fields for Information Extraction
Author: Sunita Sarawagi, William W. Cohen
Abstract: We describe semi-Markov conditional random fields (semi-CRFs), a conditionally trained version of semi-Markov chains. Intuitively, a semiCRF on an input sequence x outputs a “segmentation” of x, in which labels are assigned to segments (i.e., subsequences) of x rather than to individual elements xi of x. Importantly, features for semi-CRFs can measure properties of segments, and transitions within a segment can be non-Markovian. In spite of this additional power, exact learning and inference algorithms for semi-CRFs are polynomial-time—often only a small constant factor slower than conventional CRFs. In experiments on five named entity recognition problems, semi-CRFs generally outperform conventional CRFs. 1
5 0.61339074 47 nips-2004-Contextual Models for Object Detection Using Boosted Random Fields
Author: Antonio Torralba, Kevin P. Murphy, William T. Freeman
Abstract: We seek to both detect and segment objects in images. To exploit both local image data as well as contextual information, we introduce Boosted Random Fields (BRFs), which uses Boosting to learn the graph structure and local evidence of a conditional random field (CRF). The graph structure is learned by assembling graph fragments in an additive model. The connections between individual pixels are not very informative, but by using dense graphs, we can pool information from large regions of the image; dense models also support efficient inference. We show how contextual information from other objects can improve detection performance, both in terms of accuracy and speed, by using a computational cascade. We apply our system to detect stuff and things in office and street scenes. 1
6 0.60250384 67 nips-2004-Exponentiated Gradient Algorithms for Large-margin Structured Classification
7 0.59811127 43 nips-2004-Conditional Models of Identity Uncertainty with Application to Noun Coreference
8 0.5231598 73 nips-2004-Generative Affine Localisation and Tracking
9 0.52253085 66 nips-2004-Exponential Family Harmoniums with an Application to Information Retrieval
10 0.5162183 192 nips-2004-The power of feature clustering: An application to object detection
11 0.51166904 25 nips-2004-Assignment of Multiplicative Mixtures in Natural Images
12 0.50375551 41 nips-2004-Comparing Beliefs, Surveys, and Random Walks
13 0.48671484 14 nips-2004-A Topographic Support Vector Machine: Classification Using Local Label Configurations
14 0.47317815 178 nips-2004-Support Vector Classification with Input Data Uncertainty
15 0.4716261 19 nips-2004-An Application of Boosting to Graph Classification
16 0.44924843 111 nips-2004-Maximal Margin Labeling for Multi-Topic Text Categorization
17 0.44719785 16 nips-2004-Adaptive Discriminative Generative Model and Its Applications
18 0.40925935 9 nips-2004-A Method for Inferring Label Sampling Mechanisms in Semi-Supervised Learning
19 0.40489349 191 nips-2004-The Variational Ising Classifier (VIC) Algorithm for Coherently Contaminated Data
20 0.40458134 55 nips-2004-Distributed Occlusion Reasoning for Tracking with Nonparametric Belief Propagation
topicId topicWeight
[(9, 0.011), (13, 0.094), (15, 0.148), (17, 0.015), (21, 0.022), (26, 0.05), (27, 0.214), (31, 0.029), (33, 0.232), (35, 0.027), (39, 0.046), (50, 0.028)]
simIndex simValue paperId paperTitle
1 0.89895076 11 nips-2004-A Second Order Cone programming Formulation for Classifying Missing Data
Author: Chiranjib Bhattacharyya, Pannagadatta K. Shivaswamy, Alex J. Smola
Abstract: We propose a convex optimization based strategy to deal with uncertainty in the observations of a classification problem. We assume that instead of a sample (xi , yi ) a distribution over (xi , yi ) is specified. In particular, we derive a robust formulation when the distribution is given by a normal distribution. It leads to Second Order Cone Programming formulation. Our method is applied to the problem of missing data, where it outperforms direct imputation. 1
same-paper 2 0.87381709 44 nips-2004-Conditional Random Fields for Object Recognition
Author: Ariadna Quattoni, Michael Collins, Trevor Darrell
Abstract: We present a discriminative part-based approach for the recognition of object classes from unsegmented cluttered scenes. Objects are modeled as flexible constellations of parts conditioned on local observations found by an interest operator. For each object class the probability of a given assignment of parts to local features is modeled by a Conditional Random Field (CRF). We propose an extension of the CRF framework that incorporates hidden variables and combines class conditional CRFs into a unified framework for part-based object recognition. The parameters of the CRF are estimated in a maximum likelihood framework and recognition proceeds by finding the most likely class under our model. The main advantage of the proposed CRF framework is that it allows us to relax the assumption of conditional independence of the observed data (i.e. local features) often used in generative approaches, an assumption that might be too restrictive for a considerable number of object classes.
3 0.8501752 167 nips-2004-Semi-supervised Learning with Penalized Probabilistic Clustering
Author: Zhengdong Lu, Todd K. Leen
Abstract: While clustering is usually an unsupervised operation, there are circumstances in which we believe (with varying degrees of certainty) that items A and B should be assigned to the same cluster, while items A and C should not. We would like such pairwise relations to influence cluster assignments of out-of-sample data in a manner consistent with the prior knowledge expressed in the training set. Our starting point is probabilistic clustering based on Gaussian mixture models (GMM) of the data distribution. We express clustering preferences in the prior distribution over assignments of data points to clusters. This prior penalizes cluster assignments according to the degree with which they violate the preferences. We fit the model parameters with EM. Experiments on a variety of data sets show that PPC can consistently improve clustering results.
4 0.80439758 31 nips-2004-Blind One-microphone Speech Separation: A Spectral Learning Approach
Author: Francis R. Bach, Michael I. Jordan
Abstract: We present an algorithm to perform blind, one-microphone speech separation. Our algorithm separates mixtures of speech without modeling individual speakers. Instead, we formulate the problem of speech separation as a problem in segmenting the spectrogram of the signal into two or more disjoint sets. We build feature sets for our segmenter using classical cues from speech psychophysics. We then combine these features into parameterized affinity matrices. We also take advantage of the fact that we can generate training examples for segmentation by artificially superposing separately-recorded signals. Thus the parameters of the affinity matrices can be tuned using recent work on learning spectral clustering [1]. This yields an adaptive, speech-specific segmentation algorithm that can successfully separate one-microphone speech mixtures. 1
5 0.80187058 2 nips-2004-A Direct Formulation for Sparse PCA Using Semidefinite Programming
Author: Alexandre D'aspremont, Laurent E. Ghaoui, Michael I. Jordan, Gert R. Lanckriet
Abstract: We examine the problem of approximating, in the Frobenius-norm sense, a positive, semidefinite symmetric matrix by a rank-one matrix, with an upper bound on the cardinality of its eigenvector. The problem arises in the decomposition of a covariance matrix into sparse factors, and has wide applications ranging from biology to finance. We use a modification of the classical variational representation of the largest eigenvalue of a symmetric matrix, where cardinality is constrained, and derive a semidefinite programming based relaxation for our problem. 1
6 0.80066764 99 nips-2004-Learning Hyper-Features for Visual Identification
7 0.80021119 133 nips-2004-Nonparametric Transforms of Graph Kernels for Semi-Supervised Learning
8 0.7999391 163 nips-2004-Semi-parametric Exponential Family PCA
9 0.79976821 55 nips-2004-Distributed Occlusion Reasoning for Tracking with Nonparametric Belief Propagation
10 0.79923379 174 nips-2004-Spike Sorting: Bayesian Clustering of Non-Stationary Data
11 0.79903936 102 nips-2004-Learning first-order Markov models for control
12 0.7986027 127 nips-2004-Neighbourhood Components Analysis
13 0.79853755 178 nips-2004-Support Vector Classification with Input Data Uncertainty
14 0.79819006 161 nips-2004-Self-Tuning Spectral Clustering
15 0.79813325 187 nips-2004-The Entire Regularization Path for the Support Vector Machine
16 0.79802734 77 nips-2004-Hierarchical Clustering of a Mixture Model
17 0.79785901 3 nips-2004-A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound
18 0.79776657 207 nips-2004-ℓ₀-norm Minimization for Basis Selection
19 0.79772049 125 nips-2004-Multiple Relational Embedding
20 0.79730916 131 nips-2004-Non-Local Manifold Tangent Learning