nips nips2004 nips2004-40 knowledge-graph by maker-knowledge-mining

40 nips-2004-Common-Frame Model for Object Recognition


Source: pdf

Author: Pierre Moreels, Pietro Perona

Abstract: A generative probabilistic model for objects in images is presented. An object consists of a constellation of features. Feature appearance and pose are modeled probabilistically. Scene images are generated by drawing a set of objects from a given database, with random clutter sprinkled on the remaining image surface. Occlusion is allowed. We study the case where features from the same object share a common reference frame. Moreover, parameters for shape and appearance densities are shared across features. This is to be contrasted with previous work on probabilistic ‘constellation’ models where features depend on each other, and each feature and model have different pose and appearance statistics [1, 2]. These two differences allow us to build models containing hundreds of features, as well as to train each model from a single example. Our model may also be thought of as a probabilistic revisitation of Lowe’s model [3, 4]. We propose an efficient entropy-minimization inference algorithm that constructs the best interpretation of a scene as a collection of objects and clutter. We test our ideas with experiments on two image databases. We compare with Lowe’s algorithm and demonstrate better performance, in particular in presence of large amounts of background clutter.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 A generative probabilistic model for objects in images is presented. [sent-3, score-0.288]

2 Scene images are generated by drawing a set of objects from a given database, with random clutter sprinkled on the remaining image surface. [sent-6, score-0.492]

3 We study the case where features from the same object share a common reference frame. [sent-8, score-0.288]

4 Moreover, parameters for shape and appearance densities are shared across features. [sent-9, score-0.223]

5 This is to be contrasted with previous work on probabilistic ‘constellation’ models where features depend on each other, and each feature and model have different pose and appearance statistics [1, 2]. [sent-10, score-0.657]

6 We propose an efficient entropy-minimization inference algorithm that constructs the best interpretation of a scene as a collection of objects and clutter. [sent-13, score-0.234]

7 We test our ideas with experiments on two image databases. [sent-14, score-0.206]

8 (Section 1: Introduction.) There is broad agreement in the machine vision literature that objects and object categories should be represented as collections of features or parts with distinctive appearance and mutual position [1, 2, 4, 5, 6, 7, 8, 9]. [sent-16, score-0.69]

9 Detection methods (e.g. for faces) have been proposed by virtually all the cited authors; far fewer have been proposed for recognition (list all objects and their pose in a given image), where matching would ideally take logarithmic time with respect to the number of available models [3, 4]. [sent-19, score-0.346]

10 This work is based on two complementary efforts: the deterministic recognition system proposed by Lowe [3, 4], and the probabilistic constellation models by Perona and collaborators [1, 2]. [sent-21, score-0.267]

11 The first line of work has three attractive characteristics: objects are represented with hundreds of features, thus increasing robustness; models are learned from a single training example; last but not least, recognition is efficient with databases of hundreds of objects. [sent-22, score-0.326]

12 The drawback of Lowe’s approach is that both modeling decisions and algorithms rely on heuristics, whose design and performance may be far from optimal. (Figure 1: Diagram of our recognition model showing database, test image and two competing hypotheses.) [sent-23, score-0.336]

13 To avoid a cluttered diagram, only one partial hypothesis is displayed for each hypothesis. [sent-24, score-0.351]

14 The predicted positions of the models according to the hypotheses are overlaid on the test image. [sent-25, score-0.325]

15 Conversely, the second line of work is based on principled probabilistic object models which yield principled and, in some respects, optimal algorithms for learning and recognition/detection. [sent-27, score-0.209]

16 A major difference with the work described here lies in the probabilistic treatment of hypotheses, which allows us to use the hypothesis likelihood directly as a guide for the search, instead of the arbitrary admissible heuristic required by A*. [sent-32, score-0.259]

17 Each model is the set of features extracted from one training image of a given object - although this could be generalized to features from many images of the same object. [sent-35, score-0.658]

18 Models are indexed by k and denoted by m_k, while indices i and j are used respectively for features extracted from the test image and from model images: f_i denotes the i-th test feature, while f_j^k denotes the j-th feature from the k-th model. [sent-36, score-0.814]

19 The features extracted from model images (training set) form the database. [sent-37, score-0.3]

20 A feature detected in a test image can be a consequence of the presence of a model object in the image, in which case it should be associated to a feature from the database. [sent-38, score-0.614]

21 In the alternative, this feature is attributed to a clutter - or background - detection. [sent-39, score-0.329]

22 The geometric information associated to each feature contains position information (x and y coordinates, denoted by the vector x), orientation (denoted by θ) and scale (denoted by σ). [sent-40, score-0.343]

23 It is denoted by X_i = (x_i, θ_i, σ_i) for test feature f_i and X_j^k = (x_j^k, θ_j^k, σ_j^k) for model feature f_j^k. [sent-41, score-0.602]

24 This geometric information is measured relative to the standard reference frame of the image in which the feature has been detected. [sent-42, score-0.413]

25 All features extracted from the same image share the same reference frame. [sent-43, score-0.343]

26 The appearance information associated to a feature is a descriptor characterizing the local image appearance near this feature. [sent-44, score-0.668]

27 The measured appearance information is denoted by A_i for test feature f_i and A_j^k for model feature f_j^k. [sent-45, score-0.582]

28 In our experiments, features are detected at multiple scales at the extrema of difference-of-Gaussians filtered versions of the image [4, 12]. [sent-46, score-0.285]
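
To make the notation above concrete, here is a minimal sketch (not the authors' code; the field names are our own) of a feature record holding the geometric information X = (x, θ, σ) and the appearance descriptor A:

from dataclasses import dataclass
import numpy as np

@dataclass
class Feature:
    # geometric information X, measured in the reference frame of the image
    # in which the feature was detected
    x: float        # position, x coordinate
    y: float        # position, y coordinate
    theta: float    # orientation
    sigma: float    # scale
    # appearance information A: descriptor of the local image neighborhood
    descriptor: np.ndarray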

29 A partial hypothesis h explains the observations made in a fraction of the test image. [sent-48, score-0.467]

30 It combines a model image m_h and a corresponding set of pose parameters X_h. [sent-49, score-0.246]

31 This allows us to search in parallel for multiple objects in a test image. [sent-56, score-0.241]

32 A hypothesis H is the combination of several partial hypotheses, such that it completely explains the observations made in the test image. [sent-57, score-0.467]

33 A special notation H_0 or h_0 denotes any (partial) hypothesis that states that no model object is present in a given fraction of the test image, and that features that could have been detected there are due to clutter. [sent-58, score-0.616]

34 Our objective is to find which model objects are present in the test scene, given the observations made in the test scene and the information that is present in the database. [sent-59, score-0.45]

35 In probabilistic terms, we look for hypotheses H for which the likelihood ratio LR(H) = \frac{P(H \mid \{f_i\}, \{f_j^k\})}{P(H_0 \mid \{f_i\}, \{f_j^k\})} > 1. [sent-60, score-0.216]

36 A key assumption of this work is that once the pose parameters of the objects (and thus their reference frames) are known, the geometric configuration and appearance of the test features are independent of each other. [sent-63, score-0.787]

37 We also assume independence between features associated to models and features associated to clutter detections, as well as independence between separate clutter detections. [sent-64, score-0.776]

38 Assignment vectors v represent matches between features from the test scene, and model features or clutter. [sent-67, score-0.473]

39 The dimension of each assignment vector is the number of test features n_test. [sent-68, score-0.45]

40 Its i-th component v(i) = (k, j) denotes that the test feature f_i is matched to f_{v(i)} = f_j^k, the j-th feature from model m_k. [sent-69, score-0.645]

41 The set V_H of assignment vectors compatible with a hypothesis H consists of those that assign test features only to models present in H (and to clutter). [sent-71, score-0.671]

42 In particular, the only assignment vector compatible with h_0 is v_0 such that ∀i, v_0(i) = (0, 0). [sent-72, score-0.219]
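
As a small illustration of this bookkeeping (a sketch under our own naming conventions, not taken from the paper), an assignment vector can be stored as a list of (k, j) pairs, with (0, 0) marking a clutter detection, and compatibility with a hypothesis H reduces to checking which models the non-clutter entries point to:

def is_compatible(v, hypothesis_models):
    # v: one (k, j) pair per test feature; (0, 0) means the feature is clutter.
    # hypothesis_models: set of model indices k referenced by the hypothesis H.
    return all(k == 0 or k in hypothesis_models for (k, j) in v)

# The null hypothesis h0 only admits the all-clutter assignment vector v0.
v0 = [(0, 0)] * 5                    # example with n_test = 5 test features
assert is_compatible(v0, set())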

43 We obtain LR(H) = \frac{P(H)}{P(H_0)} \sum_{v \in V_H} \prod_{h \in H} P(v \mid \{f_j^k\}, m_h, X_h) \prod_{i: f_i \in h} \frac{P(f_i \mid f_{v(i)}, m_h, X_h)}{P(f_i \mid h_0)} (2). P(H) is a prior on hypotheses; we assume it is constant. [sent-73, score-0.43]

44 The term P(v \mid \{f_j^k\}, m_h, X_h) is discussed in 3. [sent-74, score-0.215]

45 • P(f_i \mid f_{v(i)}, m_h, X_h): f_i and f_{v(i)} are believed to be one and the same feature. [sent-76, score-0.469]

46 This noise probability p_n encodes differences in appearance of the descriptors, but also in geometry, i.e. [sent-78, score-0.193]

47 position, scale, orientation. Assuming independence between appearance information and geometry information, p_n(f_i \mid f_j^k, m_h, X_h) = p_{n,A}(A_i \mid A_{v(i)}, m_h, X_h) \cdot p_{n,X}(X_i \mid X_{v(i)}, m_h, X_h) (3). (Figure 2: Snapshots from the iterative matching process.) [sent-80, score-1.049]
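
A possible sketch of the factorization in (3), assuming for illustration that both terms are diagonal Gaussians (the actual densities are not specified here; sigma_app and sigma_geom are placeholder widths, and the Feature fields are those sketched earlier):

import numpy as np

def gaussian_pdf(x, mu, sigma):
    # Diagonal Gaussian density; sigma may be a scalar or a per-dimension array.
    d = np.atleast_1d(np.asarray(x, dtype=float) - np.asarray(mu, dtype=float))
    s = np.broadcast_to(np.asarray(sigma, dtype=float), d.shape)
    norm = np.prod(np.sqrt(2.0 * np.pi) * s)
    return float(np.exp(-0.5 * np.sum((d / s) ** 2)) / norm)

def p_noise(test_feat, predicted_feat, sigma_app, sigma_geom):
    # Eq. (3): p_n(f_i | f_v(i), m_h, X_h) = p_nA(appearance) * p_nX(geometry).
    # predicted_feat is the model feature mapped into the test image by the pose X_h.
    p_app = gaussian_pdf(test_feat.descriptor, predicted_feat.descriptor, sigma_app)
    p_geom = gaussian_pdf(
        [test_feat.x, test_feat.y, test_feat.theta, test_feat.sigma],
        [predicted_feat.x, predicted_feat.y, predicted_feat.theta, predicted_feat.sigma],
        sigma_geom)
    return p_app * p_geom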

48 Two competing hypotheses are displayed (top and bottom rows): a) each assignment vector contains one assignment, suggesting a transformation (red box); b) end of the iterative process. [sent-81, score-0.381]

49 The correct hypothesis is supported by numerous matches and high belief, while the wrong hypothesis has only weak support from a few matches and low belief. [sent-82, score-0.6]

50 The error in geometry is measured by comparing the values observed in the test image with the predicted values that would be observed if the model features were transformed according to the parameters X_h. [sent-83, score-0.316]

51 (Section 3: Search for the best interpretation of the test image.) The building block of the recognition process is a question, comparing a feature from a database model with a feature of the test image. [sent-89, score-0.738]

52 A question selects a feature from the database, and tries to identify if and where this feature appears in the test image. [sent-90, score-0.355]

53 (Section 3.1: Assignment vectors compatible with hypotheses.) For a given hypothesis H, the set of possible assignment vectors V_H is too large for explicit exploration. [sent-92, score-0.57]

54 In particular, each assignment vector v and each model referenced in v implies a set of pose parameters X_v (extracted e.g. [sent-95, score-0.311]

55 Therefore, the term P(v \mid \{f_j^k\}, m_h, X_h) from (2) will be significant only when X_v \approx X_h, i.e. [sent-98, score-0.215]

56 when the pose implied by the assignment vector agrees with the pose specified by the partial hypothesis. [sent-100, score-0.511]

57 (2) becomes LR(H) \approx \frac{P(H)}{P(H_0)} \prod_{h \in H} \prod_{i: f_i \in h} \frac{P(f_i \mid f_{v_h(i)}, m_h, X_h)}{P(f_i \mid h_0)} (6). Our recognition system proceeds by asking questions sequentially and adding matches to assignment vectors. [sent-104, score-0.643]

58 It is therefore natural to define, for a given hypothesis H with corresponding assignment vector v_H and t ≤ n_test, the belief in v_H by B_0(v_H) = 1, B_t(v_H) = \frac{p_n(f_t \mid f_{v(t)}, m_{h_t}, X_{h_t})}{p_{bg}(f_t \mid h_0)} \cdot B_{t-1}(v_H) (7). [sent-105, score-0.949]

59 The geometric part of the belief (cf. (3)-(5)) characterizes how close the pose X_v implied by the assignments is to the pose X_h specified by the hypothesis. [sent-106, score-0.322]

60 The appearance component of the belief characterizes the quality of the appearance match for the pairs (f_i, f_{v(i)}). [sent-107, score-0.485]
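
The recursion in (7) amounts to multiplying in one foreground-to-background likelihood ratio per accepted match. A minimal sketch (p_n and p_bg stand for the foreground density of Eq. (3) and the clutter density; their exact signatures here are placeholders):

def update_belief(b_prev, f_t, f_matched, p_n, p_bg):
    # One step of Eq. (7): B_t = [p_n(f_t | f_v(t), m_h, X_h) / p_bg(f_t | h0)] * B_{t-1},
    # starting from B_0 = 1.  p_n and p_bg are callables returning the foreground
    # and background densities for the test feature f_t.
    return b_prev * p_n(f_t, f_matched) / p_bg(f_t)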

61 (Section 3.2: Entropy-based optimization.) Our goal is to quickly find the hypothesis that best explains the observations, i.e. [sent-109, score-0.276]

62 the hypothesis (models+poses) that has the highest likelihood ratio. [sent-111, score-0.197]

63 We compute such a hypothesis incrementally by asking questions sequentially. [sent-112, score-0.267]

64 A detection is declared (i.e. a given model is present in the image) as soon as the belief of a corresponding hypothesis exceeds a given confidence threshold. [sent-116, score-0.296]

65 Therefore we approximate the MEE strategy with a simple heuristic: The next question consists of attempting to match one feature of the highest-belief model; specifically, the feature with best appearance match to a feature in the test image. [sent-126, score-0.889]
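
A sketch of this question-selection heuristic (our own structuring; the appearance distance is taken to be Euclidean on the descriptors, and the data-structure names are hypothetical):

import numpy as np

def next_question(hypotheses, database, test_features):
    # hypotheses: list of (belief, model_id) pairs; database: dict mapping model_id
    # to the list of features extracted from that model image.
    # Pick the model of the highest-belief hypothesis, then the
    # (model feature, test feature) pair with the smallest appearance distance.
    _, best_model = max(hypotheses, key=lambda hb: hb[0])
    best_pair, best_dist = None, float("inf")
    for f_j in database[best_model]:
        for f_i in test_features:
            d = float(np.linalg.norm(f_i.descriptor - f_j.descriptor))
            if d < best_dist:
                best_pair, best_dist = (f_j, f_i), d
    return best_pair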

66 Questions to be examined are created by pairing database features to the test features closest in terms of appearance. [sent-129, score-0.437]

67 Note that since features encode location, orientation and scale, any single assignment between a test feature and a model feature contains enough information to characterize a similarity transformation. [sent-130, score-0.718]

68 It is therefore natural to restrict the set of possible transformations to similarities, and to insert each candidate assignment in the corresponding geometric hash table entry. [sent-131, score-0.509]

69 The set of hypotheses is initialized to the centers of the hash table entries, and their beliefs are set to 1. [sent-133, score-0.349]

70 A partial hypothesis corresponds to a hash table entry; we consider only the candidate assignments that fall into this same entry. [sent-135, score-0.546]
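
As an illustration of this hashing step (a sketch; the bin widths below are arbitrary placeholders, not values from the paper), a single (model feature, test feature) assignment yields a similarity transform, which is then discretized so that assignments implying similar poses land in the same hash-table entry:

import numpy as np

def similarity_from_match(model_feat, test_feat):
    # Scale and rotation from the ratio of scales and difference of orientations,
    # translation from the feature positions.
    s = test_feat.sigma / model_feat.sigma
    r = test_feat.theta - model_feat.theta
    tx = test_feat.x - s * (np.cos(r) * model_feat.x - np.sin(r) * model_feat.y)
    ty = test_feat.y - s * (np.sin(r) * model_feat.x + np.cos(r) * model_feat.y)
    return s, r, tx, ty

def hash_entry(transform, bins=(0.5, np.pi / 6, 50.0, 50.0)):
    # Discretize (log-scale, rotation, tx, ty) into a hash-table key.
    s, r, tx, ty = transform
    keys = (np.log2(s), r, tx, ty)
    return tuple(int(np.floor(k / b)) for k, b in zip(keys, bins))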

71 The hypothesis H that currently has the highest likelihood ratio is selected. [sent-137, score-0.197]

72 As discussed in 3.1, only the best assignment vector is explored. (Figure 3: Results from our algorithm in various situations; viewpoint change can be seen in Fig.) [sent-140, score-0.22]

73 Each row shows the best hypothesis in terms of belief. [sent-142, score-0.241]

74 The threshold used is the repeatability rate defined in [15]. If p_n(f_i \mid f_{j_h}^{m_h}, m_h, X_h) > p_{bg}(f_i), the match is accepted and inserted into the hypothesis. [sent-146, score-0.603]

75 In the alternative, f_i is considered a clutter detection and f_{j_h}^{m_h} is a missed detection. [sent-147, score-0.248]

76 After adding an assignment to a hypothesis, the frame parameters X_h are recomputed using least-squares optimization, based on all assignments currently associated to this hypothesis. [sent-149, score-0.275]
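
A sketch of such a least-squares refinement, estimating a 2D similarity (scale, rotation, translation) from all currently matched feature positions (standard linear least squares; not necessarily the authors' exact parameterization):

import numpy as np

def refine_similarity(model_xy, test_xy):
    # Solve x' = a*x - b*y + tx,  y' = b*x + a*y + ty  in the least-squares sense,
    # where a = s*cos(r) and b = s*sin(r).
    # model_xy, test_xy: (n, 2) arrays of matched feature positions.
    n = model_xy.shape[0]
    A = np.zeros((2 * n, 4))
    rhs = test_xy.reshape(-1)
    A[0::2, 0] = model_xy[:, 0]; A[0::2, 1] = -model_xy[:, 1]; A[0::2, 2] = 1.0
    A[1::2, 0] = model_xy[:, 1]; A[1::2, 1] = model_xy[:, 0]; A[1::2, 3] = 1.0
    (a, b, tx, ty), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return np.hypot(a, b), np.arctan2(b, a), tx, ty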

77 This parameter estimation step provides a progressive refinement of the model pose parameters as assignments are added. [sent-150, score-0.189]

78 The exploration of a partial hypothesis ends when no more candidate match is available in the hash table entry. [sent-153, score-0.603]

79 The search ends when all test scene features have been matched or assigned to clutter. [sent-155, score-0.357]

80 (Section 4.1: Experimental setting.) We tested our algorithm on two sets of images, containing respectively 49 and 161 model images, and 101 and 51 test images (sets PM-gadgets-03 and JP-3Dobjects-04, available from http : //www. [sent-157, score-0.204]

81 Test images contained from zero (negative examples) to five objects, for a total of 178 objects in the first set, and 79 objects in the second set. [sent-163, score-0.31]

82 A large fraction of each test image consists of background. [sent-164, score-0.206]

83 The objects were always moved between model images and test images. [sent-168, score-0.321]

84 The images of model objects used in the learning stage were downsampled to fit in a 500 × 500 pixel box; the test images were downsampled to 800 × 800 pixels. [sent-169, score-0.469]

85 With these settings, the number of features generated by the feature detector was of the order of 1000 per training image and 2000-4000 per test image. [sent-170, score-0.466]

86 A ground truth model was created by cutting a rectangle from the test image and adding noise. [sent-172, score-0.272]

87 The two rows show the best and second best model found by each algorithm (estimated frame position shown by the red box, features that found a match are shown in yellow). [sent-174, score-0.423]

88 In challenging situations with multiple objects or textured clutter, our method performs a more systematic check on geometric consistency by updating likelihoods every time a match is added. [sent-182, score-0.343]

89 Hypotheses starting with wrong matches due to clutter don’t find further supporting matches, and are easily discarded by a threshold based on the number of matches. [sent-183, score-0.309]

90 Conversely, Lowe’s algorithm checks geometric consistency as a last step of the recognition process, but needs to allow for a large slop in the transformation parameters. [sent-184, score-0.206]

91 Spurious matches induced by clutter detections may still be accepted, thus leading to the acceptance of incorrect hypotheses. [sent-185, score-0.336]

92 This is illustrated in Fig. 5: the test image consists of a picture of concrete. [sent-187, score-0.206]

93 A rectangular patch was extracted from this image, noise was added to this patch, and it was inserted in the database as a new model. [sent-188, score-0.184]

94 With our algorithm, the best hypothesis found the correct match with the patch of concrete; its best contender doesn’t succeed in collecting more than one correspondence and is discarded. [sent-189, score-0.404]

95 In Lowe’s case, other models manage to accumulate a high number of correspondences induced by texture matches among clutter detections. [sent-190, score-0.303]

96 Both curves confirm that our probabilistic interpretation leads to fewer false alarms than Lowe’s method for the same detection rate. [sent-195, score-0.201]

97 (Section 5: Conclusion.) We have proposed an object recognition method that combines the benefits of a set of rich features with those of a probabilistic model of feature positions and appearance. [sent-196, score-0.569]

98 The probabilistic model verifies the validity of candidate hypotheses in terms of appearance and geometric configuration. [sent-198, score-0.614]

99 Our system improves upon a state-of-the-art recognition method based on strict feature matching. [sent-199, score-0.204]

100 In particular, the rate of false alarms in the presence of background clutter is reduced. (Figure 6: Sample scenes and training objects from the two sets of images.) [sent-200, score-0.196]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('lowe', 0.294), ('xh', 0.293), ('fi', 0.254), ('pbg', 0.222), ('mh', 0.215), ('fj', 0.213), ('hypothesis', 0.197), ('appearance', 0.193), ('clutter', 0.188), ('assignment', 0.176), ('hypotheses', 0.154), ('vh', 0.148), ('features', 0.13), ('hash', 0.129), ('fv', 0.129), ('objects', 0.117), ('object', 0.117), ('image', 0.111), ('geometric', 0.109), ('feature', 0.107), ('partial', 0.103), ('pose', 0.102), ('mee', 0.099), ('lr', 0.098), ('recognition', 0.097), ('test', 0.095), ('matches', 0.085), ('perona', 0.082), ('match', 0.081), ('constellation', 0.078), ('images', 0.076), ('xv', 0.073), ('scene', 0.073), ('belief', 0.066), ('bg', 0.064), ('pn', 0.064), ('detections', 0.063), ('candidate', 0.063), ('probabilistic', 0.062), ('extracted', 0.061), ('detection', 0.06), ('geometry', 0.058), ('independence', 0.055), ('categories', 0.054), ('assignments', 0.054), ('poses', 0.051), ('displayed', 0.051), ('database', 0.049), ('moreels', 0.049), ('ntest', 0.049), ('accepted', 0.049), ('denoted', 0.047), ('position', 0.046), ('question', 0.046), ('th', 0.045), ('frame', 0.045), ('detected', 0.044), ('best', 0.044), ('compatible', 0.043), ('hashing', 0.043), ('discretizing', 0.043), ('hundreds', 0.041), ('reference', 0.041), ('false', 0.04), ('alarm', 0.039), ('alarms', 0.039), ('questions', 0.039), ('patch', 0.038), ('observations', 0.037), ('inserted', 0.036), ('downsampled', 0.036), ('textured', 0.036), ('wrong', 0.036), ('characterizes', 0.036), ('characterize', 0.036), ('explains', 0.035), ('descriptor', 0.034), ('geman', 0.034), ('attributed', 0.034), ('orientation', 0.034), ('created', 0.033), ('distinctive', 0.033), ('fergus', 0.033), ('model', 0.033), ('transformations', 0.032), ('asking', 0.031), ('densities', 0.03), ('ai', 0.03), ('ends', 0.03), ('characterizing', 0.03), ('ft', 0.03), ('models', 0.03), ('entropy', 0.03), ('box', 0.03), ('occlusion', 0.029), ('search', 0.029), ('strategy', 0.028), ('viewpoint', 0.028), ('lighting', 0.028), ('implied', 0.028)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999881 40 nips-2004-Common-Frame Model for Object Recognition

Author: Pierre Moreels, Pietro Perona

Abstract: A generative probabilistic model for objects in images is presented. An object consists of a constellation of features. Feature appearance and pose are modeled probabilistically. Scene images are generated by drawing a set of objects from a given database, with random clutter sprinkled on the remaining image surface. Occlusion is allowed. We study the case where features from the same object share a common reference frame. Moreover, parameters for shape and appearance densities are shared across features. This is to be contrasted with previous work on probabilistic ‘constellation’ models where features depend on each other, and each feature and model have different pose and appearance statistics [1, 2]. These two differences allow us to build models containing hundreds of features, as well as to train each model from a single example. Our model may also be thought of as a probabilistic revisitation of Lowe’s model [3, 4]. We propose an efficient entropy-minimization inference algorithm that constructs the best interpretation of a scene as a collection of objects and clutter. We test our ideas with experiments on two image databases. We compare with Lowe’s algorithm and demonstrate better performance, in particular in presence of large amounts of background clutter.

2 0.15716293 99 nips-2004-Learning Hyper-Features for Visual Identification

Author: Andras D. Ferencz, Erik G. Learned-miller, Jitendra Malik

Abstract: We address the problem of identifying specific instances of a class (cars) from a set of images all belonging to that class. Although we cannot build a model for any particular instance (as we may be provided with only one “training” example of it), we can use information extracted from observing other members of the class. We pose this task as a learning problem, in which the learner is given image pairs, labeled as matching or not, and must discover which image features are most consistent for matching instances and discriminative for mismatches. We explore a patch based representation, where we model the distributions of similarity measurements defined on the patches. Finally, we describe an algorithm that selects the most salient patches based on a mutual information criterion. This algorithm performs identification well for our challenging dataset of car images, after matching only a few, well chosen patches. 1

3 0.15232739 83 nips-2004-Incremental Learning for Visual Tracking

Author: Jongwoo Lim, David A. Ross, Ruei-sung Lin, Ming-Hsuan Yang

Abstract: Most existing tracking algorithms construct a representation of a target object prior to the tracking task starts, and utilize invariant features to handle appearance variation of the target caused by lighting, pose, and view angle change. In this paper, we present an efficient and effective online algorithm that incrementally learns and adapts a low dimensional eigenspace representation to reflect appearance changes of the target, thereby facilitating the tracking task. Furthermore, our incremental method correctly updates the sample mean and the eigenbasis, whereas existing incremental subspace update methods ignore the fact the sample mean varies over time. The tracking problem is formulated as a state inference problem within a Markov Chain Monte Carlo framework and a particle filter is incorporated for propagating sample distributions over time. Numerous experiments demonstrate the effectiveness of the proposed tracking algorithm in indoor and outdoor environments where the target objects undergo large pose and lighting changes. 1

4 0.15176944 44 nips-2004-Conditional Random Fields for Object Recognition

Author: Ariadna Quattoni, Michael Collins, Trevor Darrell

Abstract: We present a discriminative part-based approach for the recognition of object classes from unsegmented cluttered scenes. Objects are modeled as flexible constellations of parts conditioned on local observations found by an interest operator. For each object class the probability of a given assignment of parts to local features is modeled by a Conditional Random Field (CRF). We propose an extension of the CRF framework that incorporates hidden variables and combines class conditional CRFs into a unified framework for part-based object recognition. The parameters of the CRF are estimated in a maximum likelihood framework and recognition proceeds by finding the most likely class under our model. The main advantage of the proposed CRF framework is that it allows us to relax the assumption of conditional independence of the observed data (i.e. local features) often used in generative approaches, an assumption that might be too restrictive for a considerable number of object classes.

5 0.13279672 73 nips-2004-Generative Affine Localisation and Tracking

Author: John Winn, Andrew Blake

Abstract: We present an extension to the Jojic and Frey (2001) layered sprite model which allows for layers to undergo affine transformations. This extension allows for affine object pose to be inferred whilst simultaneously learning the object shape and appearance. Learning is carried out by applying an augmented variational inference algorithm which includes a global search over a discretised transform space followed by a local optimisation. To aid correct convergence, we use bottom-up cues to restrict the space of possible affine transformations. We present results on a number of video sequences and show how the model can be extended to track an object whose appearance changes throughout the sequence. 1

6 0.12930462 16 nips-2004-Adaptive Discriminative Generative Model and Its Applications

7 0.12569498 77 nips-2004-Hierarchical Clustering of a Mixture Model

8 0.11425965 13 nips-2004-A Three Tiered Approach for Articulated Object Action Modeling and Recognition

9 0.10872893 23 nips-2004-Analysis of a greedy active learning strategy

10 0.10153089 47 nips-2004-Contextual Models for Object Detection Using Boosted Random Fields

11 0.10054053 182 nips-2004-Synergistic Face Detection and Pose Estimation with Energy-Based Models

12 0.094281033 177 nips-2004-Supervised Graph Inference

13 0.089613594 192 nips-2004-The power of feature clustering: An application to object detection

14 0.085656658 134 nips-2004-Object Classification from a Single Example Utilizing Class Relevance Metrics

15 0.085578397 186 nips-2004-The Correlated Correspondence Algorithm for Unsupervised Registration of Nonrigid Surfaces

16 0.079666868 142 nips-2004-Outlier Detection with One-class Kernel Fisher Discriminants

17 0.07815896 85 nips-2004-Instance-Based Relevance Feedback for Image Retrieval

18 0.076322734 205 nips-2004-Who's In the Picture

19 0.07162331 14 nips-2004-A Topographic Support Vector Machine: Classification Using Local Label Configurations

20 0.071443081 91 nips-2004-Joint Tracking of Pose, Expression, and Texture using Conditionally Gaussian Filters


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.225), (1, 0.042), (2, -0.057), (3, -0.223), (4, 0.223), (5, 0.06), (6, 0.044), (7, -0.102), (8, -0.061), (9, 0.026), (10, -0.078), (11, 0.118), (12, 0.078), (13, 0.092), (14, -0.095), (15, -0.054), (16, 0.027), (17, 0.034), (18, -0.003), (19, -0.046), (20, 0.026), (21, 0.022), (22, 0.002), (23, 0.024), (24, 0.031), (25, -0.014), (26, 0.011), (27, -0.042), (28, 0.01), (29, -0.027), (30, -0.103), (31, -0.045), (32, 0.073), (33, 0.014), (34, -0.086), (35, 0.024), (36, 0.115), (37, 0.027), (38, -0.113), (39, 0.001), (40, -0.001), (41, 0.009), (42, 0.052), (43, 0.056), (44, -0.033), (45, 0.013), (46, 0.139), (47, -0.051), (48, 0.051), (49, 0.018)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9636538 40 nips-2004-Common-Frame Model for Object Recognition

Author: Pierre Moreels, Pietro Perona

Abstract: A generative probabilistic model for objects in images is presented. An object consists of a constellation of features. Feature appearance and pose are modeled probabilistically. Scene images are generated by drawing a set of objects from a given database, with random clutter sprinkled on the remaining image surface. Occlusion is allowed. We study the case where features from the same object share a common reference frame. Moreover, parameters for shape and appearance densities are shared across features. This is to be contrasted with previous work on probabilistic ‘constellation’ models where features depend on each other, and each feature and model have different pose and appearance statistics [1, 2]. These two differences allow us to build models containing hundreds of features, as well as to train each model from a single example. Our model may also be thought of as a probabilistic revisitation of Lowe’s model [3, 4]. We propose an efficient entropy-minimization inference algorithm that constructs the best interpretation of a scene as a collection of objects and clutter. We test our ideas with experiments on two image databases. We compare with Lowe’s algorithm and demonstrate better performance, in particular in presence of large amounts of background clutter.

2 0.72488165 73 nips-2004-Generative Affine Localisation and Tracking

Author: John Winn, Andrew Blake

Abstract: We present an extension to the Jojic and Frey (2001) layered sprite model which allows for layers to undergo affine transformations. This extension allows for affine object pose to be inferred whilst simultaneously learning the object shape and appearance. Learning is carried out by applying an augmented variational inference algorithm which includes a global search over a discretised transform space followed by a local optimisation. To aid correct convergence, we use bottom-up cues to restrict the space of possible affine transformations. We present results on a number of video sequences and show how the model can be extended to track an object whose appearance changes throughout the sequence. 1

3 0.69810236 99 nips-2004-Learning Hyper-Features for Visual Identification

Author: Andras D. Ferencz, Erik G. Learned-miller, Jitendra Malik

Abstract: We address the problem of identifying specific instances of a class (cars) from a set of images all belonging to that class. Although we cannot build a model for any particular instance (as we may be provided with only one “training” example of it), we can use information extracted from observing other members of the class. We pose this task as a learning problem, in which the learner is given image pairs, labeled as matching or not, and must discover which image features are most consistent for matching instances and discriminative for mismatches. We explore a patch based representation, where we model the distributions of similarity measurements defined on the patches. Finally, we describe an algorithm that selects the most salient patches based on a mutual information criterion. This algorithm performs identification well for our challenging dataset of car images, after matching only a few, well chosen patches. 1

4 0.60718524 44 nips-2004-Conditional Random Fields for Object Recognition

Author: Ariadna Quattoni, Michael Collins, Trevor Darrell

Abstract: We present a discriminative part-based approach for the recognition of object classes from unsegmented cluttered scenes. Objects are modeled as flexible constellations of parts conditioned on local observations found by an interest operator. For each object class the probability of a given assignment of parts to local features is modeled by a Conditional Random Field (CRF). We propose an extension of the CRF framework that incorporates hidden variables and combines class conditional CRFs into a unified framework for part-based object recognition. The parameters of the CRF are estimated in a maximum likelihood framework and recognition proceeds by finding the most likely class under our model. The main advantage of the proposed CRF framework is that it allows us to relax the assumption of conditional independence of the observed data (i.e. local features) often used in generative approaches, an assumption that might be too restrictive for a considerable number of object classes.

5 0.59548664 83 nips-2004-Incremental Learning for Visual Tracking

Author: Jongwoo Lim, David A. Ross, Ruei-sung Lin, Ming-Hsuan Yang

Abstract: Most existing tracking algorithms construct a representation of a target object prior to the tracking task starts, and utilize invariant features to handle appearance variation of the target caused by lighting, pose, and view angle change. In this paper, we present an efficient and effective online algorithm that incrementally learns and adapts a low dimensional eigenspace representation to reflect appearance changes of the target, thereby facilitating the tracking task. Furthermore, our incremental method correctly updates the sample mean and the eigenbasis, whereas existing incremental subspace update methods ignore the fact the sample mean varies over time. The tracking problem is formulated as a state inference problem within a Markov Chain Monte Carlo framework and a particle filter is incorporated for propagating sample distributions over time. Numerous experiments demonstrate the effectiveness of the proposed tracking algorithm in indoor and outdoor environments where the target objects undergo large pose and lighting changes. 1

6 0.55882418 16 nips-2004-Adaptive Discriminative Generative Model and Its Applications

7 0.5586375 47 nips-2004-Contextual Models for Object Detection Using Boosted Random Fields

8 0.53036898 192 nips-2004-The power of feature clustering: An application to object detection

9 0.50199145 13 nips-2004-A Three Tiered Approach for Articulated Object Action Modeling and Recognition

10 0.49955937 182 nips-2004-Synergistic Face Detection and Pose Estimation with Energy-Based Models

11 0.47262996 205 nips-2004-Who's In the Picture

12 0.43934608 191 nips-2004-The Variational Ising Classifier (VIC) Algorithm for Coherently Contaminated Data

13 0.43573746 77 nips-2004-Hierarchical Clustering of a Mixture Model

14 0.42391726 23 nips-2004-Analysis of a greedy active learning strategy

15 0.42374304 186 nips-2004-The Correlated Correspondence Algorithm for Unsupervised Registration of Nonrigid Surfaces

16 0.39056516 134 nips-2004-Object Classification from a Single Example Utilizing Class Relevance Metrics

17 0.38900009 85 nips-2004-Instance-Based Relevance Feedback for Image Retrieval

18 0.38875517 195 nips-2004-Trait Selection for Assessing Beef Meat Quality Using Non-linear SVM

19 0.3821207 15 nips-2004-Active Learning for Anomaly and Rare-Category Detection

20 0.3786931 91 nips-2004-Joint Tracking of Pose, Expression, and Texture using Conditionally Gaussian Filters


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(13, 0.076), (15, 0.155), (17, 0.017), (24, 0.298), (26, 0.062), (31, 0.034), (33, 0.181), (35, 0.022), (39, 0.017), (50, 0.028), (71, 0.022)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.8206687 40 nips-2004-Common-Frame Model for Object Recognition

Author: Pierre Moreels, Pietro Perona

Abstract: A generative probabilistic model for objects in images is presented. An object consists of a constellation of features. Feature appearance and pose are modeled probabilistically. Scene images are generated by drawing a set of objects from a given database, with random clutter sprinkled on the remaining image surface. Occlusion is allowed. We study the case where features from the same object share a common reference frame. Moreover, parameters for shape and appearance densities are shared across features. This is to be contrasted with previous work on probabilistic ‘constellation’ models where features depend on each other, and each feature and model have different pose and appearance statistics [1, 2]. These two differences allow us to build models containing hundreds of features, as well as to train each model from a single example. Our model may also be thought of as a probabilistic revisitation of Lowe’s model [3, 4]. We propose an efficient entropy-minimization inference algorithm that constructs the best interpretation of a scene as a collection of objects and clutter. We test our ideas with experiments on two image databases. We compare with Lowe’s algorithm and demonstrate better performance, in particular in presence of large amounts of background clutter.

2 0.80195493 26 nips-2004-At the Edge of Chaos: Real-time Computations and Self-Organized Criticality in Recurrent Neural Networks

Author: Nils Bertschinger, Thomas Natschläger, Robert A. Legenstein

Abstract: In this paper we analyze the relationship between the computational capabilities of randomly connected networks of threshold gates in the timeseries domain and their dynamical properties. In particular we propose a complexity measure which we find to assume its highest values near the edge of chaos, i.e. the transition from ordered to chaotic dynamics. Furthermore we show that the proposed complexity measure predicts the computational capabilities very well: only near the edge of chaos are such networks able to perform complex computations on time series. Additionally a simple synaptic scaling rule for self-organized criticality is presented and analyzed. 1

3 0.69236666 9 nips-2004-A Method for Inferring Label Sampling Mechanisms in Semi-Supervised Learning

Author: Saharon Rosset, Ji Zhu, Hui Zou, Trevor J. Hastie

Abstract: We consider the situation in semi-supervised learning, where the “label sampling” mechanism stochastically depends on the true response (as well as potentially on the features). We suggest a method of moments for estimating this stochastic dependence using the unlabeled data. This is potentially useful for two distinct purposes: a. As an input to a supervised learning procedure which can be used to “de-bias” its results using labeled data only and b. As a potentially interesting learning task in itself. We present several examples to illustrate the practical usefulness of our method.

4 0.66134977 53 nips-2004-Discriminant Saliency for Visual Recognition from Cluttered Scenes

Author: Dashan Gao, Nuno Vasconcelos

Abstract: Saliency mechanisms play an important role when visual recognition must be performed in cluttered scenes. We propose a computational definition of saliency that deviates from existing models by equating saliency to discrimination. In particular, the salient attributes of a given visual class are defined as the features that enable best discrimination between that class and all other classes of recognition interest. It is shown that this definition leads to saliency algorithms of low complexity, that are scalable to large recognition problems, and is compatible with existing models of early biological vision. Experimental results demonstrating success in the context of challenging recognition problems are also presented. 1

5 0.65505844 16 nips-2004-Adaptive Discriminative Generative Model and Its Applications

Author: Ruei-sung Lin, David A. Ross, Jongwoo Lim, Ming-Hsuan Yang

Abstract: This paper presents an adaptive discriminative generative model that generalizes the conventional Fisher Linear Discriminant algorithm and renders a proper probabilistic interpretation. Within the context of object tracking, we aim to find a discriminative generative model that best separates the target from the background. We present a computationally efficient algorithm to constantly update this discriminative model as time progresses. While most tracking algorithms operate on the premise that the object appearance or ambient lighting condition does not significantly change as time progresses, our method adapts a discriminative generative model to reflect appearance variation of the target and background, thereby facilitating the tracking task in ever-changing environments. Numerous experiments show that our method is able to learn a discriminative generative model for tracking target objects undergoing large pose and lighting changes.

6 0.65471512 14 nips-2004-A Topographic Support Vector Machine: Classification Using Local Label Configurations

7 0.65239108 58 nips-2004-Edge of Chaos Computation in Mixed-Mode VLSI - A Hard Liquid

8 0.6506868 118 nips-2004-Methods for Estimating the Computational Power and Generalization Capability of Neural Microcircuits

9 0.65018785 178 nips-2004-Support Vector Classification with Input Data Uncertainty

10 0.64897639 133 nips-2004-Nonparametric Transforms of Graph Kernels for Semi-Supervised Learning

11 0.64894307 68 nips-2004-Face Detection --- Efficient and Rank Deficient

12 0.64810652 189 nips-2004-The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees

13 0.64708108 110 nips-2004-Matrix Exponential Gradient Updates for On-line Learning and Bregman Projection

14 0.64597809 167 nips-2004-Semi-supervised Learning with Penalized Probabilistic Clustering

15 0.64571929 4 nips-2004-A Generalized Bradley-Terry Model: From Group Competition to Individual Skill

16 0.64563185 131 nips-2004-Non-Local Manifold Tangent Learning

17 0.64462179 93 nips-2004-Kernel Projection Machine: a New Tool for Pattern Recognition

18 0.64452732 174 nips-2004-Spike Sorting: Bayesian Clustering of Non-Stationary Data

19 0.64443839 187 nips-2004-The Entire Regularization Path for the Support Vector Machine

20 0.64442599 31 nips-2004-Blind One-microphone Speech Separation: A Spectral Learning Approach