nips nips2004 nips2004-47 knowledge-graph by maker-knowledge-mining

47 nips-2004-Contextual Models for Object Detection Using Boosted Random Fields


Source: pdf

Author: Antonio Torralba, Kevin P. Murphy, William T. Freeman

Abstract: We seek to both detect and segment objects in images. To exploit both local image data as well as contextual information, we introduce Boosted Random Fields (BRFs), which uses Boosting to learn the graph structure and local evidence of a conditional random field (CRF). The graph structure is learned by assembling graph fragments in an additive model. The connections between individual pixels are not very informative, but by using dense graphs, we can pool information from large regions of the image; dense models also support efficient inference. We show how contextual information from other objects can improve detection performance, both in terms of accuracy and speed, by using a computational cascade. We apply our system to detect stuff and things in office and street scenes. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Contextual models for object detection using boosted random fields. Antonio Torralba, MIT CSAIL, Cambridge, MA 02139, torralba@mit.edu. [sent-1, score-0.22]

2 Abstract: We seek to both detect and segment objects in images. [sent-7, score-0.144]

3 To exploit both local image data as well as contextual information, we introduce Boosted Random Fields (BRFs), which uses Boosting to learn the graph structure and local evidence of a conditional random field (CRF). [sent-8, score-0.431]

4 The graph structure is learned by assembling graph fragments in an additive model. [sent-9, score-0.231]

5 The connections between individual pixels are not very informative, but by using dense graphs, we can pool information from large regions of the image; dense models also support efficient inference. [sent-10, score-0.104]

6 We show how contextual information from other objects can improve detection performance, both in terms of accuracy and speed, by using a computational cascade. [sent-11, score-0.287]

7 We apply our system to detect stuff and things in office and street scenes. [sent-12, score-0.252]

8 1 Introduction Our long-term goal is to build a vision system that can examine an image and describe what objects are in it, and where. [sent-13, score-0.192]

9 In Fig. 5(a), objects of interest, such as the keyboard or mouse, are so small that they are impossible to detect just by using local features. [sent-15, score-0.377]

10 Murphy et al [9] used global scene context to help object recognition, but did not model relationships between objects. [sent-18, score-0.093]

11 Fink and Perona [4] exploited local dependencies in a boosting framework, but did not allow for multiple rounds of communication between correlated objects. [sent-19, score-0.361]

12 He et al [6] do not model connections between objects directly, but rather they induce such correlations indirectly, via a bank of hidden variables, using a “restricted Boltzmann machine” architecture. [sent-20, score-0.144]

13 In this paper, we exploit contextual correlations between the object classes by introducing Boosted Random Fields (BRFs). [sent-21, score-0.217]

14 Boosted random fields build on both boosting [5, 10] and conditional random fields (CRFs) [8, 7, 6]. [sent-22, score-0.299]

15 Boosting is a simple way of sequentially constructing “strong” classifiers from “weak” components, and has been used for single-class object detection with great success [12]. [sent-23, score-0.169]

16 Dietterich et al [3] combine boosting and 1D CRFs, but they only consider the problem of learning the local evidence potentials; we consider the much harder problem of learning the structure of a 2D CRF. [sent-24, score-0.374]

17 We propose a method for learning densely connected random fields with long range connections. [sent-27, score-0.101]

18 The topology of these connections is chosen by a weak learner which has access to a library of graph fragments, derived from patches of labeled training images, which reflect typical spatial arrangements of objects (similar to the segmentation fragments in [2]). [sent-28, score-0.599]

19 At each round of the learning algorithm, we add more connections from other locations in the image and from other classes (detectors). [sent-29, score-0.22]

20 The connections are assumed to be spatially invariant, which means this update can be performed using convolution followed by a sigmoid nonlinearity. [sent-30, score-0.101]

21 In addition to recognizing things, such as cars and people, we are also interested in recognizing spatially extended “stuff” [1], such as roads and buildings. [sent-32, score-0.129]

22 The traditional sliding window approach to object detection does not work well for detecting “stuff”. [sent-33, score-0.2]

23 Instead, we combine object detection and image segmentation (c. [sent-34, score-0.306]

24 We do not rely on a bottom-up image segmentation algorithm, which can be fragile without top-down guidance. [sent-37, score-0.137]

25 2 Learning potentials and graph structure. A conditional random field (CRF) is a distribution of the form $P(S|x) = \frac{1}{Z} \prod_i \phi_i(S_i) \prod_{j \in N_i} \psi_{i,j}(S_i, S_j)$, where x is the input (e. [sent-38, score-0.182]

26 Our goal is to learn the local evidence potentials, φi , the compatibility potentials ψ, and the set of neighbors Ni . [sent-42, score-0.296]

27 We propose the following simple approximation: use belief propagation (BP) to estimate the marginals, P (Si |x), and then use boosting to maximize the likelihood of each node’s training data with respect to φi and ψ. [sent-43, score-0.331]

28 If we assume that the local potentials have the form $\phi_i^t(s_i) = [e^{F_i^t/2}; e^{-F_i^t/2}]$, where $F_i^t$ is some function of the input data, then $b_i^t(+1) = \sigma(F_i^t + G_i^t)$, with $G_i^t = \log M_i^t(+1) - \log M_i^t(-1)$ (3), where $\sigma(u) = 1/(1+e^{-u})$ is the sigmoid function. [sent-51, score-0.538]
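A minimal sketch of this belief computation (Eq. 3), assuming the local evidence F and the contextual term G are already available as per-node arrays; the function names and shapes are illustrative, not from the paper's code:

```python
import numpy as np

def sigmoid(u):
    """Logistic function sigma(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + np.exp(-u))

def beliefs(F, G):
    """Belief that each node is in the +1 state: b(+1) = sigma(F + G).

    F : array of local-evidence scores F_i^t (one entry per node/pixel)
    G : array of contextual terms G_i^t = log M_i^t(+1) - log M_i^t(-1)
    """
    return sigmoid(F + G)

# For binary labels, b(-1) is simply 1 - b(+1).
```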

29 Input: a set of labeled pairs $\{x_{i,m}; S_{i,m}\}$, bound $T$. Output: local evidence functions $f_i^t(x)$ and message update functions $g_i^t(b_{N_i})$. [sent-55, score-0.45]

30 (a) Fit local potential $f_i^t(x_{i,m})$ by weighted LS to $Y_{i,m}^t = S_{i,m}(1 + e^{-S_{i,m}(F_i^t + G_{i,m}^t)})$. [sent-61, score-0.113]

31 (b) Fit compatibilities $g_i^t(b_{N_i,m}^{t-1})$ to $Y_{i,m}^t$ by weighted LS. [sent-62, score-0.193]

32 (c) Compute local potential $F_{i,m}^t = F_{i,m}^{t-1} + f_i^t(x_{i,m})$. (d) Compute compatibilities $G_{i,m}^t = \sum_{n=1}^{t} g_i^n(b_{N_i,m}^{t-1})$. (e) Update the beliefs $b_{i,m}^t = \sigma(F_{i,m}^t + G_{i,m}^t)$. (f) Update the weights $w_{i,m}^{t+1} = b_{i,m}^t(-1)\, b_{i,m}^t(+1)$. Figure 1: BRF training algorithm. [sent-63, score-1.681]
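A hedged sketch of the Figure 1 training loop, assuming two helper routines (`fit_local_stump`, `fit_compat_stump`) that perform the weighted least-squares fits of steps (a) and (b); these helpers, the array shapes, and the update ordering are simplifying assumptions for illustration:

```python
import numpy as np

def train_brf(x, S, fit_local_stump, fit_compat_stump, T):
    """Sketch of the BRF training loop of Figure 1 (binary labels S in {-1, +1}).

    x : (M, N, D) features for M samples, N nodes, D feature dims
    S : (M, N) labels
    fit_local_stump(x, Y, w)  -> callable f(x)   (weighted LS fit, step a)
    fit_compat_stump(b, Y, w) -> callable g(b)   (weighted LS fit, step b)
    """
    M, N = S.shape
    F = np.zeros((M, N))          # accumulated local evidence F_i
    G = np.zeros((M, N))          # accumulated contextual term G_i
    b = 0.5 * np.ones((M, N))     # initial beliefs b_i(+1)
    f_list, g_list = [], []
    for t in range(T):
        w = b * (1.0 - b)                         # weights w = b(+1) * b(-1)
        Y = S * (1.0 + np.exp(-S * (F + G)))      # logitBoost working response
        f = fit_local_stump(x, Y, w)              # (a) fit local potential
        g = fit_compat_stump(b, Y, w)             # (b) fit compatibility
        f_list.append(f); g_list.append(g)
        F = F + f(x)                              # (c) update local potential
        G = sum(gn(b) for gn in g_list)           # (d) sum of all learned g's
        b = 1.0 / (1.0 + np.exp(-(F + G)))        # (e) update beliefs
    return f_list, g_list
```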

33 We assume that the graph is very densely connected so that the information that one single node sends to another is so small that we can make the approximation $\mu_{k \to i}^{t+1}(+1)/\mu_{k \to i}^{t+1}(-1) \simeq 1$. [sent-64, score-0.243]

34 (This is a reasonable approximation in the case of images, where each node represents a single pixel; only when the influence of many pixels is taken into account will the messages become informative. [sent-65, score-0.165]

35 Therefore, we can write the beliefs at iteration t as a function of the local evidences and the beliefs at time t − 1: $b_i^t(+1) = \sigma(F_i^t(x_{i,m}) + G_i^t(b_m^{t-1}))$. [sent-67, score-0.85]

36 The key idea behind BRFs is to use boosting to learn the G functions, which approximately implement message passing in densely connected graphs. [sent-68, score-0.435]

37 2.1 Learning local evidence potentials. Defining $F_i^t(x_{i,m}) = F_i^{t-1}(x_{i,m}) + f_i^t(x_{i,m})$ as an additive model, where $x_{i,m}$ are the features of training sample m at node i, we can learn this function in a stagewise fashion by optimizing the second order Taylor expansion of Eq. [sent-71, score-0.539]

38 4 wrt $f_i^t$, as in logitBoost [5]: $\arg\min_{f_i^t} \log J_i^t \simeq \arg\min_{f_i^t} \sum_m w_{i,m}^t (Y_{i,m}^t - f_i^t(x_{i,m}))^2$ (7), where $Y_{i,m}^t = S_{i,m}(1 + e^{-S_{i,m}(F_i^t + G_{i,m}^t)})$. [sent-72, score-0.494]

39 In the case that the weak learner is a “regression stump”, $f_i(x) = a\,h(x) + b$, we can find the optimal a, b by solving a weighted least squares problem, with weights $w_{i,m}^t = b_i^t(-1)\, b_i^t(+1)$; we can find the best basis function h(x) by searching over all elements of a dictionary. [sent-73, score-1.002]
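A small sketch of this step: fitting a regression stump $f(x) = a\,h(x) + b$ by weighted least squares while searching a dictionary of candidate basis functions. The dictionary representation (a matrix of precomputed stump responses) is an assumption for illustration:

```python
import numpy as np

def fit_stump(H, Y, w):
    """Fit f(x) = a*h(x) + b by weighted least squares over a dictionary.

    H : (K, M) responses h_k(x_m) of K candidate basis functions
    Y : (M,)   working responses Y_m
    w : (M,)   per-sample weights
    Returns the best (k, a, b) minimizing sum_m w_m (Y_m - a*h_k(x_m) - b)^2.
    """
    best = None
    for k, h in enumerate(H):
        A = np.stack([h, np.ones_like(h)], axis=1)     # (M, 2) design matrix for [h, 1]
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(A * sw[:, None], Y * sw, rcond=None)
        a, b = coef
        err = np.sum(w * (Y - a * h - b) ** 2)
        if best is None or err < best[0]:
            best = (err, k, a, b)
    _, k, a, b = best
    return k, a, b
```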

40 2.2 Learning compatibility potentials and graph structure. In this section, we discuss how to learn the compatibility functions $\psi_{ij}$, and hence the structure of the graph. [sent-75, score-0.35]

41 Instead of learning the compatibility functions $\psi_{ij}$, we propose to learn the message update functions $g_i^t$ directly. [sent-76, score-0.097]

42 Input: a set of inputs $\{x_{i,m}\}$ and functions $f_i^t$, $g_i^t$. Output: set of beliefs $b_{i,m}$ and MAP estimates $S_{i,m}$. [sent-77, score-0.46]

43 From t = 1 to T, repeat: (a) update local evidences $F_{i,m}^t = F_{i,m}^{t-1} + f_i^t(x_{i,m})$; (b) update compatibilities $G_{i,m}^t = \sum_{n=1}^{t} g_i^n(b_{N_i,m}^{t-1})$; (c) compute current beliefs $b_{i,m}^t = \sigma(F_{i,m}^t + G_{i,m}^t)$. [sent-80, score-0.906]
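A hedged sketch of this inference loop (Figure 2), assuming the learned per-round functions $f^t$ and $g^t$ are available as callables; names and shapes are illustrative:

```python
import numpy as np

def run_brf(x, f_list, g_list, T):
    """Sketch of BRF inference: apply the learned f^t, g^t in sequence.

    x       : (N, D) features, one row per node/pixel
    f_list  : learned local-evidence functions f^1..f^T
    g_list  : learned compatibility functions g^1..g^T
    """
    N = x.shape[0]
    F = np.zeros(N)
    b = 0.5 * np.ones(N)                       # initial beliefs b(+1)
    for t in range(T):
        F = F + f_list[t](x)                   # (a) update local evidence
        G = sum(g(b) for g in g_list[:t + 1])  # (b) sum messages from learned g's
        b = 1.0 / (1.0 + np.exp(-(F + G)))     # (c) beliefs b = sigma(F + G)
    S = np.where(b > 0.5, 1, -1)               # MAP estimate per node
    return b, S
```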

44 We propose to use an additive model for $G_i^{t+1}$ as we did for learning F: $G_{i,m}^{t+1} = \sum_{n=1}^{t} g_i^n(b_m^t)$, where $b_m^t$ is a vector with the beliefs of all nodes in the graph at iteration t for the training sample m. [sent-84, score-0.859]

45 The weak learners $g_i^n(b_m^t)$ can be regression stumps with the form $g_i^n(b_m^t) = a\,\delta(w \cdot b_m^t > \theta) + b$, where a, b, θ are the parameters of the regression stump, and $w_i$ is a set of weights selected from a dictionary. [sent-85, score-0.77]

46 The weak learners $g_i^n(b_m^t)$ will also be linear functions. [sent-87, score-0.271]

47 Hence the belief update simplifies to $b_{i,m}^{t+1}(+1) = \sigma(\alpha_i \cdot b_m^t + \beta_i + F_{i,m})$, which is similar to the mean-field update equations. [sent-88, score-0.505]

48 The neighborhood $N_i$ over which we sum incoming messages is determined by the graph structure, which is encoded in the non-zero values of $\alpha_i^n$. [sent-89, score-0.166]

49 Each weak learner $g_i^t$ will compute a weighted combination of the beliefs of some subset of the nodes; this subset may change from iteration to iteration, and can be quite large. [sent-90, score-0.484]

50 At iteration t, we choose the weak learner $g_i^t$ so as to minimize $\log J_i^t(b^{t-1}) = \sum_m \log\left(1 + e^{-S_{i,m}\left(F_{i,m}^t + g_i^t(b_m^{t-1}) + \sum_{n=1}^{t-1} g_i^n(b_m^{t-1})\right)}\right)$, which reduces to a weighted least squares problem similar to Eq. [sent-91, score-0.458]

51 3 BRFs for multiclass object detection and segmentation. With the BRF training algorithm in hand, we describe our approach for multiclass object detection and region-labeling using densely connected BRFs. [sent-96, score-0.599]

52 3.1 Weak learners for detecting stuff and things. The square sliding window approach does not provide a natural way of working with irregular objects. [sent-98, score-0.256]

53 Using region labeling as an image representation allows dealing with irregular and extended objects (buildings, bookshelf, road, ... [sent-99, score-0.252]

54 Extended stuff [1] may be a very important source of contextual information for other objects. [sent-103, score-0.223]

55 (a) Examples from the dictionary of about 2000 patches and masks, $U_{x,y}$, $V_{x,y}$. [sent-104, score-0.129]

56 Figure 3: Examples of patches from the dictionary and an example of the segmentation obtained using boosting trained with patches from (a). [sent-110, score-0.524]

57 The weak learners we use for the local evidence potentials are based on the segmentation fragments proposed in [2]. [sent-111, score-0.509]

58 Specifically, we create a dictionary of about 2000 image patches U, chosen at random (but overlapping each object), plus a corresponding set of binary (in-class / out-of-class) image masks, V: see Fig. [sent-112, score-0.271]

59 At each round t, for each class c, and for each dictionary entry, we construct the following weak learner, whose output is a binary matrix of the same size as the image I: $v(I) = (((I \otimes U) > \theta) * V) > 0$ (9), where $\otimes$ represents normalized cross-correlation and $*$ represents convolution. [sent-114, score-0.367]

60 The intuition behind this is that I ⊗ U will produce peaks at image locations that contain this patch/template, and then convolving with V will superimpose the segmentation mask on top of the peaks. [sent-115, score-0.206]

61 To be able to detect objects at multiple scales, we first downsample the image to scale σ, compute v(I ↓ σ), and then upsample the result. [sent-117, score-0.215]

62 The final weak learner does this for multiple scales, ORs all the results together, and then takes a linear transformation. [sent-118, score-0.162]
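A hedged sketch of this weak learner (Eq. 9). It uses skimage's `match_template` as a stand-in for normalized cross-correlation and scipy for the mask convolution; the scale set, threshold, padding behavior, and the assumption that the template fits inside the rescaled image are all illustrative simplifications:

```python
import numpy as np
from scipy.signal import fftconvolve
from skimage.feature import match_template
from skimage.transform import rescale, resize

def patch_weak_learner(I, U, V, theta, scales=(1.0, 0.5, 0.25)):
    """Eq. 9: v(I) = (((I (x) U) > theta) * V) > 0, OR-ed over scales.

    I : 2-D grayscale image
    U : 2-D patch/template from the dictionary
    V : 2-D binary segmentation mask associated with U
    """
    out = np.zeros(I.shape, dtype=bool)
    for s in scales:
        Is = rescale(I, s, anti_aliasing=True)           # downsample to scale s
        ncc = match_template(Is, U, pad_input=True)      # normalized cross-correlation
        peaks = ncc > theta                              # template detections
        seg = fftconvolve(peaks.astype(float), V, mode='same') > 0  # superimpose mask
        out |= resize(seg.astype(float), I.shape) > 0.5  # upsample and OR over scales
    return out
```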

63 Fig. 3(c) shows an example of segmentation obtained by using boosting without context. [sent-120, score-0.339]

64 The weak learners we use for the compatibility functions have a similar form: $g_c(b) = \alpha \sum_{c'=1}^{C} b_{c'} * W_{c'} + \beta$ (11), where $b_{c'}$ is the image formed by the beliefs at all pixels for class c′. [sent-121, score-0.541]

65 The binary kernels (graph fragments) W define, for each node x, y of object class c, all the nodes from which it will receive messages. [sent-124, score-0.172]
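A minimal sketch of applying this compatibility weak learner (Eq. 11): each per-class belief map is convolved with its graph-fragment kernel and the results are summed. Names and shapes are illustrative assumptions:

```python
import numpy as np
from scipy.signal import fftconvolve

def compat_weak_learner(beliefs, kernels, alpha, beta):
    """Eq. 11: g_c(b) = alpha * sum_{c'} b_{c'} * W_{c'} + beta.

    beliefs : list of C belief maps b_{c'} (2-D arrays, one per class)
    kernels : list of C binary graph-fragment kernels W_{c'}
    """
    msg = np.zeros_like(beliefs[0])
    for b_c, W_c in zip(beliefs, kernels):
        msg += fftconvolve(b_c, W_c, mode='same')  # pool beliefs over the fragment's support
    return alpha * msg + beta
```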

66 These kernels are chosen by sampling patches of various sizes from the labeling of images from the training set. [sent-125, score-0.178]

67 This allows generating complicated patterns of connectivity that reflect the statistics of object co-occurrences in the training set. [sent-126, score-0.123]

68 The overall incoming message is given by adding the kernels obtained at each boosting round. [sent-127, score-0.375]

69 (This is the key difference from mutual boosting [4], where the incoming message is just the output of a single weak learner; thus, in mutual boosting, previously learned inter-class connections are only used once. [sent-128, score-0.563]

70 ) Although it would seem to take O(t) time to compute $G^t$, we can precompute a single equivalent kernel $W'$, so at runtime the overall complexity is still linear in the number of boosting rounds, O(T). [sent-129, score-0.301]

71 $G_{x,y,c}^t = \sum_{c'=1}^{C} b_{c'} * \sum_{n=1}^{t} \alpha_n W_{c'}^n + \sum_{n=1}^{t} \beta_n \;\overset{\mathrm{def}}{=}\; \sum_{c'=1}^{C} b_{c'} * W'_{c'} + \beta'$. (Figure 4a: incoming messages to a car node.) [sent-130, score-1.92]
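A sketch of collapsing the per-round kernels into a single equivalent kernel $W'_{c'}$, as in the definition above; the zero-padding (without proper centering) and the list layout are simplifying assumptions:

```python
import numpy as np

def equivalent_kernel(round_kernels, round_alphas, size):
    """Collapse the per-round kernels alpha_n * W^n into a single W' for one source class.

    round_kernels : list over boosting rounds of 2-D kernels W^n (one source class)
    round_alphas  : matching list of scalars alpha_n
    size          : (h, w) common support; kernels are zero-padded to it
    """
    W_eq = np.zeros(size)
    for W, a in zip(round_kernels, round_alphas):
        pad = np.zeros(size)
        h, w = W.shape
        pad[:h, :w] = W          # zero-padding; proper centering omitted in this sketch
        W_eq += a * pad
    return W_eq

# At run time, G for class c is then one convolution of b_{c'} with its W' per
# source class, plus the accumulated bias beta' = sum_n beta_n.
```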

72 (Figure 4: beliefs shown at t = 1, 2, 4, 20; panel (c): a car out of context (outside 3rd-floor windows) is less of a car.) [sent-132, score-0.413]

73 (Figure 4d: evolution of the beliefs for the car nodes (b) and labeling (S) for road, building, car, with the final labeling at t = 40.) [sent-133, score-0.462]

74 The BRF is trained to detect cars, buildings and the road. [sent-135, score-0.119]

75 In Fig. 4(a-b), we show the structures of the graph and the weights $W'$ defined by $G^T$ for a BRF trained to detect cars, buildings and roads in street scenes. [sent-137, score-0.263]

76 3.2 Learning and inference. For training we used a labeled dataset of office and street scenes with about 100 images in each set. [sent-139, score-0.146]

77 During the training, in the first 5 rounds we only update the local potentials, to allow local evidence to accrue. [sent-140, score-0.233]

78 After the 5th iteration we also start updating the compatibility functions. [sent-141, score-0.144]

79 At each round, we update only the local potential and compatibility function associated with the single object class that most reduces the multiclass cost. [sent-142, score-0.346]

80 This allows objects that need many features to have more complicated local potentials. [sent-143, score-0.138]

81 The easy-to-detect objects can then pass information to the harder ones. [sent-145, score-0.087]

82 A similar behavior is obtained for the car detector (Fig. [sent-149, score-0.137]

83 The detection of building and road provides strong constraints for the locations of the car. [sent-151, score-0.271]

84 3.3 Cascade of classifiers with BRFs. The BRF can be turned into a cascade [12] by thresholding the beliefs. [sent-153, score-0.094]

85 At each round we update a binary rejection mask for each object class, $R_{x,y,c}^t$, by thresholding the beliefs at round t: $R_{x,y,c}^t = R_{x,y,c}^{t-1}\, \delta(b_{x,y,c}^t > \theta_c^t)$. [sent-155, score-0.615]

86 A pixel in the rejection mask is set to zero when we can decide that the object is not present (when $b_{x,y,c}^t$ is below the threshold $\theta_c^t \simeq 0$), and it is set to 1 when more processing is required. [sent-156, score-0.613]

87 Similarly, we can define a detection mask that will indicate pixels in which we decide the object is present. [sent-158, score-0.285]

88 The mask is then used for computing the features v(I) and messages G by applying the convolutions only on the pixels not yet classified. [sent-159, score-0.214]
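A small sketch of these cascade masks for one class at one round; the threshold values and the representation of the masks are illustrative assumptions:

```python
import numpy as np

def update_cascade_masks(b, reject, detect, theta_low=0.01, theta_high=0.99):
    """Update per-pixel rejection/detection masks for one class at round t.

    b      : 2-D belief map b^t_{x,y,c}
    reject : 2-D {0,1} mask; 0 marks pixels already rejected
    detect : 2-D {0,1} mask; 1 marks pixels already accepted
    """
    reject = reject * (b > theta_low)            # R^t = R^{t-1} * delta(b^t > theta_c)
    detect = np.logical_or(detect, b > theta_high).astype(reject.dtype)
    active = (reject > 0) & (detect == 0)        # only these pixels need further convolutions
    return reject, detect, active
```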

89 In this desk scene, it is easy to identify objects like the screen, keyboard and mouse, even though the local information is sometimes insufficient. [sent-163, score-0.32]

90 Middle: the evolution of the beliefs (b and F and G) during detection for a test image. [sent-164, score-0.27]

91 The graph below shows the average evolution of the area under the ROC for the three objects on 120 test images. [sent-166, score-0.174]

92 In Fig. 6 we compare the reduction of the search space when implementing a cascade using independent boosting (which reduces to Viola and Jones [12]), and when using BRFs. [sent-169, score-0.394]

93 We see that for objects for which context is the main source of information, like the mouse, the reduction in search space is much more dramatic using BRFs than using boosting alone. [sent-170, score-0.389]

94 4 Conclusion. The proposed BRF algorithm combines boosting and CRFs, providing an algorithm that is easy both to train and to apply at inference time. [sent-171, score-0.303]

95 We have demonstrated object detection in cluttered scenes by exploiting contextual relationships between objects. [sent-172, score-0.328]

96 The BRF algorithm is computationally efficient and provides a natural extension of the cascade of classifiers by integrating evidence from other objects in order to quickly reject certain image regions. [sent-173, score-0.271]

97 The BRF’s densely connected graphs, which efficiently collect information over large image regions, provide an alternative framework to nearest-neighbor grids for vision problems. [sent-174, score-0.206]

98 Figure 6 (x-axis: false alarm rate): contextual information reduces the search space in the framework of a cascade and improves performance. [sent-183, score-0.171]

99 Discriminative random fields: A discriminative framework for contextual interaction in classification. [sent-229, score-0.124]

100 Using the forest to see the trees: a graphical model relating features, objects and scenes. [sent-244, score-0.087]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('bt', 0.389), ('brf', 0.374), ('boosting', 0.273), ('gt', 0.264), ('mouse', 0.198), ('sk', 0.189), ('fit', 0.185), ('keyboard', 0.182), ('beliefs', 0.165), ('road', 0.138), ('car', 0.137), ('brfs', 0.125), ('screen', 0.124), ('contextual', 0.124), ('gi', 0.11), ('weak', 0.104), ('stuff', 0.099), ('potentials', 0.098), ('compatibility', 0.097), ('object', 0.093), ('round', 0.092), ('ni', 0.088), ('objects', 0.087), ('compatibilities', 0.083), ('jit', 0.083), ('fragments', 0.083), ('detection', 0.076), ('densely', 0.073), ('dictionary', 0.073), ('image', 0.071), ('mask', 0.069), ('messages', 0.067), ('labeling', 0.066), ('segmentation', 0.066), ('cascade', 0.063), ('fi', 0.062), ('buildings', 0.062), ('message', 0.061), ('graph', 0.058), ('learner', 0.058), ('elds', 0.057), ('learners', 0.057), ('building', 0.057), ('detect', 0.057), ('connections', 0.057), ('patches', 0.056), ('street', 0.055), ('local', 0.051), ('boosted', 0.051), ('node', 0.051), ('evidence', 0.05), ('alarm', 0.05), ('torralba', 0.05), ('pixels', 0.047), ('iteration', 0.047), ('si', 0.046), ('cars', 0.046), ('update', 0.044), ('stagewise', 0.042), ('things', 0.041), ('crfs', 0.041), ('crf', 0.041), ('murphy', 0.041), ('incoming', 0.041), ('bc', 0.04), ('rounds', 0.037), ('fink', 0.036), ('wc', 0.036), ('ce', 0.035), ('scenes', 0.035), ('vision', 0.034), ('stump', 0.033), ('evidences', 0.033), ('sends', 0.033), ('pixel', 0.033), ('multiclass', 0.032), ('additive', 0.032), ('thresholding', 0.031), ('sliding', 0.031), ('roads', 0.031), ('convolutions', 0.031), ('csail', 0.031), ('training', 0.03), ('xm', 0.03), ('evolution', 0.029), ('rejection', 0.029), ('masks', 0.029), ('search', 0.029), ('reduces', 0.029), ('connected', 0.028), ('graphs', 0.028), ('belief', 0.028), ('nodes', 0.028), ('runtime', 0.028), ('seeing', 0.028), ('irregular', 0.028), ('output', 0.027), ('recognizing', 0.026), ('conditional', 0.026), ('images', 0.026)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999958 47 nips-2004-Contextual Models for Object Detection Using Boosted Random Fields

Author: Antonio Torralba, Kevin P. Murphy, William T. Freeman

Abstract: We seek to both detect and segment objects in images. To exploit both local image data as well as contextual information, we introduce Boosted Random Fields (BRFs), which uses Boosting to learn the graph structure and local evidence of a conditional random field (CRF). The graph structure is learned by assembling graph fragments in an additive model. The connections between individual pixels are not very informative, but by using dense graphs, we can pool information from large regions of the image; dense models also support efficient inference. We show how contextual information from other objects can improve detection performance, both in terms of accuracy and speed, by using a computational cascade. We apply our system to detect stuff and things in office and street scenes. 1

2 0.15837646 19 nips-2004-An Application of Boosting to Graph Classification

Author: Taku Kudo, Eisaku Maeda, Yuji Matsumoto

Abstract: This paper presents an application of Boosting for classifying labeled graphs, general structures for modeling a number of real-world data, such as chemical compounds, natural language texts, and bio sequences. The proposal consists of i) decision stumps that use subgraph as features, and ii) a Boosting algorithm in which subgraph-based decision stumps are used as weak learners. We also discuss the relation between our algorithm and SVMs with convolution kernels. Two experiments using natural language data and chemical compounds show that our method achieves comparable or even better performance than SVMs with convolution kernels as well as improves the testing efficiency. 1

3 0.15343969 44 nips-2004-Conditional Random Fields for Object Recognition

Author: Ariadna Quattoni, Michael Collins, Trevor Darrell

Abstract: We present a discriminative part-based approach for the recognition of object classes from unsegmented cluttered scenes. Objects are modeled as flexible constellations of parts conditioned on local observations found by an interest operator. For each object class the probability of a given assignment of parts to local features is modeled by a Conditional Random Field (CRF). We propose an extension of the CRF framework that incorporates hidden variables and combines class conditional CRFs into a unified framework for part-based object recognition. The parameters of the CRF are estimated in a maximum likelihood framework and recognition proceeds by finding the most likely class under our model. The main advantage of the proposed CRF framework is that it allows us to relax the assumption of conditional independence of the observed data (i.e. local features) often used in generative approaches, an assumption that might be too restrictive for a considerable number of object classes.

4 0.131589 139 nips-2004-Optimal Aggregation of Classifiers and Boosting Maps in Functional Magnetic Resonance Imaging

Author: Vladimir Koltchinskii, Manel Martínez-ramón, Stefan Posse

Abstract: We study a method of optimal data-driven aggregation of classifiers in a convex combination and establish tight upper bounds on its excess risk with respect to a convex loss function under the assumption that the solution of optimal aggregation problem is sparse. We use a boosting type algorithm of optimal aggregation to develop aggregate classifiers of activation patterns in fMRI based on locally trained SVM classifiers. The aggregation coefficients are then used to design a ”boosting map” of the brain needed to identify the regions with most significant impact on classification. 1

5 0.11402566 189 nips-2004-The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees

Author: Ofer Dekel, Shai Shalev-shwartz, Yoram Singer

Abstract: Prediction suffix trees (PST) provide a popular and effective tool for tasks such as compression, classification, and language modeling. In this paper we take a decision theoretic view of PSTs for the task of sequence prediction. Generalizing the notion of margin to PSTs, we present an online PST learning algorithm and derive a loss bound for it. The depth of the PST generated by this algorithm scales linearly with the length of the input. We then describe a self-bounded enhancement of our learning algorithm which automatically grows a bounded-depth PST. We also prove an analogous mistake-bound for the self-bounded algorithm. The result is an efficient algorithm that neither relies on a-priori assumptions on the shape or maximal depth of the target PST nor does it require any parameters. To our knowledge, this is the first provably-correct PST learning algorithm which generates a bounded-depth PST while being competitive with any fixed PST determined in hindsight. 1

6 0.10153089 40 nips-2004-Common-Frame Model for Object Recognition

7 0.095219523 192 nips-2004-The power of feature clustering: An application to object detection

8 0.09013769 116 nips-2004-Message Errors in Belief Propagation

9 0.087970771 91 nips-2004-Joint Tracking of Pose, Expression, and Texture using Conditionally Gaussian Filters

10 0.084004 73 nips-2004-Generative Affine Localisation and Tracking

11 0.083603971 99 nips-2004-Learning Hyper-Features for Visual Identification

12 0.074542709 162 nips-2004-Semi-Markov Conditional Random Fields for Information Extraction

13 0.071899302 28 nips-2004-Bayesian inference in spiking neurons

14 0.066402353 134 nips-2004-Object Classification from a Single Example Utilizing Class Relevance Metrics

15 0.063076094 68 nips-2004-Face Detection --- Efficient and Rank Deficient

16 0.059892934 110 nips-2004-Matrix Exponential Gradient Updates for On-line Learning and Bregman Projection

17 0.059454713 32 nips-2004-Boosting on Manifolds: Adaptive Regularization of Base Classifiers

18 0.058843225 100 nips-2004-Learning Preferences for Multiclass Problems

19 0.057677485 89 nips-2004-Joint MRI Bias Removal Using Entropy Minimization Across Images

20 0.057534866 144 nips-2004-Parallel Support Vector Machines: The Cascade SVM


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.186), (1, 0.023), (2, 0.005), (3, -0.076), (4, 0.206), (5, 0.03), (6, 0.11), (7, -0.006), (8, -0.127), (9, -0.0), (10, -0.115), (11, -0.096), (12, 0.021), (13, 0.06), (14, -0.004), (15, -0.061), (16, 0.107), (17, 0.048), (18, 0.129), (19, 0.025), (20, 0.134), (21, -0.071), (22, -0.051), (23, 0.055), (24, -0.037), (25, 0.063), (26, 0.036), (27, -0.083), (28, -0.064), (29, -0.116), (30, 0.042), (31, -0.058), (32, 0.005), (33, -0.142), (34, -0.083), (35, 0.067), (36, 0.0), (37, -0.169), (38, -0.039), (39, -0.032), (40, -0.023), (41, 0.108), (42, -0.095), (43, -0.081), (44, 0.084), (45, 0.036), (46, 0.098), (47, -0.105), (48, 0.19), (49, -0.012)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96197259 47 nips-2004-Contextual Models for Object Detection Using Boosted Random Fields

Author: Antonio Torralba, Kevin P. Murphy, William T. Freeman

Abstract: We seek to both detect and segment objects in images. To exploit both local image data as well as contextual information, we introduce Boosted Random Fields (BRFs), which uses Boosting to learn the graph structure and local evidence of a conditional random field (CRF). The graph structure is learned by assembling graph fragments in an additive model. The connections between individual pixels are not very informative, but by using dense graphs, we can pool information from large regions of the image; dense models also support efficient inference. We show how contextual information from other objects can improve detection performance, both in terms of accuracy and speed, by using a computational cascade. We apply our system to detect stuff and things in office and street scenes. 1

2 0.53080606 19 nips-2004-An Application of Boosting to Graph Classification

Author: Taku Kudo, Eisaku Maeda, Yuji Matsumoto

Abstract: This paper presents an application of Boosting for classifying labeled graphs, general structures for modeling a number of real-world data, such as chemical compounds, natural language texts, and bio sequences. The proposal consists of i) decision stumps that use subgraph as features, and ii) a Boosting algorithm in which subgraph-based decision stumps are used as weak learners. We also discuss the relation between our algorithm and SVMs with convolution kernels. Two experiments using natural language data and chemical compounds show that our method achieves comparable or even better performance than SVMs with convolution kernels as well as improves the testing efficiency. 1

3 0.48205718 139 nips-2004-Optimal Aggregation of Classifiers and Boosting Maps in Functional Magnetic Resonance Imaging

Author: Vladimir Koltchinskii, Manel Martínez-ramón, Stefan Posse

Abstract: We study a method of optimal data-driven aggregation of classifiers in a convex combination and establish tight upper bounds on its excess risk with respect to a convex loss function under the assumption that the solution of optimal aggregation problem is sparse. We use a boosting type algorithm of optimal aggregation to develop aggregate classifiers of activation patterns in fMRI based on locally trained SVM classifiers. The aggregation coefficients are then used to design a ”boosting map” of the brain needed to identify the regions with most significant impact on classification. 1

4 0.47979486 44 nips-2004-Conditional Random Fields for Object Recognition

Author: Ariadna Quattoni, Michael Collins, Trevor Darrell

Abstract: We present a discriminative part-based approach for the recognition of object classes from unsegmented cluttered scenes. Objects are modeled as flexible constellations of parts conditioned on local observations found by an interest operator. For each object class the probability of a given assignment of parts to local features is modeled by a Conditional Random Field (CRF). We propose an extension of the CRF framework that incorporates hidden variables and combines class conditional CRFs into a unified framework for part-based object recognition. The parameters of the CRF are estimated in a maximum likelihood framework and recognition proceeds by finding the most likely class under our model. The main advantage of the proposed CRF framework is that it allows us to relax the assumption of conditional independence of the observed data (i.e. local features) often used in generative approaches, an assumption that might be too restrictive for a considerable number of object classes.

5 0.46915734 40 nips-2004-Common-Frame Model for Object Recognition

Author: Pierre Moreels, Pietro Perona

Abstract: A generative probabilistic model for objects in images is presented. An object consists of a constellation of features. Feature appearance and pose are modeled probabilistically. Scene images are generated by drawing a set of objects from a given database, with random clutter sprinkled on the remaining image surface. Occlusion is allowed. We study the case where features from the same object share a common reference frame. Moreover, parameters for shape and appearance densities are shared across features. This is to be contrasted with previous work on probabilistic ‘constellation’ models where features depend on each other, and each feature and model have different pose and appearance statistics [1, 2]. These two differences allow us to build models containing hundreds of features, as well as to train each model from a single example. Our model may also be thought of as a probabilistic revisitation of Lowe’s model [3, 4]. We propose an efficient entropy-minimization inference algorithm that constructs the best interpretation of a scene as a collection of objects and clutter. We test our ideas with experiments on two image databases. We compare with Lowe’s algorithm and demonstrate better performance, in particular in presence of large amounts of background clutter.

6 0.45490673 192 nips-2004-The power of feature clustering: An application to object detection

7 0.40314516 99 nips-2004-Learning Hyper-Features for Visual Identification

8 0.35715672 162 nips-2004-Semi-Markov Conditional Random Fields for Information Extraction

9 0.35565099 91 nips-2004-Joint Tracking of Pose, Expression, and Texture using Conditionally Gaussian Filters

10 0.3467108 141 nips-2004-Optimal sub-graphical models

11 0.34467849 73 nips-2004-Generative Affine Localisation and Tracking

12 0.33818278 116 nips-2004-Message Errors in Belief Propagation

13 0.33738664 191 nips-2004-The Variational Ising Classifier (VIC) Algorithm for Coherently Contaminated Data

14 0.33686247 61 nips-2004-Efficient Out-of-Sample Extension of Dominant-Set Clusters

15 0.33125073 89 nips-2004-Joint MRI Bias Removal Using Entropy Minimization Across Images

16 0.32412514 199 nips-2004-Using Machine Learning to Break Visual Human Interaction Proofs (HIPs)

17 0.31361866 182 nips-2004-Synergistic Face Detection and Pose Estimation with Energy-Based Models

18 0.28926891 207 nips-2004-ℓ₀-norm Minimization for Basis Selection

19 0.28847766 68 nips-2004-Face Detection --- Efficient and Rank Deficient

20 0.28533676 25 nips-2004-Assignment of Multiplicative Mixtures in Natural Images


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(13, 0.072), (15, 0.139), (17, 0.035), (21, 0.019), (26, 0.046), (31, 0.018), (33, 0.187), (35, 0.022), (50, 0.026), (53, 0.02), (56, 0.014), (67, 0.276), (87, 0.014), (88, 0.018)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.86806661 200 nips-2004-Using Random Forests in the Structured Language Model

Author: Peng Xu, Frederick Jelinek

Abstract: In this paper, we explore the use of Random Forests (RFs) in the structured language model (SLM), which uses rich syntactic information in predicting the next word based on words already seen. The goal in this work is to construct RFs by randomly growing Decision Trees (DTs) using syntactic information and investigate the performance of the SLM modeled by the RFs in automatic speech recognition. RFs, which were originally developed as classifiers, are a combination of decision tree classifiers. Each tree is grown based on random training data sampled independently and with the same distribution for all trees in the forest, and a random selection of possible questions at each node of the decision tree. Our approach extends the original idea of RFs to deal with the data sparseness problem encountered in language modeling. RFs have been studied in the context of n-gram language modeling and have been shown to generalize well to unseen data. We show in this paper that RFs using syntactic information can also achieve better performance in both perplexity (PPL) and word error rate (WER) in a large vocabulary speech recognition system, compared to a baseline that uses Kneser-Ney smoothing. 1

same-paper 2 0.82737601 47 nips-2004-Contextual Models for Object Detection Using Boosted Random Fields

Author: Antonio Torralba, Kevin P. Murphy, William T. Freeman

Abstract: We seek to both detect and segment objects in images. To exploit both local image data as well as contextual information, we introduce Boosted Random Fields (BRFs), which uses Boosting to learn the graph structure and local evidence of a conditional random field (CRF). The graph structure is learned by assembling graph fragments in an additive model. The connections between individual pixels are not very informative, but by using dense graphs, we can pool information from large regions of the image; dense models also support efficient inference. We show how contextual information from other objects can improve detection performance, both in terms of accuracy and speed, by using a computational cascade. We apply our system to detect stuff and things in office and street scenes. 1

3 0.66003221 130 nips-2004-Newscast EM

Author: Wojtek Kowalczyk, Nikos A. Vlassis

Abstract: We propose a gossip-based distributed algorithm for Gaussian mixture learning, Newscast EM. The algorithm operates on network topologies where each node observes a local quantity and can communicate with other nodes in an arbitrary point-to-point fashion. The main difference between Newscast EM and the standard EM algorithm is that the M-step in our case is implemented in a decentralized manner: (random) pairs of nodes repeatedly exchange their local parameter estimates and combine them by (weighted) averaging. We provide theoretical evidence and demonstrate experimentally that, under this protocol, nodes converge exponentially fast to the correct estimates in each M-step of the EM algorithm. 1

4 0.65769821 167 nips-2004-Semi-supervised Learning with Penalized Probabilistic Clustering

Author: Zhengdong Lu, Todd K. Leen

Abstract: While clustering is usually an unsupervised operation, there are circumstances in which we believe (with varying degrees of certainty) that items A and B should be assigned to the same cluster, while items A and C should not. We would like such pairwise relations to influence cluster assignments of out-of-sample data in a manner consistent with the prior knowledge expressed in the training set. Our starting point is probabilistic clustering based on Gaussian mixture models (GMM) of the data distribution. We express clustering preferences in the prior distribution over assignments of data points to clusters. This prior penalizes cluster assignments according to the degree with which they violate the preferences. We fit the model parameters with EM. Experiments on a variety of data sets show that PPC can consistently improve clustering results.

5 0.65748018 133 nips-2004-Nonparametric Transforms of Graph Kernels for Semi-Supervised Learning

Author: Xiaojin Zhu, Jaz Kandola, Zoubin Ghahramani, John D. Lafferty

Abstract: We present an algorithm based on convex optimization for constructing kernels for semi-supervised learning. The kernel matrices are derived from the spectral decomposition of graph Laplacians, and combine labeled and unlabeled data in a systematic fashion. Unlike previous work using diffusion kernels and Gaussian random field kernels, a nonparametric kernel approach is presented that incorporates order constraints during optimization. This results in flexible kernels and avoids the need to choose among different parametric forms. Our approach relies on a quadratically constrained quadratic program (QCQP), and is computationally feasible for large datasets. We evaluate the kernels on real datasets using support vector machines, with encouraging results. 1

6 0.65696275 31 nips-2004-Blind One-microphone Speech Separation: A Spectral Learning Approach

7 0.65659082 99 nips-2004-Learning Hyper-Features for Visual Identification

8 0.6557914 68 nips-2004-Face Detection --- Efficient and Rank Deficient

9 0.65487289 161 nips-2004-Self-Tuning Spectral Clustering

10 0.6542964 178 nips-2004-Support Vector Classification with Input Data Uncertainty

11 0.65404481 61 nips-2004-Efficient Out-of-Sample Extension of Dominant-Set Clusters

12 0.65323818 189 nips-2004-The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees

13 0.653198 16 nips-2004-Adaptive Discriminative Generative Model and Its Applications

14 0.65314358 77 nips-2004-Hierarchical Clustering of a Mixture Model

15 0.65244377 179 nips-2004-Surface Reconstruction using Learned Shape Models

16 0.65230221 127 nips-2004-Neighbourhood Components Analysis

17 0.65171969 25 nips-2004-Assignment of Multiplicative Mixtures in Natural Images

18 0.6515739 174 nips-2004-Spike Sorting: Bayesian Clustering of Non-Stationary Data

19 0.65144545 53 nips-2004-Discriminant Saliency for Visual Recognition from Cluttered Scenes

20 0.65122211 182 nips-2004-Synergistic Face Detection and Pose Estimation with Energy-Based Models