iccv iccv2013 iccv2013-104 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ankit Gandhi, Karteek Alahari, C.V. Jawahar
Abstract: We aim to decompose a global histogram representation of an image into histograms of its associated objects and regions. This task is formulated as an optimization problem, given a set of linear classifiers, which can effectively discriminate the object categories present in the image. Our decomposition bypasses harder problems associated with accurately localizing and segmenting objects. We evaluate our method on a wide variety of composite histograms, and also compare it with MRF-based solutions. In addition to merely measuring the accuracy of decomposition, we also show the utility of the estimated object and background histograms for the task of image classification on the PASCAL VOC 2007 dataset.
Reference: text
sentIndex sentText sentNum sentScore
1 Affiliations: (1) CVIT, IIIT Hyderabad, India; (2) Inria, France. Abstract: We aim to decompose a global histogram representation of an image into histograms of its associated objects and regions. [sent-3, score-0.565]
2 Our decomposition bypasses harder problems associated with accurately localizing and segmenting objects. [sent-5, score-0.298]
3 In addition to merely measuring the accuracy of decomposition, we also show the utility of the estimated object and background histograms for the task of image classification on the PASCAL VOC 2007 dataset. [sent-7, score-0.465]
4 We are interested in obtaining the constituent histograms from a composite histogram. [sent-22, score-0.487]
5 Also, it has been observed that BoW histograms of single isolated objects are relatively easy to classify. [sent-27, score-0.299]
6 An important reason for this deterioration in performance is the fact that a classifier trained on single objects often fails to recognize the object when the global image representation (BoW) is “corrupted” by additional objects and clutter present in the image. [sent-33, score-0.333]
7 In this work, our aim is to decompose a global BoW histogram into multiple histograms corresponding to different categories present in the image, as shown in Figure 1. [sent-36, score-0.588]
8 We solve the problem by partitioning the image into regular cells and assigning weights that correspond to each of the categories. [sent-38, score-0.238]
9 Thus, the histogram of each of the categories in the image can be computed using a weighted sum of the cell histograms. [sent-40, score-0.359]
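The weighted-sum step described above can be illustrated with a minimal sketch. The function name, the flat list of cells, and the toy numbers are assumptions made for illustration; they are not taken from the paper.

```python
def weighted_histograms(cell_hists, weights):
    """Return one histogram per category as a weighted sum of cell histograms.

    cell_hists: list of per-cell histograms (lists of visual-word counts).
    weights:    weights[p][c] is the weight of cell c for category p
                (hypothetical layout, chosen for this sketch).
    """
    n_words = len(cell_hists[0])
    result = []
    for cat_weights in weights:
        hist = [0.0] * n_words
        for w, cell in zip(cat_weights, cell_hists):
            for v, count in enumerate(cell):
                hist[v] += w * count
        result.append(hist)
    return result


# Toy example: three cells, a vocabulary of three visual words, two categories.
cells = [[2, 0, 1], [0, 3, 1], [1, 1, 0]]
weights = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.0]]
hists = weighted_histograms(cells, weights)
```

In this toy case the first category receives all of cell 0 and half of cell 2, while the second category receives only cell 1.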
10 Histogram decomposition has many applications, and can be used in multiple settings to boost classification performance, as we show in the experiments section, both when a single category and when multiple categories are present in an image. [sent-44, score-0.336]
11 The decomposition can also be used for separating object and background histograms in an image. [sent-45, score-0.579]
12 We note that existing approaches for object detection and semantic segmentation can be adapted to solve the histogram decomposition problem. [sent-48, score-0.502]
13 This involves two steps: (i) Performing object detection or segmentation; and (ii) Computing the individual histograms for the classes using the bounding boxes or the segmentation masks obtained. [sent-49, score-0.521]
14 Many approaches have been proposed to overcome this by restricting the number of potential windows [15], segmenting the image [21], searching only the salient regions [1], sharing features across categories [31], and speeding up the individual classifiers [30]. [sent-52, score-0.216]
15 In essence, using detection or segmentation approaches for solving the histogram decomposition problem would be overkill. [sent-55, score-0.43]
16 They represent an image as a mixture of topics, and compute a histogram from a mixture of histograms corresponding to each topic. [sent-58, score-0.437]
17 This can be viewed as decomposing the image into object and background using an object detection method. [sent-65, score-0.234]
18 [25] on spatial saliency also partitions the image into regular cells and assigns weights to them. [sent-70, score-0.304]
19 Furthermore, it does not consider the spatial continuity of weights while assigning them to cells as we do. [sent-72, score-0.346]
20 We show how the formulation can be generalized to spatially-constrained decomposition in Section 3. [sent-78, score-0.279]
21 Our objective is then to decompose the histogram h into k constituent histograms represented by x1, . [sent-90, score-0.606]
22 To solve the histogram decomposition problem, we begin by partitioning the image into M × N regular rectangular cells. [sent-94, score-0.4]
23 Let hij denote the histogram computed independently for each cell. [sent-95, score-0.221]
24 We introduce a binary variable bipj ∈ {0, 1} for each cell to denote whether it is part of an object from the p-th category or not. [sent-96, score-0.528]
25 This problem can be solved in closed form by taking bipj to be 1 for the p that maximizes wp^T hij and 0 for all other p’s. [sent-107, score-0.366]
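A minimal sketch of this closed-form rule follows, assuming each cell is assigned to the category whose linear classifier scores the cell histogram highest. The function names and toy values are hypothetical, and ties are broken in favour of the lowest category index, which is a choice made here rather than something stated in the source.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def assign_cells(cell_hists, classifiers):
    """For every cell, return the index p of the highest-scoring classifier.

    This corresponds to setting the binary variable for that (cell, p) pair
    to 1 and all other categories to 0.
    """
    labels = []
    for h in cell_hists:
        scores = [dot(w, h) for w in classifiers]
        # max() returns the first maximal index, so ties go to the lowest p.
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels


cells = [[2, 0, 1], [0, 3, 1], [1, 1, 0]]
classifiers = [[1, -1, 0], [-1, 1, 0]]
labels = assign_cells(cells, classifiers)
```

Because every cell is labelled independently, nothing in this rule prevents background cells from being assigned to an object category or the cells of one object from being scattered.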
26 Footnote 1: Such histograms have been used successfully in the past [15]. [sent-109, score-0.461]
27 For instance, cells from sky or road may be labelled as part of other object categories such as bus or car. [sent-110, score-0.303]
28 Furthermore, object cells of a specific category may be scattered and spatially disconnected. [sent-111, score-0.362]
29 To make the formulation more realistic, we relax the assumption in (1) that all the cells are to be assigned to one of the k objects of interest. [sent-114, score-0.317]
30 ... Σp bipj ≤ 1, D: bipj ∈ {0, 1}, E2: bipj − bip,j+1 = λi,j+1, where γ is a regularization parameter. [sent-129, score-0.654]
31 The constraint A defines an object class category as a weighted sum of histograms from multiple cells, similar to (1). [sent-130, score-0.261]
32 The constraint B allows some of the cells to remain unlabelled. [sent-131, score-0.29]
33 The SVM classifiers used in this formulation do not include a bias term, but it can be easily incorporated by augmenting every histogram xp with 1. [sent-135, score-0.394]
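The bias-augmentation remark can be checked with a small numerical sketch. The weight vector, bias, and histogram below are made-up values used only to show that appending 1 to the histogram and the bias to the weights reproduces the biased score.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical classifier weights, bias, and histogram.
w = [0.5, -1.0, 2.0]
bias = 0.25
x = [4.0, 1.0, 0.5]

score_with_bias = dot(w, x) + bias
# Augment the histogram with a constant 1 and the weights with the bias.
score_augmented = dot(w + [bias], x + [1.0])
```

Both expressions evaluate to the same score, so a bias-free formulation loses no generality.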
34 Empirically, we found the effect of introducing bias negligible, as we are using unnormalized histograms hij. [sent-136, score-0.29]
35 We relax the constraint D as bipj ∈ [0, 1] and solve the resulting linear program (LP) relaxation. [sent-153, score-0.409]
36 The spatial extents of the individual constituent histograms can be obtained by rounding the bipj’s to their nearest integers. [sent-154, score-0.646]
37 The histograms of different categories in an image can be obtained directly using the solution of the LP relaxation, i.e., [sent-155, score-0.353]
38 taking the weighted sum of the cell histograms (LP-relax) or by first rounding off the solution to the nearest integer and then adding the corresponding cell histograms (LP-round). [sent-157, score-0.72]
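The difference between the two read-outs can be sketched as follows for a single category, assuming the fractional cell weights have already been obtained from an LP solver (the solve itself is not shown). The function names, the `threshold` parameter, and the rule that a weight of exactly 0.5 rounds up are assumptions made for this illustration.

```python
def lp_relax(cell_hists, b):
    """Soft assignment: weight every cell histogram by its fractional value."""
    n_words = len(cell_hists[0])
    hist = [0.0] * n_words
    for weight, cell in zip(b, cell_hists):
        for v, count in enumerate(cell):
            hist[v] += weight * count
    return hist


def lp_round(cell_hists, b, threshold=0.5):
    """Hard assignment: keep a cell only if its weight rounds to 1."""
    n_words = len(cell_hists[0])
    hist = [0] * n_words
    for weight, cell in zip(b, cell_hists):
        if weight >= threshold:  # assumed tie rule: 0.5 rounds up
            for v, count in enumerate(cell):
                hist[v] += count
    return hist


# Toy fractional weights for three cells.
cells = [[2, 0], [0, 4], [2, 2]]
b = [0.75, 0.25, 0.5]
soft = lp_relax(cells, b)
hard = lp_round(cells, b)
```

The soft version keeps partial contributions from uncertain cells, whereas the hard version commits each cell fully or not at all.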
39 An MRF-based solution: As noted earlier, the decomposition problem can be modelled as an MRF energy minimization problem. [sent-162, score-0.308]
40 ... equivalent to introducing binary variables bipj. [sent-169, score-0.353]
41 We observed (see Section 4) that most of the cells are assigned to the background in this solution. [sent-174, score-0.246]
42 Further, it focusses on obtaining integral solutions for segmenting an image, whereas the solution of the LP relaxation suffices for our histogram decomposition task. [sent-181, score-0.478]
43 Figure 2: Incorporation of spatial pyramid histograms into our LP formulation. [sent-182, score-0.319]
44 We show the sub-regions considered while computing the spatial histogram for an object category p. [sent-183, score-0.374]
45 In addition to the constraints mentioned here, we add constraints that r1p, r2p and r3p should contain an equal number of visual words, as they occupy the same area in the image, and a similar constraint for r4p, r5p, r6p and r7p. [sent-184, score-0.251]
46 Incorporating spatial information, such as in [16], by concatenating histograms from multiple subregions has often been shown to improve classification performance. [sent-187, score-0.401]
47 We introduce weak geometry constraints into the histograms, without affecting the linearity of the problem, inspired by the work on spatial histograms [16]. [sent-190, score-0.375]
48 The spatial region for an object category p is divided into 3×1, 2×2 and 1×1 grids, giving rise to a total of eight sub-regions, as shown in Figure 2, similar to [4]. [sent-191, score-0.244]
49 The final representation of an object is obtained by concatenating histograms of the eight sub-regions. [sent-192, score-0.385]
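A minimal sketch of the eight-sub-region concatenation is given below. It assumes that the 3×1 grid means three horizontal bands, that the sub-regions are concatenated in the order 3×1, 2×2, 1×1, and that cells are assigned to sub-regions by integer division of their grid index; the cell weights from the LP are ignored here. None of these details are confirmed by the source.

```python
def pyramid_histogram(cells, grids=((3, 1), (2, 2), (1, 1))):
    """Concatenate sub-region histograms over the given grids.

    cells[i][j] is the histogram (list of counts) of grid cell (i, j).
    Each grid is (rows, cols); the default yields 3 + 4 + 1 = 8 sub-regions.
    """
    rows, cols = len(cells), len(cells[0])
    n_words = len(cells[0][0])
    out = []
    for g_rows, g_cols in grids:
        regions = [[0] * n_words for _ in range(g_rows * g_cols)]
        for i in range(rows):
            for j in range(cols):
                # Map cell (i, j) to its sub-region index in this grid.
                r = (i * g_rows // rows) * g_cols + (j * g_cols // cols)
                for v, count in enumerate(cells[i][j]):
                    regions[r][v] += count
        for region in regions:
            out.extend(region)
    return out


# Toy object region: 6 x 2 cells, one visual word, one occurrence per cell.
toy_cells = [[[1] for _ in range(2)] for _ in range(6)]
pyr = pyramid_histogram(toy_cells)
```

In the toy example each pyramid level sums to the same total, and the sub-regions within a level hold equal counts because they cover equal areas.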
50 In this formulation involving Spatial Pyramid Matching (SPM), we simultaneously solve for bipj and sub-region histograms r1p, . [sent-197, score-0.708]
51 Constraint A1 defines the histogram of object p as the concatenation of eight sub-region histograms shown in Figure 2. [sent-239, score-0.568]
52 The constraint A2 represents the histogram in terms of its cell histograms. [sent-240, score-0.343]
53 Constraints A4 and A6 correspond to the conditions mentioned in Figure 2, that the sum of the sub-region histograms is equal to the histogram of class p. [sent-242, score-0.518]
54 Note that only weak spatial constraints for sub-region histograms have been considered in the above formulation, so as to keep our problem linear. [sent-243, score-0.375]
55 However, it encodes some spatial information, which results in a better decomposition of the global histogram. [sent-245, score-0.314]
56 We use SVMs trained on spatial histograms of tight bounding boxes around the object as classifiers (wp), computed by dividing the object bounding box into 3×1, 2×2 and 1×1 grids, as shown in the illustration in Figure 2. [sent-246, score-0.775]
57 , r7p approximately correspond to the histograms of subregions assumed. [sent-251, score-0.289]
58 Experiments and Results: We demonstrate the performance of our method for decomposing the global BoW histogram of an image into its constituent histograms in a variety of settings. [sent-253, score-0.64]
59 The scale of each object in the composite image is measured as the percentage of the composite histogram it contributes to. [sent-263, score-0.486]
60 The purpose of introducing this dataset was to study the sensitivity and robustness of our formulation especially when the object size in the image, and k, the number of categories considered in the objective function (2), vary. [sent-265, score-0.23]
61 Flickr-M1 has 196 positive images containing both bus & car, and Flickr-M2 has 209 positive images with both bus & bicycle in them. [sent-271, score-0.397]
62 We set P (from constraint C in (2)) to 50% of the total cells in an image. [sent-279, score-0.29]
63 We begin by evaluating the performance of our histogram decomposition method on the CALTECH dataset. [sent-286, score-0.4]
64 We follow both these approaches on BoW histograms of the CALTECH dataset. [sent-289, score-0.253]
65 We evaluate the performance of our decomposition by obtaining the mean AP over all the classes, when the constituent (category-level) histograms are passed to the respective SVM classifiers. [sent-291, score-0.588]
66 Table 1 also shows the mAP over all the 20 Caltech classes when the problem is solved using the LP formulation with the constraint C, i. [sent-306, score-0.237]
67 We use the entire composite histogram of an image for the BoW method, while for the CV approach we assign each cell independently to at most one object, and build the object histogram from histograms of cells that belong to the object. [sent-314, score-1.093]
68 Even when the scale of the object is small (10-30%), our method correctly discriminates the object histograms more than 63% of the time. [sent-321, score-0.397]
69 BoW uses the entire composite image histogram and CV uses histograms obtained via cell-based voting. [sent-326, score-0.552]
70 LP is solved for two classes bus and bicycle simultaneously. [sent-329, score-0.327]
71 The images and the corresponding weights obtained for their cells using the LP solution are shown. [sent-330, score-0.269]
72 The cells shown in red are weights of bus, while those in green are of bicycle. [sent-331, score-0.238]
73 Multiple object classification: In this experiment, we investigate how the presence of one object in an image can negatively affect the classification of others. [sent-336, score-0.246]
74 We consider the AP obtained when the entire histogram of an image is given to an SVM classifier (BoW) as the baseline. [sent-337, score-0.226]
75 Using the LP formulation proposed in Section 2, we split the image histogram into histograms of constituent objects, and the background/context. [sent-338, score-0.5]
76 Figure 5 shows the decomposition of the global histograms in a few examples containing bus and bicycle categories. [sent-339, score-0.736]
77 One approach to evaluate this decomposition is by using the constituent histograms directly in an object classifier. [sent-340, score-0.66]
78 Thus, we use the object-background feature representation of [23], where a histogram is represented by a concatenation of object and background histograms. [sent-342, score-0.324]
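The object-background representation can be sketched as a simple concatenation. The summary later mentions that the two histograms are normalized individually, but not which norm is used, so the L1 normalization below is an assumption, as are the function names and toy values.

```python
def l1_normalize(hist):
    """Scale a histogram to sum to 1 (left unchanged if it is all zeros)."""
    total = sum(hist)
    return [v / total for v in hist] if total else list(hist)


def object_background_feature(object_hist, background_hist):
    """Concatenate individually normalized object and background histograms."""
    return l1_normalize(object_hist) + l1_normalize(background_hist)


# Toy histograms over a three-word vocabulary.
feature = object_background_feature([1, 3, 0], [2, 2, 4])
```

The resulting feature has twice the vocabulary size, with the object part first and the background part second.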
79 In LP-relax, we use soft assignment of cells whereas in LP-round, hard assignment of cells is used. [sent-349, score-0.441]
80 The background histogram is obtained by subtracting histograms of objects from the global image histogram. [sent-350, score-0.553]
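The subtraction step can be sketched in a few lines. The function name and numbers are illustrative only; the sketch assumes the object histograms were built from per-cell weights that sum to at most 1 per cell, so the result cannot become negative.

```python
def background_histogram(global_hist, object_hists):
    """Subtract every object histogram from the global image histogram."""
    background = list(global_hist)
    for obj in object_hists:
        background = [g - o for g, o in zip(background, obj)]
    return background


# Toy global histogram and two estimated object histograms.
global_hist = [5, 4, 3]
object_hists = [[2.5, 0.5, 1.0], [0, 3, 1]]
background = background_histogram(global_hist, object_hists)
```

Whatever mass the decomposition leaves unassigned ends up in the background histogram.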
81 We compute AP on the Flickr dataset with classifiers trained on features extracted from object bounding boxes, concatenated with the features extracted from the remainder of the image. [sent-351, score-0.325]
82 LP is solved for all images in the dataset to get the constituent histograms (for Flickr-M1, it is solved using classifiers for bus and car, and for Flickr-M2, using classifiers for bus and bicycle). [sent-353, score-0.974]
83 Decomposition into object and background: We now discuss the decomposition results on the PASCAL VOC 2007 dataset. [sent-362, score-0.326]
84 background labels to the image cells on a few sample images from the dataset. [sent-364, score-0.246]
85 This decomposition is evaluated in the context of the image classification problem. [sent-365, score-0.267]
86 Table 3 shows a comparison of our LP decomposition scheme with baseline methods. [sent-366, score-0.241]
87 We show an image and the corresponding weights of its cells obtained from our LP solution. [sent-368, score-0.238]
88 TestBB shows the AP when the decomposition is done using ground truth bounding boxes, DPM when using [9], Sem. [sent-392, score-0.276]
89 is a concatenation of individually normalized object and background histograms for the methods TestBB, DPM, Sem. [sent-396, score-0.393]
90 TestBB is the “golden” baseline, where the object histograms are extracted from ground truth bounding boxes, and used in combination with histograms from the remainder of the image. [sent-406, score-0.692]
91 Decomposition in a weakly supervised setting: In the histogram decomposition formulation (2), we require a linear SVM classifier, which can discriminate the categories present in the image. [sent-425, score-0.626]
92 We now extend our approach to learn classifiers for a more general weakly supervised setting, where bounding box annotation is not available for most of the images. [sent-427, score-0.229]
93 Recent approaches, such as [20, 23], can be adapted to learn classifiers in a weakly supervised setting, but being based on object localization, they are computationally expensive. [sent-428, score-0.241]
94 Given an initial classifier for an object, we decompose histograms of training images with our LP formulation. [sent-430, score-0.345]
95 Next, we compute the new histograms of objects as a weighted sum of the cell histograms and re-train the classifiers with them. [sent-431, score-0.758]
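The alternation between decomposing and re-training can be sketched as a short loop. Everything concrete below is a toy stand-in and not the authors' procedure: `decompose` keeps cells with a positive classifier score instead of solving the LP, and `retrain` returns a mean-centred average histogram instead of training an SVM. Only the loop structure reflects the text.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def decompose(classifier, cells):
    """Toy stand-in for the LP: sum the cells the classifier scores positively."""
    hist = [0] * len(cells[0])
    for cell in cells:
        if dot(classifier, cell) > 0:
            hist = [h + c for h, c in zip(hist, cell)]
    return hist


def retrain(object_hists):
    """Toy stand-in for SVM training: mean histogram, centred to zero mean."""
    n = len(object_hists)
    mean = [sum(col) / n for col in zip(*object_hists)]
    avg = sum(mean) / len(mean)
    return [m - avg for m in mean]


def refine_classifier(classifier, training_images, n_rounds=3):
    """Alternate between decomposing training images and re-training."""
    for _ in range(n_rounds):
        object_hists = [decompose(classifier, cells) for cells in training_images]
        classifier = retrain(object_hists)
    return classifier


# Two toy training images, each with two cells over a two-word vocabulary.
images = [[[2, 0], [0, 2]], [[3, 1], [1, 3]]]
final_classifier = refine_classifier([1, -1], images)
```

In practice the two stand-ins would be replaced by the LP decomposition and a linear SVM trainer, and the number of rounds would be chosen on held-out data.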
96 We compare this decomposition scheme for the image classification task on PASCAL VOC 2007 to object-centric spatial pooling [23]. [sent-434, score-0.381]
97 Summary: We proposed an effective method to decompose a global histogram of an image into histograms of its associated objects and regions. [sent-452, score-0.565]
98 Our approach solves the problem using an LP formulation, by taking an intermediate path between two harder problems, namely bounding-box-accurate object detection and pixel-accurate object segmentation. [sent-453, score-0.239]
99 We showed that a wide variety of composite histograms can be decomposed into their constituent histograms with our LP method. [sent-454, score-0.74]
100 We also demonstrated the application of histogram decomposition for improving the classification performance on multiple object and PASCAL VOC 2007 datasets using an object-background representation of an image. [sent-455, score-0.523]
wordName wordTfidf (topN-words)
[('lp', 0.42), ('bipj', 0.327), ('histograms', 0.253), ('decomposition', 0.216), ('cells', 0.208), ('bow', 0.194), ('voc', 0.191), ('histogram', 0.184), ('caltech', 0.183), ('pascal', 0.168), ('bus', 0.162), ('constituent', 0.119), ('composite', 0.115), ('ap', 0.113), ('wptxp', 0.109), ('classifiers', 0.1), ('flickr', 0.083), ('bipjhij', 0.082), ('testbb', 0.082), ('constraint', 0.082), ('cell', 0.077), ('ale', 0.074), ('bicycle', 0.073), ('object', 0.072), ('dpm', 0.07), ('categories', 0.069), ('cv', 0.068), ('spatial', 0.066), ('rp', 0.065), ('formulation', 0.063), ('bounding', 0.06), ('svm', 0.056), ('constraints', 0.056), ('dsift', 0.055), ('remainder', 0.054), ('classes', 0.053), ('boxes', 0.053), ('mrf', 0.052), ('decomposing', 0.052), ('category', 0.052), ('classification', 0.051), ('decompose', 0.05), ('rkp', 0.048), ('pooling', 0.048), ('xp', 0.047), ('segmenting', 0.047), ('objects', 0.046), ('ij', 0.045), ('weakly', 0.043), ('continuity', 0.042), ('karteek', 0.042), ('classifier', 0.042), ('message', 0.04), ('vedaldi', 0.039), ('solved', 0.039), ('trained', 0.039), ('russakovsky', 0.039), ('background', 0.038), ('hij', 0.037), ('vocabulary', 0.037), ('subregions', 0.036), ('harder', 0.035), ('llc', 0.034), ('neighbouring', 0.033), ('global', 0.032), ('bag', 0.032), ('energy', 0.032), ('solution', 0.031), ('concatenating', 0.031), ('neighbourhood', 0.031), ('occupy', 0.031), ('wp', 0.03), ('scattered', 0.03), ('clutter', 0.03), ('weights', 0.03), ('segmentation', 0.03), ('concatenation', 0.03), ('colour', 0.029), ('eight', 0.029), ('sum', 0.029), ('modelled', 0.029), ('localization', 0.028), ('sharma', 0.028), ('winn', 0.028), ('inclusion', 0.028), ('verbeek', 0.028), ('spm', 0.027), ('pami', 0.027), ('class', 0.026), ('recognize', 0.026), ('mentioned', 0.026), ('introducing', 0.026), ('supervised', 0.026), ('merely', 0.026), ('car', 0.026), ('utility', 0.025), ('baseline', 0.025), ('grids', 0.025), ('setting', 0.025), ('hard', 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000006 104 iccv-2013-Decomposing Bag of Words Histograms
2 0.18424124 89 iccv-2013-Constructing Adaptive Complex Cells for Robust Visual Tracking
Author: Dapeng Chen, Zejian Yuan, Yang Wu, Geng Zhang, Nanning Zheng
Abstract: Representation is a fundamental problem in object tracking. Conventional methods track the target by describing its local or global appearance. In this paper we present that, besides the two paradigms, the composition of local region histograms can also provide diverse and important object cues. We use cells to extract local appearance, and construct complex cells to integrate the information from cells. With different spatial arrangements of cells, complex cells can explore various contextual information at multiple scales, which is important to improve the tracking performance. We also develop a novel template-matching algorithm for object tracking, where the template is composed of temporal varying cells and has two layers to capture the target and background appearance respectively. An adaptive weight is associated with each complex cell to cope with occlusion as well as appearance variation. A fusion weight is associated with each complex cell type to preserve the global distinctiveness. Our algorithm is evaluated on 25 challenging sequences, and the results not only confirm the contribution of each component in our tracking system, but also outperform other competing trackers.
3 0.16656275 379 iccv-2013-Semantic Segmentation without Annotating Segments
Author: Wei Xia, Csaba Domokos, Jian Dong, Loong-Fah Cheong, Shuicheng Yan
Abstract: Numerous existing object segmentation frameworks commonly utilize the object bounding box as a prior. In this paper, we address semantic segmentation assuming that object bounding boxes are provided by object detectors, but no training data with annotated segments are available. Based on a set of segment hypotheses, we introduce a simple voting scheme to estimate shape guidance for each bounding box. The derived shape guidance is used in the subsequent graph-cut-based figure-ground segmentation. The final segmentation result is obtained by merging the segmentation results in the bounding boxes. We conduct an extensive analysis of the effect of object bounding box accuracy. Comprehensive experiments on both the challenging PASCAL VOC object segmentation dataset and GrabCut50 image segmentation dataset show that the proposed approach achieves competitive results compared to previous detection or bounding box prior based methods, as well as other state-of-the-art semantic segmentation methods.
4 0.1664997 377 iccv-2013-Segmentation Driven Object Detection with Fisher Vectors
Author: Ramazan Gokberk Cinbis, Jakob Verbeek, Cordelia Schmid
Abstract: We present an object detection system based on the Fisher vector (FV) image representation computed over SIFT and color descriptors. For computational and storage efficiency, we use a recent segmentation-based method to generate class-independent object detection hypotheses, in combination with data compression techniques. Our main contribution is a method to produce tentative object segmentation masks to suppress background clutter in the features. Re-weighting the local image features based on these masks is shown to improve object detection significantly. We also exploit contextual features in the form of a full-image FV descriptor, and an inter-category rescoring mechanism. Our experiments on the PASCAL VOC 2007 and 2010 datasets show that our detector improves over the current state-of-the-art detection results.
5 0.13388266 292 iccv-2013-Non-convex P-Norm Projection for Robust Sparsity
Author: Mithun Das Gupta, Sanjeev Kumar
Abstract: In this paper, we investigate the properties of Lp norm (p ≤ 1) within a projection framework. We start with the KKT equations of the non-linear optimization problem and then use its key properties to arrive at an algorithm for Lp norm projection on the non-negative simplex. We compare with L1 projection, which needs prior knowledge of the true norm, as well as hard thresholding based sparsification proposed in recent compressed sensing literature. We show performance improvements compared to these techniques across different vision applications.
6 0.12696803 126 iccv-2013-Dynamic Label Propagation for Semi-supervised Multi-class Multi-label Classification
7 0.11728016 107 iccv-2013-Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction
8 0.1110967 5 iccv-2013-A Color Constancy Model with Double-Opponency Mechanisms
9 0.10689336 197 iccv-2013-Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition
10 0.10644633 411 iccv-2013-Symbiotic Segmentation and Part Localization for Fine-Grained Categorization
11 0.10313691 190 iccv-2013-Handling Occlusions with Franken-Classifiers
12 0.10239962 327 iccv-2013-Predicting an Object Location Using a Global Image Representation
13 0.10226782 59 iccv-2013-Bayesian Joint Topic Modelling for Weakly Supervised Object Localisation
14 0.099010386 250 iccv-2013-Lifting 3D Manhattan Lines from a Single Image
15 0.098528996 111 iccv-2013-Detecting Dynamic Objects with Multi-view Background Subtraction
16 0.098088011 181 iccv-2013-Frustratingly Easy NBNN Domain Adaptation
17 0.096785717 426 iccv-2013-Training Deformable Part Models with Decorrelated Features
18 0.096751742 21 iccv-2013-A Method of Perceptual-Based Shape Decomposition
19 0.095148399 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition
20 0.093028516 238 iccv-2013-Learning Graphs to Match
topicId topicWeight
[(0, 0.226), (1, 0.044), (2, 0.033), (3, -0.063), (4, 0.088), (5, 0.026), (6, -0.097), (7, 0.057), (8, -0.049), (9, -0.131), (10, 0.054), (11, 0.003), (12, -0.018), (13, -0.036), (14, 0.036), (15, -0.063), (16, 0.051), (17, 0.063), (18, 0.032), (19, -0.009), (20, -0.022), (21, 0.011), (22, -0.046), (23, -0.025), (24, -0.012), (25, 0.074), (26, 0.012), (27, -0.07), (28, 0.046), (29, 0.044), (30, 0.031), (31, -0.014), (32, -0.041), (33, 0.096), (34, -0.031), (35, -0.018), (36, -0.013), (37, -0.056), (38, 0.033), (39, 0.106), (40, 0.016), (41, 0.01), (42, 0.029), (43, 0.124), (44, 0.019), (45, 0.033), (46, 0.095), (47, -0.099), (48, -0.055), (49, -0.011)]
simIndex simValue paperId paperTitle
same-paper 1 0.95573932 104 iccv-2013-Decomposing Bag of Words Histograms
2 0.76682383 377 iccv-2013-Segmentation Driven Object Detection with Fisher Vectors
Author: Ramazan Gokberk Cinbis, Jakob Verbeek, Cordelia Schmid
Abstract: We present an object detection system based on the Fisher vector (FV) image representation computed over SIFT and color descriptors. For computational and storage efficiency, we use a recent segmentation-based method to generate class-independent object detection hypotheses, in combination with data compression techniques. Our main contribution is a method to produce tentative object segmentation masks to suppress background clutter in the features. Re-weighting the local image features based on these masks is shown to improve object detection significantly. We also exploit contextual features in the form of a full-image FV descriptor, and an inter-category rescoring mechanism. Our experiments on the PASCAL VOC 2007 and 2010 datasets show that our detector improves over the current state-of-the-art detection results.
3 0.73205197 349 iccv-2013-Regionlets for Generic Object Detection
Author: Xiaoyu Wang, Ming Yang, Shenghuo Zhu, Yuanqing Lin
Abstract: Generic object detection is confronted by dealing with different degrees of variations in distinct object classes with tractable computations, which demands for descriptive and flexible object representations that are also efficient to evaluate for many locations. In view of this, we propose to model an object class by a cascaded boosting classifier which integrates various types of features from competing local regions, named as regionlets. A regionlet is a base feature extraction region defined proportionally to a detection window at an arbitrary resolution (i.e. size and aspect ratio). These regionlets are organized in small groups with stable relative positions to delineate fine-grained spatial layouts inside objects. Their features are aggregated to a one-dimensional feature within one group so as to tolerate deformations. Then we evaluate the object bounding box proposal in selective search from segmentation cues, limiting the evaluation locations to thousands. Our approach significantly outperforms the state-of-the-art on popular multi-class detection benchmark datasets with a single method, without any contexts. It achieves the detection mean average precision of 41.7% on the PASCAL VOC 2007 dataset and 39.7% on the VOC 2010 for 20 object categories. It achieves 14.7% mean average precision on the ImageNet dataset for 200 object categories, outperforming the latest deformable part-based model (DPM) by 4.7%.
4 0.71956819 109 iccv-2013-Detecting Avocados to Zucchinis: What Have We Done, and Where Are We Going?
Author: Olga Russakovsky, Jia Deng, Zhiheng Huang, Alexander C. Berg, Li Fei-Fei
Abstract: The growth of detection datasets and the multiple directions of object detection research provide both an unprecedented need and a great opportunity for a thorough evaluation of the current state of the field of categorical object detection. In this paper we strive to answer two key questions. First, where are we currently as a field: what have we done right, what still needs to be improved? Second, where should we be going in designing the next generation of object detectors? Inspired by the recent work of Hoiem et al. [10] on the standard PASCAL VOC detection dataset, we perform a large-scale study on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) data. First, we quantitatively demonstrate that this dataset provides many of the same detection challenges as the PASCAL VOC. Due to its scale of 1000 object categories, ILSVRC also provides an excellent testbed for understanding the performance of detectors as a function of several key properties of the object classes. We conduct a series of analyses looking at how different detection methods perform on a number of image-level and object-class-level properties such as texture, color, deformation, and clutter. We learn important lessons of the current object detection methods and propose a number of insights for designing the next generation object detectors.
5 0.7195124 327 iccv-2013-Predicting an Object Location Using a Global Image Representation
Author: Jose A. Rodriguez Serrano, Diane Larlus
Abstract: We tackle the detection of prominent objects in images as a retrieval task: given a global image descriptor, we find the most similar images in an annotated dataset, and transfer the object bounding boxes. We refer to this approach as data driven detection (DDD), that is an alternative to sliding windows. Previous works have used similar notions but with task-independent similarities and representations, i.e. they were not tailored to the end-goal of localization. This article proposes two contributions: (i) a metric learning algorithm and (ii) a representation of images as object probability maps, that are both optimized for detection. We show experimentally that these two contributions are crucial to DDD, do not require costly additional operations, and in some cases yield comparable or better results than state-of-the-art detectors despite conceptual simplicity and increased speed. As an application of prominent object detection, we improve fine-grained categorization by precropping images with the proposed approach.
6 0.7067942 77 iccv-2013-Codemaps - Segment, Classify and Search Objects Locally
7 0.67742205 390 iccv-2013-Shufflets: Shared Mid-level Parts for Fast Object Detection
8 0.66902202 379 iccv-2013-Semantic Segmentation without Annotating Segments
9 0.65522778 388 iccv-2013-Shape Index Descriptors Applied to Texture-Based Galaxy Analysis
10 0.65084696 198 iccv-2013-Hierarchical Part Matching for Fine-Grained Visual Categorization
11 0.63930637 411 iccv-2013-Symbiotic Segmentation and Part Localization for Fine-Grained Categorization
12 0.6168167 248 iccv-2013-Learning to Rank Using Privileged Information
13 0.61663502 416 iccv-2013-The Interestingness of Images
14 0.60817754 193 iccv-2013-Heterogeneous Auto-similarities of Characteristics (HASC): Exploiting Relational Information for Classification
15 0.60673541 169 iccv-2013-Fine-Grained Categorization by Alignments
16 0.60127336 426 iccv-2013-Training Deformable Part Models with Decorrelated Features
17 0.58885223 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition
18 0.58849323 63 iccv-2013-Bounded Labeling Function for Global Segmentation of Multi-part Objects with Geometric Constraints
19 0.5860154 107 iccv-2013-Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction
20 0.58515626 186 iccv-2013-GrabCut in One Cut
topicId topicWeight
[(2, 0.108), (4, 0.024), (7, 0.014), (13, 0.013), (26, 0.082), (31, 0.053), (35, 0.247), (42, 0.087), (64, 0.057), (73, 0.022), (89, 0.183), (98, 0.013)]
simIndex simValue paperId paperTitle
1 0.89749134 90 iccv-2013-Content-Aware Rotation
Author: Kaiming He, Huiwen Chang, Jian Sun
Abstract: We present an image editing tool called Content-Aware Rotation. Casually shot photos can appear tilted, and are often corrected by rotation and cropping. This trivial solution may remove desired content and hurt image integrity. Instead of doing rigid rotation, we propose a warping method that creates the perception of rotation and avoids cropping. Human vision studies suggest that the perception of rotation is mainly due to horizontal/vertical lines. We design an optimization-based method that preserves the rotation of horizontal/vertical lines, maintains the completeness of the image content, and reduces the warping distortion. An efficient algorithm is developed to address the challenging optimization. We demonstrate our content-aware rotation method on a variety of practical cases.
2 0.85276818 119 iccv-2013-Discriminant Tracking Using Tensor Representation with Semi-supervised Improvement
Author: Jin Gao, Junliang Xing, Weiming Hu, Steve Maybank
Abstract: Visual tracking has witnessed growing methods in object representation, which is crucial to robust tracking. The dominant mechanism in object representation is using image features encoded in a vector as observations to perform tracking, without considering that an image is intrinsically a matrix, or a 2nd-order tensor. Thus approaches following this mechanism inevitably lose a lot of useful information, and therefore cannot fully exploit the spatial correlations within the 2D image ensembles. In this paper, we address an image as a 2nd-order tensor in its original form, and find a discriminative linear embedding space approximation to the original nonlinear submanifold embedded in the tensor space based on the graph embedding framework. We specially design two graphs for characterizing the intrinsic local geometrical structure of the tensor space, so as to retain more discriminant information when reducing the dimension along certain tensor dimensions. However, spatial correlations within a tensor are not limited to the elements along these dimensions. This means that some part of the discriminant information may not be encoded in the embedding space. We introduce a novel technique called semi-supervised improvement to iteratively adjust the embedding space to compensate for the loss of discriminant information, hence improving the performance of our tracker. Experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracker.
same-paper 3 0.85021281 104 iccv-2013-Decomposing Bag of Words Histograms
Author: Ankit Gandhi, Karteek Alahari, C.V. Jawahar
Abstract: We aim to decompose a global histogram representation of an image into histograms of its associated objects and regions. This task is formulated as an optimization problem, given a set of linear classifiers, which can effectively discriminate the object categories present in the image. Our decomposition bypasses harder problems associated with accurately localizing and segmenting objects. We evaluate our method on a wide variety of composite histograms, and also compare it with MRF-based solutions. In addition to merely measuring the accuracy of decomposition, we also show the utility of the estimated object and background histograms for the task of image classification on the PASCAL VOC 2007 dataset.
4 0.82565182 403 iccv-2013-Strong Appearance and Expressive Spatial Models for Human Pose Estimation
Author: Leonid Pishchulin, Mykhaylo Andriluka, Peter Gehler, Bernt Schiele
Abstract: Typical approaches to articulated pose estimation combine spatial modelling of the human body with appearance modelling of body parts. This paper aims to push the state-of-the-art in articulated pose estimation in two ways. First we explore various types of appearance representations aiming to substantially improve the body part hypotheses. And second, we draw on and combine several recently proposed powerful ideas such as more flexible spatial models as well as image-conditioned spatial models. In a series of experiments we draw several important conclusions: (1) we show that the proposed appearance representations are complementary; (2) we demonstrate that even a basic tree-structure spatial human body model achieves state-of-the-art performance when augmented with the proper appearance representation; and (3) we show that the combination of the best performing appearance model with a flexible image-conditioned spatial model achieves the best result, significantly improving over the state of the art, on the “Leeds Sports Poses” and “Parse” benchmarks.
5 0.75357866 21 iccv-2013-A Method of Perceptual-Based Shape Decomposition
Author: Chang Ma, Zhongqian Dong, Tingting Jiang, Yizhou Wang, Wen Gao
Abstract: In this paper, we propose a novel perception-based shape decomposition method which aims to decompose a shape into semantically meaningful parts. In addition to three popular perception rules (the Minima rule, the Short-cut rule and the Convexity rule) in shape decomposition, we propose a new rule named part-similarity rule to encourage consistent partition of similar parts. The problem is formulated as a quadratically constrained quadratic program (QCQP) problem and is solved by a trust-region method. Experimental results on the MPEG-7 dataset show that we can get a more consistent shape decomposition with human perception compared with other state-of-the-art methods both qualitatively and quantitatively. Finally, we show the advantage of semantic parts over non-meaningful parts in object detection on the ETHZ dataset.
6 0.7460649 36 iccv-2013-Accurate and Robust 3D Facial Capture Using a Single RGBD Camera
7 0.73266882 426 iccv-2013-Training Deformable Part Models with Decorrelated Features
8 0.73206818 316 iccv-2013-Pictorial Human Spaces: How Well Do Humans Perceive a 3D Articulated Pose?
9 0.72839773 204 iccv-2013-Human Attribute Recognition by Rich Appearance Dictionary
10 0.7209807 107 iccv-2013-Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction
11 0.72069937 340 iccv-2013-Real-Time Articulated Hand Pose Estimation Using Semi-supervised Transductive Regression Forests
12 0.72028577 361 iccv-2013-Robust Trajectory Clustering for Motion Segmentation
13 0.71867621 24 iccv-2013-A Non-parametric Bayesian Network Prior of Human Pose
14 0.7174648 411 iccv-2013-Symbiotic Segmentation and Part Localization for Fine-Grained Categorization
15 0.71610695 171 iccv-2013-Fix Structured Learning of 2013 ICCV paper k2opt.pdf
16 0.71398348 65 iccv-2013-Breaking the Chain: Liberation from the Temporal Markov Assumption for Tracking Human Poses
17 0.7126621 379 iccv-2013-Semantic Segmentation without Annotating Segments
18 0.71055466 153 iccv-2013-Face Recognition Using Face Patch Networks
19 0.71046472 383 iccv-2013-Semi-supervised Learning for Large Scale Image Cosegmentation
20 0.70970553 143 iccv-2013-Estimating Human Pose with Flowing Puppets