cvpr cvpr2013 cvpr2013-417 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jian Dong, Wei Xia, Qiang Chen, Jianshi Feng, Zhongyang Huang, Shuicheng Yan
Abstract: In this paper, we introduce a subcategory-aware object classification framework to boost category level object classification performance. Motivated by the observation of considerable intra-class diversities and inter-class ambiguities in many current object classification datasets, we explicitly split data into subcategories by ambiguity guided subcategory mining. We then train an individual model for each subcategory rather than attempt to represent an object category with a monolithic model. More specifically, we build the instance affinity graph by combining both intra-class similarity and inter-class ambiguity. Visual subcategories, which correspond to the dense subgraphs, are detected by the graph shift algorithm and seamlessly integrated into the state-of-the-art detection assisted classification framework. Finally, the responses from subcategory models are aggregated by subcategory-aware kernel regression. The extensive experiments over the PASCAL VOC 2007 and PASCAL VOC 2010 databases show the state-of-the-art performance from our framework.
Reference: text
sentIndex sentText sentNum sentScore
1 In this paper, we introduce a subcategory-aware object classification framework to boost category level object classification performance. [sent-6, score-0.367]
2 Motivated by the observation of considerable intra-class diversities and inter-class ambiguities in many current object classification datasets, we explicitly split data into subcategories by ambiguity guided subcategory mining. [sent-7, score-1.614]
3 We then train an individual model for each subcategory rather than attempt to represent an object category with a monolithic model. [sent-8, score-0.98]
4 Visual subcategories, which correspond to the dense subgraphs, are detected by the graph shift algorithm and seamlessly integrated into the state-of-the-art detection assisted classification framework. [sent-10, score-0.46]
5 Finally the responses from subcategory models are aggregated by subcategory-aware kernel regression. [sent-11, score-0.799]
6 This framework combines local feature extraction, feature encoding and feature pooling to generate global image representations, and represents each object category with a monolithic model, such as a support vector machine classifier. [sent-15, score-0.229]
7 However, the large intra-class diversities induced by pose, viewpoint and appearance variations [27] make it difficult to build an accurate monolithic model for each category, especially when there are many ambiguous samples. [sent-16, score-0.246]
8 In feature space, these subcategories are essentially far away from each other. [sent-18, score-0.323]
9 Furthermore, the ambiguous sofalike chairs look more like sofas than common chairs. [sent-19, score-0.218]
10 In this case, representing all chairs with a monolithic model will egory mining and subcategory-aware object classification framework. [sent-20, score-0.552]
11 For each category, training samples are automatically grouped into subcategories based on both intra-class similarity and inter-class ambiguity. [sent-21, score-0.442]
12 An individual subcategory model is constructed for each detected subcategory. [sent-22, score-0.801]
13 The final classification results are obtained by aggregating responses from all subcategory models. [sent-23, score-0.889]
14 Hence, it is intuitively beneficial to model each subcategory independently. [sent-25, score-0.78]
15 These considerable intra-class diversities and inter-class ambiguities are common in the challenging real world datasets [13, 37], which makes the subcategory mining necessary. [sent-26, score-1.063]
16 Clustering all training data of an object category based on intra-class similarity seems to be a natural strategy for subcategory mining, since objects belonging to the same subcategory should intuitively have larger similarity in terms of appearance and shape. [sent-27, score-1.772]
17 However, in the context of generic object classification, subcategories mined with only intra-class visual similarity cues are not necessarily optimal, because they ignore valuable inter-class information [8]. [sent-28, score-0.453]
18 By noting the ambiguous chair sample distribution near the decision boundary, all chairs should be intuitively divided into separate subcategories. [sent-33, score-0.292]
19 The proper split as indicated in Figure 1 will make all subcategories linearly separable from other categories, which is only achievable with the assistance of inter-class information. [sent-34, score-0.374]
20 The above observation inspires us to propose an ambiguity guided subcategory mining approach to explore the intrinsic subcategory structure embedded in each category. [sent-35, score-2.034]
21 With subcategory awareness, we can boost category level classification by subcategory-aware object classification (SAOC). [sent-36, score-1.115]
22 As indicated in Figure 1, we split data into subcategories by ambiguity guided subcategory mining and train an individual model for each subcategory. [sent-37, score-1.577]
23 The final classification results are generated by aggregating subcategory responses through subcategory-aware kernel regression. [sent-39, score-0.889]
24 First, we propose a novel ambiguity guided subcategory mining approach, which gracefully integrates the intra-class similarity and inter-class ambiguity for effective subcategory mining. [sent-41, score-2.287]
25 Second, we provide an effective subcategory-aware object classification framework based on the current detection assisted classification framework [21, 31]. [sent-42, score-0.317]
26 Our ambiguity guided subcategory mining approach can be seamlessly integrated into such a framework. [sent-43, score-1.129]
27 Utilizing mined subcategories can improve both detection and classification performance and allow more effective subcategory level interaction in the fusion model. [sent-44, score-1.318]
28 In this work, we show that properly splitting the data into subcategories will boost the performance of the state-of-the-art pipeline. [sent-56, score-0.369]
29 Some recent works begin to investigate the visual subcategory structure embedded in each category [10, 18, 7, 39, 1, 11], which leads to considerable improvement in object detection performance. [sent-63, score-0.932]
30 In this work, we borrow the idea of uncertainty piloted classification and propose an ambiguity guided subcategory mining approach under the graph shift [25] framework. [sent-71, score-1.539]
31 first processed by each learnt subcategory model including detection and classification models. [sent-72, score-0.913]
32 Then the responses from all subcategory models are fed into the fusion model to generate the final category level classification results. [sent-73, score-0.992]
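The aggregation step can be illustrated with a generic kernel regression. The sketch below is a plain Nadaraya-Watson estimator with a Gaussian kernel. The extracted text does not give the exact form of the subcategory-aware kernel regression, so the kernel choice, the `bandwidth` parameter and the function name `kernel_regression` are assumptions made purely for illustration.

```python
import math
from typing import List, Sequence


def kernel_regression(
    train_x: Sequence[Sequence[float]],
    train_y: Sequence[float],
    query: Sequence[float],
    bandwidth: float = 1.0,  # hypothetical parameter, not taken from the paper
) -> float:
    """Nadaraya-Watson estimate: a kernel-weighted average of training targets."""
    weights: List[float] = [
        math.exp(-math.dist(x, query) ** 2 / (2.0 * bandwidth ** 2))
        for x in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed; fall back to the plain mean of the targets.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total
```

In this reading, each training vector would hold the subcategory-level responses of one image and each target would be its category label (1 or 0), so the prediction for a test image is dominated by training images with similar subcategory responses.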
33 We will first introduce each component of the framework and emphasize how subcategory mining fits into each step later. [sent-76, score-0.964]
34 For detection, each subcategory is characterized by one shape-based sliding window detector [16, 38] and one appearance-based selective window detector [34, 33], respectively. [sent-78, score-0.798]
35 For classification, we follow the state-of-the-art pipeline [5] and train a classifier for each subcategory individually. [sent-80, score-0.803]
36 The fusion model mainly aims to: (1) boost the classification performance by complementary detection results, (2) utilize the context of all categories for reweighting, and (3) fuse the subcategory level results into final category level results. [sent-82, score-1.158]
37 First, we construct a middle level representation for each training/testing image by concatenating classification scores and the leading two detection scores from each subcategory model. [sent-84, score-0.932]
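The construction of this middle level representation can be sketched as follows. The function name, the dictionary layout of the inputs and the padding value used when a subcategory has fewer than two detections are all hypothetical; the source only states that classification scores and the two leading detection scores of each subcategory model are concatenated.

```python
from typing import Dict, List, Sequence


def middle_level_representation(
    cls_scores: Dict[str, float],
    det_scores: Dict[str, Sequence[float]],
    subcategories: List[str],
    top_k: int = 2,
    missing: float = -1.0,  # assumed filler when fewer than top_k detections exist
) -> List[float]:
    """Concatenate, per subcategory, its classification score and top_k detection scores."""
    feature: List[float] = []
    for name in subcategories:
        feature.append(cls_scores[name])
        top = sorted(det_scores.get(name, []), reverse=True)[:top_k]
        top += [missing] * (top_k - len(top))  # pad so every image has the same length
        feature.extend(top)
    return feature
```

With S subcategories in total, every image is mapped to a vector of length 3S under these assumptions, which a fusion model can then consume.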
38 1) The subcategory information can be used to initialize both detection and classification models to better handle the rich intra-class diversities in challenging datasets. [sent-88, score-0.992]
39 Less diversity in each subcategory will lead to a simpler learning problem, which can be better characterized by current state-of-the-art models, such as the Deformable Part based Model (DPM) for detection and the foreground BoW models involved in GHM. [sent-89, score-0.842]
40 2) The subcategory awareness will lead to more effective fusion models. [sent-90, score-0.928]
41 First, subcategory awareness allows us to model the subcategory level interaction. [sent-91, score-1.682]
42 For example, occluded chairs and sitting persons often occur together and should boost the classification scores of each other. [sent-92, score-0.248]
43 Only by subcategory awareness can such underlying correlation be captured effectively. [sent-95, score-0.883]
44 Second, the subcategory awareness is able to reduce the false boosting caused by ambiguity. [sent-96, score-0.904]
45 With subcategory awareness, the response of diningtable will not be boosted as there is no boosting correlation between the sofa-like chairs and diningtables. [sent-100, score-0.913]
46 Ambiguity Guided Subcategory Mining In this section, we will introduce how to find the subcategories by our ambiguity guided subcategory mining approach as illustrated in Figure 3. [sent-102, score-1.577]
47 For a classification problem, a training set of M samples is given and represented by the matrix X = [...] instance affinity graph is built by combining both intra-class similarity and inter-class ambiguity. [sent-104, score-0.274]
48 Then dense subgraphs are detected within the affinity graph by performing graph shift. [sent-105, score-0.224]
49 Though it is a common similarity metric for object classification, appearance similarity alone is not enough for our SAOC framework, because in SAOC classification and detection are closely integrated. [sent-122, score-0.287]
50 Subcategory mining based only on appearance similarity may lead to poor detectors, which in turn harms the overall performance. [sent-123, score-0.264]
51 This is intuitive: even if many subcategories are spread apart in the feature space, a single classifier may be enough to classify all of them correctly as long as none of them is close to samples of other categories. [sent-140, score-0.683]
52 On the contrary, if some subcategories are near the decision boundary, separate classifiers should be trained for these ambiguous subcategories. [sent-141, score-0.479]
53 Otherwise the ambiguous subcategories may decrease the classification performance of categories near the decision boundary. [sent-142, score-0.568]
54 As ambiguity is critical for object classification, subcategory mining should be guided by ambiguity instead of only relying on intra-class data distribution. [sent-143, score-1.477]
55 Before introducing how to combine sample similarity and ambiguity into a unified framework, we need to first explicitly define the ambiguity measure. [sent-144, score-0.407]
56 The ambiguity will be high for those training samples lying close to the decision boundary, and thus such samples should be more likely to form a separate subcategory. [sent-155, score-0.308]
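The explicit ambiguity definition is not among the extracted sentences, so the following is only a stand-in that reproduces the stated behaviour (high ambiguity near the decision boundary). It scores each sample by the fraction of its k nearest neighbours that belong to other categories; this k-NN proxy, the function name and the parameter `k` are assumptions and should not be read as the authors' measure.

```python
import math
from typing import List, Sequence


def ambiguity_scores(
    points: Sequence[Sequence[float]],
    labels: Sequence[str],
    k: int = 3,  # hypothetical neighbourhood size
) -> List[float]:
    """Fraction of each sample's k nearest neighbours that carry a different label."""
    scores: List[float] = []
    for i, p in enumerate(points):
        # Distances to every other sample, paired with that sample's label.
        neighbours = sorted(
            (math.dist(p, q), labels[j]) for j, q in enumerate(points) if j != i
        )[:k]
        other = sum(1 for _, lab in neighbours if lab != labels[i])
        scores.append(other / len(neighbours))
    return scores
```

A chair sample surrounded by sofa samples receives a score close to 1 under this proxy, while a chair deep inside the chair cluster receives a score close to 0.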
57 Subcategory Mining by Graph Shift Intuitively, the subcategory mining algorithm is expected to satisfy the following three properties. [sent-158, score-0.964]
58 Clustering methods based on only intra-class data distribution may fail to detect the ambiguous subcategories on the decision boundary and lead to subcategories imperfect for classification. [sent-162, score-0.784]
59 Figure 4: The subcategory mining results on synthetic data from (a) kmeans, (b) spectral clustering and (c) graph shift. [sent-167, score-1.23]
60 Kmeans and spectral clustering cluster the dots relying on only intra-class information, which leads to subcategories that are not linearly separable from the triangles. [sent-172, score-0.435]
61 However, by utilizing the inter-class information, all three subcategories mined by the ambiguity guided graph shift are linearly separable from triangles, which is desired for classification. [sent-173, score-0.837]
62 into coherent groups without explicit outlier handling may fail to find the true subcategory structure. [sent-175, score-0.78]
63 The traditional partition methods, such as k-means and spectral clustering, are not expected to always work well for subcategory mining, because they insist on partitioning all the input data and cannot integrate the inter-class information. [sent-176, score-1.065]
64 The graph shift algorithm [25], which is efficient and robust for graph mode seeking, appears to be particularly suitable for our subcategory mining problem as it directly works on graph, allows one to extract as many clusters as desired, and leaves the outlier points ungrouped. [sent-178, score-1.214]
65 More importantly, the ambiguity can be seamlessly integrated into the graph shift framework. [sent-179, score-0.396]
66 The graph shift algorithm shares a similar spirit with the mean shift algorithm [6] and evolves through iterative expansion and shrink procedures. [sent-180, score-0.324]
67 The main difference is that mean shift operates directly on the feature space, while graph shift operates on the affinity graph. [sent-181, score-0.296]
68 The simulation results for comparing our ambiguity guided graph shift (AGS) with kmeans and spectral clustering are provided in Figure 4, from which we can see that our AGS can lead to subcategories more suitable for boosting classification. [sent-182, score-0.916]
69 The diagonal elements of A represent the ambiguity of the samples while the non-diagonal element measures the similarity between samples. [sent-189, score-0.271]
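A minimal sketch of such an affinity matrix is given below. It places a precomputed ambiguity value on each diagonal entry and a pairwise similarity on each off-diagonal entry. The Gaussian similarity on Euclidean distance, the `sigma` parameter and the function name are assumptions for illustration; the similarity actually used in the paper combines appearance and shape cues that are not reproduced here.

```python
import math
from typing import List, Sequence


def build_affinity(
    points: Sequence[Sequence[float]],
    ambiguity: Sequence[float],
    sigma: float = 1.0,  # hypothetical kernel width
) -> List[List[float]]:
    """Affinity matrix A: A[i][i] = ambiguity of sample i, A[i][j] = similarity of i and j."""
    n = len(points)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                A[i][j] = ambiguity[i]
            else:
                d = math.dist(points[i], points[j])
                A[i][j] = math.exp(-d * d / (2.0 * sigma * sigma))
    return A
```

The resulting matrix is symmetric and non-negative, which is the form a dense-subgraph search on the graph would expect.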
70 More specifically, in this paper sample similarity and ambiguity are integrated and encoded as the edge weights of a graph, whose nodes represent the instances of the specific object category. [sent-191, score-0.29]
71 Hence subcategories should correspond to those strongly connected subgraphs. [sent-192, score-0.323]
72 The graph shift algorithm provides a complementary neighbourhood expansion procedure to expand the supporting vertices. [sent-204, score-0.232]
73 Like the mean shift algorithm, the graph shift algorithm starts from an individual sample and evolves towards the mode of G. [sent-206, score-0.32]
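The evolution towards a mode can be illustrated with replicator dynamics, which the keyword list of this page (replicator, simplex, dynamics) suggests is used in the shrink phase. The sketch below covers only that phase: it iterates x_i <- x_i (Ax)_i / (x^T A x) on the probability simplex, which does not decrease x^T A x for a symmetric non-negative A. The neighbourhood expansion step is omitted, and the function name, iteration cap and tolerance are assumptions.

```python
from typing import List, Sequence


def replicator_dynamics(
    A: Sequence[Sequence[float]],
    x: Sequence[float],
    iters: int = 200,   # hypothetical iteration cap
    tol: float = 1e-10,  # hypothetical convergence tolerance
) -> List[float]:
    """Shrink step sketch: move a simplex vector x towards a local maximizer of x^T A x."""
    x = list(x)
    n = len(x)
    for _ in range(iters):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        xAx = sum(x[i] * Ax[i] for i in range(n))
        if xAx <= 0.0:
            break  # no internal affinity left to exploit
        new_x = [x[i] * Ax[i] / xAx for i in range(n)]
        converged = max(abs(a - b) for a, b in zip(new_x, x)) < tol
        x = new_x
        if converged:
            break
    return x
```

Samples whose weight stays clearly above zero after convergence form the dense subgraph, while weakly connected samples are driven towards zero weight and can be treated as not belonging to that subcategory.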
74 Figure 5 displays our subcategory mining results for bus and chair categories. [sent-224, score-1.005]
75 Each row on the left side shows one discovered subcategory while right side images are detected as outliers and left ungrouped. [sent-225, score-0.801]
76 For the bus category, the first 3 subcategories correspond to 3 different views of buses. [sent-226, score-0.323]
77 We note the shape and appearance of the last subcategory show much larger diversity than other subcategories. [sent-228, score-0.78]
78 Hence the subcategory mining results should be the combination effects of both appearance similarity and shape similarity, which can be observed from the discovered subcategories. [sent-231, score-1.025]
79 Some subcategories may not have common shapes, but have similar local patterns. [sent-232, score-0.323]
80 For example, chairs of the 2nd subcategory all have the stripe-like patterns. [sent-233, score-0.892]
81 We note again the last detected subcategory looks like sofas. [sent-234, score-0.801]
82 Besides being different from other chair subcategories, the ambiguity with sofa is also one of the main reasons that these images form a separate subcategory. [sent-235, score-0.234]
83 Subcategory Mining Method Comparison We extensively evaluate the effectiveness of different subcategory mining approaches on the VOC 2007 dataset, as the ground-truth of its testing set is released. [sent-238, score-0.964]
84 DPM-spectral, DPM-GS and DPM-AGS replace the aspect ratio based initialization with spectral clustering, Table 1: Classification results (AP in %) comparison for different subcategory mining approaches category, the winner is shown in bold font. [sent-242, score-1.002]
85 Detection assisted classification has become a standard approach for classification on PASCAL VOC. [sent-287, score-0.285]
86 We augment FVGHM with detection context information and utilize the resulting FVGHM-CTX as the starting point to evaluate different subcategory mining methods. [sent-288, score-1.029]
87 The subcategory number is determined by the expansion size of the graph shift algorithm. [sent-292, score-0.975]
88 Here the expansion size is decided by crossvalidation, and the subcategory number is generally from 2 to 5. [sent-293, score-0.811]
89 When compared with other leading techniques in subcategory based detection, our method obtains the best results for most categories, achieving superior performance on categories with requires manually labelling the pose of each image, performs quite well on articulated categories. [sent-300, score-0.816]
90 The inferior performance of our ambiguity guided mining framework on articulated categories is mainly due to the limited discriminative ability of current similarity metric. [sent-301, score-0.571]
91 The number of subcategories is also determined by cross-validation as mentioned above. [sent-312, score-0.323]
92 We note that all the leading classification methods combine object classification and object detection to achieve higher accuracy. [sent-314, score-0.287]
93 However, most of the previous methods simply fuse the outputs of a monolithic classification model and a monolithic detection at category level. [sent-315, score-0.411]
94 This limitation prevents them from grasping the informative subcategory structure and the interaction among the subcategories. [sent-316, score-0.78]
95 By effectively employing the subcategory structure, we can further improve the state-of-the-art performance by 2. [sent-317, score-0.78]
96 Conclusions and Future Work In this paper, we proposed an ambiguity guided subcategory mining and subcategory-aware object classification framework for object classification. [sent-352, score-1.408]
97 We modeled the subcategory mining as a dense subgraph seeking problem. [sent-353, score-1.009]
98 This general scheme allows us to gracefully embed intra-class similarity and inter-class ambiguity into a unified framework. [sent-354, score-0.253]
99 Ambiguity guided subcategory mining results are then seamlessly integrated into the subcategory-aware detection assisted object classification framework. [sent-356, score-1.367]
100 In the future, we plan to further explore whether our ambiguity guided subcategory mining can be extended for object segmentation and also develop a more efficient and scalable version of current framework to handle bigger data. [sent-358, score-1.286]
wordName wordTfidf (topN-words)
[('subcategory', 0.78), ('subcategories', 0.323), ('mining', 0.184), ('ambiguity', 0.173), ('guided', 0.117), ('chairs', 0.112), ('monolithic', 0.11), ('shift', 0.105), ('awareness', 0.103), ('saoc', 0.094), ('classification', 0.09), ('diversities', 0.079), ('voc', 0.068), ('stqp', 0.063), ('assisted', 0.062), ('similarity', 0.061), ('graph', 0.059), ('category', 0.058), ('ambiguous', 0.057), ('replicator', 0.056), ('singapore', 0.049), ('sofas', 0.049), ('diningtables', 0.047), ('ghm', 0.047), ('boost', 0.046), ('pascal', 0.043), ('detection', 0.043), ('chair', 0.041), ('decision', 0.041), ('dpm', 0.04), ('spectral', 0.038), ('samples', 0.037), ('mined', 0.037), ('subgraphs', 0.037), ('categories', 0.036), ('seamlessly', 0.035), ('clustering', 0.032), ('object', 0.032), ('ags', 0.031), ('fvghm', 0.031), ('insisting', 0.031), ('piloted', 0.031), ('ytay', 0.031), ('dai', 0.031), ('fk', 0.031), ('expansion', 0.031), ('pooling', 0.029), ('kmeans', 0.029), ('assistance', 0.028), ('zhongyang', 0.028), ('maximizers', 0.028), ('buses', 0.028), ('dynamics', 0.027), ('affinity', 0.027), ('mode', 0.027), ('fusion', 0.026), ('strange', 0.026), ('xi', 0.025), ('egory', 0.024), ('evolves', 0.024), ('harzallah', 0.024), ('seeking', 0.024), ('integrated', 0.024), ('separable', 0.023), ('misalignments', 0.023), ('pipeline', 0.023), ('utilize', 0.022), ('hog', 0.022), ('evolutionary', 0.022), ('customized', 0.022), ('divvala', 0.022), ('simplex', 0.022), ('contrary', 0.022), ('complicated', 0.021), ('grouped', 0.021), ('boosting', 0.021), ('dense', 0.021), ('near', 0.021), ('modes', 0.021), ('cth', 0.021), ('boundary', 0.021), ('detected', 0.021), ('lbp', 0.021), ('bow', 0.02), ('ambiguities', 0.02), ('separate', 0.02), ('responses', 0.019), ('level', 0.019), ('lead', 0.019), ('leads', 0.019), ('complementary', 0.019), ('gracefully', 0.019), ('neighbours', 0.019), ('vq', 0.018), ('critical', 0.018), ('neighbourhood', 0.018), ('sliding', 0.018), ('clean', 0.018), ('classifiers', 0.017), ('game', 0.017)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999958 417 cvpr-2013-Subcategory-Aware Object Classification
2 0.13030928 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models
Author: Luca Del_Pero, Joshua Bowdish, Bonnie Kermgard, Emily Hartley, Kobus Barnard
Abstract: We develop a comprehensive Bayesian generative model for understanding indoor scenes. While it is common in this domain to approximate objects with 3D bounding boxes, we propose using strong representations with finer granularity. For example, we model a chair as a set of four legs, a seat and a backrest. We find that modeling detailed geometry improves recognition and reconstruction, and enables more refined use of appearance for scene understanding. We demonstrate this with a new likelihood function that rewards 3D object hypotheses whose 2D projection is more uniform in color distribution. Such a measure would be confused by background pixels if we used a bounding box to represent a concave object like a chair. Complex objects are modeled using a set of re-usable 3D parts, and we show that this representation captures much of the variation among object instances with relatively few parameters. We also designed specific data-driven inference mechanisms for each part that are shared by all objects containing that part, which helps make inference transparent to the modeler. Further, we show how to exploit contextual relationships to detect more objects, by, for example, proposing chairs around and underneath tables. We present results showing the benefits of each of these innovations. The performance of our approach often exceeds that of state-of-the-art methods on the two tasks of room layout estimation and object recognition, as evaluated on two benchmark data sets used in this domain.
1) Detailed geometric models, such as tables with legs and top (bottom left), provide better reconstructions than plain boxes (top right), when supported by image features such as geometric context [5] (top middle), or an approach to using color introduced here. 2) Non convex models allow for complex configurations, such as a chair under a table (bottom middle). 3) 3D contextual relationships, such as chairs being around a table, allow identifying objects supported by little image evidence, like the chair behind the table (bottom right). Best viewed in color.
3 0.090439335 136 cvpr-2013-Discriminatively Trained And-Or Tree Models for Object Detection
Author: Xi Song, Tianfu Wu, Yunde Jia, Song-Chun Zhu
Abstract: This paper presents a method of learning reconfigurable And-Or Tree (AOT) models discriminatively from weakly annotated data for object detection. To explore the appearance and geometry space of latent structures effectively, we first quantize the image lattice using an overcomplete set of shape primitives, and then organize them into a directed acyclic And-Or Graph (AOG) by exploiting their compositional relations. We allow overlaps between child nodes when combining them into a parent node, which is equivalent to introducing an appearance Or-node implicitly for the overlapped portion. The learning of an AOT model consists of three components: (i) Unsupervised sub-category learning (i.e., branches of an object Or-node) with the latent structures in AOG being integrated out. (ii) Weakly-supervised part configuration learning (i.e., seeking the globally optimal parse trees in AOG for each sub-category). To search the globally optimal parse tree in AOG efficiently, we propose a dynamic programming (DP) algorithm. (iii) Joint appearance and structural parameters training under latent structural SVM framework. In experiments, our method is tested on PASCAL VOC 2007 and 2010 detection benchmarks of 20 object classes and outperforms comparable state-of-the-art methods.
4 0.079255126 67 cvpr-2013-Blocks That Shout: Distinctive Parts for Scene Classification
Author: Mayank Juneja, Andrea Vedaldi, C.V. Jawahar, Andrew Zisserman
Abstract: The automatic discovery of distinctive parts for an object or scene class is challenging since it requires simultaneously to learn the part appearance and also to identify the part occurrences in images. In this paper, we propose a simple, efficient, and effective method to do so. We address this problem by learning parts incrementally, starting from a single part occurrence with an Exemplar SVM. In this manner, additional part instances are discovered and aligned reliably before being considered as training examples. We also propose entropy-rank curves as a means of evaluating the distinctiveness of parts shareable between categories and use them to select useful parts out of a set of candidates. We apply the new representation to the task of scene categorisation on the MIT Scene 67 benchmark. We show that our method can learn parts which are significantly more informative and for a fraction of the cost, compared to previous part-learning methods such as Singh et al. [28]. We also show that a well constructed bag of words or Fisher vector model can substantially outperform the previous state-of-the-art classification performance on this data.
5 0.073192842 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
Author: Ian Endres, Kevin J. Shih, Johnston Jiaa, Derek Hoiem
Abstract: We propose a method to learn a diverse collection of discriminative parts from object bounding box annotations. Part detectors can be trained and applied individually, which simplifies learning and extension to new features or categories. We apply the parts to object category detection, pooling part detections within bottom-up proposed regions and using a boosted classifier with proposed sigmoid weak learners for scoring. On PASCAL VOC 2010, we evaluate the part detectors' ability to discriminate and localize annotated keypoints. Our detection system is competitive with the best-existing systems, outperforming other HOG-based detectors on the more deformable categories.
6 0.071086913 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
7 0.067903168 371 cvpr-2013-SCaLE: Supervised and Cascaded Laplacian Eigenmaps for Visual Object Recognition Based on Nearest Neighbors
8 0.06584882 144 cvpr-2013-Efficient Maximum Appearance Search for Large-Scale Object Detection
9 0.062947579 247 cvpr-2013-Learning Class-to-Image Distance with Object Matchings
10 0.061296284 134 cvpr-2013-Discriminative Sub-categorization
11 0.060875546 364 cvpr-2013-Robust Object Co-detection
12 0.057574458 221 cvpr-2013-Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection
13 0.0575056 163 cvpr-2013-Fast, Accurate Detection of 100,000 Object Classes on a Single Machine
14 0.055285342 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels
15 0.051694732 40 cvpr-2013-An Approach to Pose-Based Action Recognition
16 0.051241037 215 cvpr-2013-Improved Image Set Classification via Joint Sparse Approximated Nearest Subspaces
17 0.050097808 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video
18 0.049196452 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection
19 0.048364189 325 cvpr-2013-Part Discovery from Partial Correspondence
20 0.048316833 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
topicId topicWeight
[(0, 0.13), (1, -0.037), (2, -0.005), (3, -0.009), (4, 0.049), (5, 0.008), (6, 0.013), (7, 0.022), (8, -0.026), (9, -0.026), (10, -0.034), (11, -0.033), (12, 0.002), (13, -0.063), (14, 0.003), (15, -0.031), (16, -0.007), (17, -0.012), (18, -0.013), (19, -0.007), (20, 0.01), (21, 0.005), (22, 0.072), (23, 0.016), (24, 0.073), (25, 0.004), (26, 0.008), (27, 0.02), (28, -0.019), (29, 0.0), (30, -0.066), (31, 0.034), (32, -0.027), (33, -0.02), (34, 0.026), (35, 0.019), (36, 0.013), (37, 0.018), (38, 0.013), (39, -0.083), (40, -0.033), (41, -0.021), (42, -0.043), (43, 0.065), (44, 0.021), (45, 0.003), (46, 0.016), (47, -0.026), (48, -0.023), (49, -0.019)]
simIndex simValue paperId paperTitle
same-paper 1 0.9166888 417 cvpr-2013-Subcategory-Aware Object Classification
Author: Jian Dong, Wei Xia, Qiang Chen, Jianshi Feng, Zhongyang Huang, Shuicheng Yan
Abstract: In this paper, we introduce a subcategory-aware object classification framework to boost category level object classification performance. Motivated by the observation of considerable intra-class diversities and inter-class ambiguities in many current object classification datasets, we explicitly split data into subcategories by ambiguity guided subcategory mining. We then train an individual model for each subcategory rather than attempt to represent an object category with a monolithic model. More specifically, we build the instance affinity graph by combining both intra-class similarity and inter-class ambiguity. Visual subcategories, which correspond to the dense subgraphs, are detected by the graph shift algorithm and seamlessly integrated into the state-of-the-art detection assisted classification framework. Finally, the responses from subcategory models are aggregated by subcategory-aware kernel regression. The extensive experiments over the PASCAL VOC 2007 and PASCAL VOC 2010 databases show the state-of-the-art performance from our framework.
Author: Xiaolong Wang, Liang Lin, Lichao Huang, Shuicheng Yan
Abstract: This paper proposes a reconfigurable model to recognize and detect multiclass (or multiview) objects with large variation in appearance. Compared with well acknowledged hierarchical models, we study two advanced capabilities in hierarchy for object modeling: (i) "switch" variables (i.e., or-nodes) for specifying alternative compositions, and (ii) making local classifiers (i.e., leaf-nodes) shared among different classes. These capabilities enable us to account well for structural variabilities while keeping the model compact. Our model, in the form of an And-Or Graph, comprises four layers: a batch of leaf-nodes with collaborative edges at the bottom for localizing object parts; the or-nodes over the bottom layer to activate their children leaf-nodes; the and-nodes to classify objects as a whole; and one root-node on the top for switching multiclass classification, which is also an or-node. For model training, we present an EM-type algorithm, namely dynamical structural optimization (DSO), to iteratively determine the structural configuration (e.g., leaf-node generation associated with their parent or-nodes and shared across other classes), along with optimizing multi-layer parameters. The proposed method is validated on challenging databases, e.g., PASCAL VOC2007 and UIUCPeople, and it achieves state-of-the-art performance.
3 0.73397607 247 cvpr-2013-Learning Class-to-Image Distance with Object Matchings
Author: Guang-Tong Zhou, Tian Lan, Weilong Yang, Greg Mori
Abstract: We conduct image classification by learning a class-to-image distance function that matches objects. The set of objects in training images for an image class is treated as a collage. When presented with a test image, the best matching between this collage of training image objects and those in the test image is found. We validate the efficacy of the proposed model on the PASCAL 07 and SUN 09 datasets, showing that our model is effective for object classification and scene classification tasks. State-of-the-art image classification results are obtained, and qualitative results demonstrate that objects can be accurately matched.
4 0.7213428 144 cvpr-2013-Efficient Maximum Appearance Search for Large-Scale Object Detection
Author: Qiang Chen, Zheng Song, Rogerio Feris, Ankur Datta, Liangliang Cao, Zhongyang Huang, Shuicheng Yan
Abstract: In recent years, efficiency of large-scale object detection has arisen as an important topic due to the exponential growth in the size of benchmark object detection datasets. Most current object detection methods focus on improving accuracy of large-scale object detection with efficiency being an afterthought. In this paper, we present the Efficient Maximum Appearance Search (EMAS) model which is an order of magnitude faster than the existing state-of-the-art large-scale object detection approaches, while maintaining comparable accuracy. Our EMAS model consists of representing an image as an ensemble of densely sampled feature points with the proposed Pointwise Fisher Vector encoding method, so that the learnt discriminative scoring function can be applied locally. Consequently, the object detection problem is transformed into searching an image sub-area for maximum local appearance probability, thereby making EMAS an order of magnitude faster than the traditional detection methods. In addition, the proposed model is also suitable for incorporating global context at a negligible extra computational cost. EMAS can also incorporate fusion of multiple features, which greatly improves its performance in detecting multiple object categories. Our experiments show that the proposed algorithm can perform detection of 1000 object classes in less than one minute per image on the ImageNet ILSVRC2012 dataset and for 107 object classes in less than 5 seconds per image for the SUN09 dataset using a single CPU.
5 0.71305579 364 cvpr-2013-Robust Object Co-detection
Author: Xin Guo, Dong Liu, Brendan Jou, Mojun Zhu, Anni Cai, Shih-Fu Chang
Abstract: Object co-detection aims at simultaneous detection of objects of the same category from a pool of related images by exploiting consistent visual patterns present in candidate objects in the images. The related image set may contain a mixture of annotated objects and candidate objects generated by automatic detectors. Co-detection differs from the conventional object detection paradigm in which detection over each test image is determined one-by-one independently without taking advantage of common patterns in the data pool. In this paper, we propose a novel, robust approach to dramatically enhance co-detection by extracting a shared low-rank representation of the object instances in multiple feature spaces. The idea is analogous to that of the well-known Robust PCA [28], but has not been explored in object co-detection so far. The representation is based on a linear reconstruction over the entire data set and the low-rank approach enables effective removal of noisy and outlier samples. The extracted low-rank representation can be used to detect the target objects by spectral clustering. Extensive experiments over diverse benchmark datasets demonstrate consistent and significant performance gains of the proposed method over the state-of-the-art object co-detection method and the generic object detection methods without co-detection formulations.
6 0.71277958 67 cvpr-2013-Blocks That Shout: Distinctive Parts for Scene Classification
7 0.68350607 183 cvpr-2013-GRASP Recurring Patterns from a Single View
8 0.68147922 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models
9 0.67539537 416 cvpr-2013-Studying Relationships between Human Gaze, Description, and Computer Vision
10 0.65875542 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
11 0.64704007 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
12 0.64075059 136 cvpr-2013-Discriminatively Trained And-Or Tree Models for Object Detection
13 0.63349926 239 cvpr-2013-Kernel Null Space Methods for Novelty Detection
14 0.62716812 134 cvpr-2013-Discriminative Sub-categorization
15 0.61579674 371 cvpr-2013-SCaLE: Supervised and Cascaded Laplacian Eigenmaps for Visual Object Recognition Based on Nearest Neighbors
16 0.61492044 30 cvpr-2013-Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses
17 0.61403114 382 cvpr-2013-Scene Text Recognition Using Part-Based Tree-Structured Character Detection
18 0.6110841 403 cvpr-2013-Sparse Output Coding for Large-Scale Visual Recognition
19 0.60577369 325 cvpr-2013-Part Discovery from Partial Correspondence
20 0.60427088 163 cvpr-2013-Fast, Accurate Detection of 100,000 Object Classes on a Single Machine
topicId topicWeight
[(10, 0.099), (16, 0.018), (26, 0.075), (28, 0.014), (33, 0.232), (50, 0.206), (65, 0.011), (67, 0.077), (69, 0.084), (80, 0.013), (87, 0.07)]
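The topic list above is a sparse vector of (topicId, topicWeight) pairs for this paper. The page does not state how the simValue column in the following table is derived from such vectors, so the snippet below is only a minimal sketch under an assumption: it compares two sparse topic vectors with cosine similarity. The function name `cosine_similarity` and the `other_paper` weights are hypothetical and used purely for illustration. Note that the same-paper simValue listed below is not 1.0, which suggests the page uses a different measure or a different representation than this sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {topicId: weight} dicts.

    Assumption: this is an illustrative measure, not the one used by the page.
    """
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Topic weights for this paper, copied from the list above.
this_paper = dict([
    (10, 0.099), (16, 0.018), (26, 0.075), (28, 0.014), (33, 0.232),
    (50, 0.206), (65, 0.011), (67, 0.077), (69, 0.084), (80, 0.013),
    (87, 0.07),
])

# Hypothetical weights for another paper (made up for this example).
other_paper = {12: 0.2, 33: 0.3, 50: 0.1}

print(cosine_similarity(this_paper, other_paper))
```

Storing the vectors as dictionaries keeps the comparison proportional to the number of non-zero topics rather than the total number of topics.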
simIndex simValue paperId paperTitle
1 0.84353244 8 cvpr-2013-A Fast Approximate AIB Algorithm for Distributional Word Clustering
Author: Lei Wang, Jianjia Zhang, Luping Zhou, Wanqing Li
Abstract: Distributional word clustering merges the words having similar probability distributions to attain reliable parameter estimation, compact classification models and even better classification performance. Agglomerative Information Bottleneck (AIB) is one of the typical word clustering algorithms and has been applied to both traditional text classification and recent image recognition. Although enjoying theoretical elegance, AIB has one main issue with its computational efficiency, especially when clustering a large number of words. Different from existing solutions to this issue, we analyze the characteristics of its objective function, the loss of mutual information, and show that by merely using the ratio of word-class joint probabilities of each word, good candidate word pairs for merging can be easily identified. Based on this finding, we propose a fast approximate AIB algorithm and show that it can significantly improve the computational efficiency of AIB while well maintaining or even slightly increasing its classification performance. Experimental study on both text and image classification benchmark data sets shows that our algorithm can achieve more than 100 times speedup on large real data sets over the state-of-the-art method.
same-paper 2 0.84287274 417 cvpr-2013-Subcategory-Aware Object Classification
Author: Jian Dong, Wei Xia, Qiang Chen, Jianshi Feng, Zhongyang Huang, Shuicheng Yan
Abstract: In this paper, we introduce a subcategory-aware object classification framework to boost category level object classification performance. Motivated by the observation of considerable intra-class diversities and inter-class ambiguities in many current object classification datasets, we explicitly split data into subcategories by ambiguity guided subcategory mining. We then train an individual model for each subcategory rather than attempt to represent an object category with a monolithic model. More specifically, we build the instance affinity graph by combining both intra-class similarity and inter-class ambiguity. Visual subcategories, which correspond to the dense subgraphs, are detected by the graph shift algorithm and seamlessly integrated into the state-of-the-art detection assisted classification framework. Finally, the responses from subcategory models are aggregated by subcategory-aware kernel regression. The extensive experiments over the PASCAL VOC 2007 and PASCAL VOC 2010 databases show the state-of-the-art performance from our framework.
3 0.82164454 243 cvpr-2013-Large-Scale Video Summarization Using Web-Image Priors
Author: Aditya Khosla, Raffay Hamid, Chih-Jen Lin, Neel Sundaresan
Abstract: Given the enormous growth in user-generated videos, it is becoming increasingly important to be able to navigate them efficiently. As these videos are generally of poor quality, summarization methods designed for well-produced videos do not generalize to them. To address this challenge, we propose to use web-images as a prior to facilitate summarization of user-generated videos. Our main intuition is that people tend to take pictures of objects to capture them in a maximally informative way. Such images could therefore be used as prior information to summarize videos containing a similar set of objects. In this work, we apply our novel insight to develop a summarization algorithm that uses the web-image based prior information in an unsupervised manner. Moreover, to automatically evaluate summarization algorithms on a large scale, we propose a framework that relies on multiple summaries obtained through crowdsourcing. We demonstrate the effectiveness of our evaluation framework by comparing its performance to that of multiple human evaluators. Finally, we present results for our framework tested on hundreds of user-generated videos.
4 0.81936985 336 cvpr-2013-Poselet Key-Framing: A Model for Human Activity Recognition
Author: Michalis Raptis, Leonid Sigal
Abstract: In this paper, we develop a new model for recognizing human actions. An action is modeled as a very sparse sequence of temporally local discriminative keyframes: collections of partial key-poses of the actor(s), depicting key states in the action sequence. We cast the learning of keyframes in a max-margin discriminative framework, where we treat keyframes as latent variables. This allows us to (jointly) learn a set of most discriminative keyframes while also learning the local temporal context between them. Keyframes are encoded using a spatially-localizable poselet-like representation with HoG and BoW components learned from weak annotations; we rely on a structured SVM formulation to align our components and mine for hard negatives to boost localization performance. This results in a model that supports spatio-temporal localization and is insensitive to dropped frames or partial observations. We show classification performance that is competitive with the state of the art on the benchmark UT-Interaction dataset and illustrate that our model outperforms prior methods in an on-line streaming setting.
5 0.81243652 321 cvpr-2013-PDM-ENLOR: Learning Ensemble of Local PDM-Based Regressions
Author: Yen H. Le, Uday Kurkure, Ioannis A. Kakadiaris
Abstract: Statistical shape models, such as Active Shape Models (ASMs), suffer from their inability to represent a large range of variations of a complex shape and to account for the large errors in detection of model points. We propose a novel method (dubbed PDM-ENLOR) that overcomes these limitations by locating each shape model point individually using an ensemble of local regression models and appearance cues from selected model points. Our method first detects a set of reference points which were selected based on their saliency during training. For each model point, an ensemble of regressors is built. From the locations of the detected reference points, each regressor infers a candidate location for that model point using local geometric constraints, encoded by a point distribution model (PDM). The final location of that point is determined as a weighted linear combination, whose coefficients are learnt from the training data, of candidates proposed from its ensemble's component regressors. We use different subsets of reference points as explanatory variables for the component regressors to provide varying degrees of locality for the models in each ensemble. This helps our ensemble model to capture a larger range of shape variations as compared to a single PDM. We demonstrate the advantages of our method on the challenging problem of segmenting gene expression images of mouse brain.
6 0.79531616 413 cvpr-2013-Story-Driven Summarization for Egocentric Video
7 0.7952047 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
8 0.7941125 311 cvpr-2013-Occlusion Patterns for Object Class Detection
9 0.7873776 4 cvpr-2013-3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image
10 0.78706414 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics
11 0.78688508 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
12 0.78669953 292 cvpr-2013-Multi-agent Event Detection: Localization and Role Assignment
13 0.78663379 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
14 0.78662705 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval
15 0.7853452 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
16 0.78495884 172 cvpr-2013-Finding Group Interactions in Social Clutter
17 0.78490698 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models
18 0.78410035 440 cvpr-2013-Tracking People and Their Objects
19 0.7839843 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection
20 0.78396571 414 cvpr-2013-Structure Preserving Object Tracking