cvpr cvpr2013 cvpr2013-183 knowledge-graph by maker-knowledge-mining

183 cvpr-2013-GRASP Recurring Patterns from a Single View


Source: pdf

Author: Jingchen Liu, Yanxi Liu

Abstract: We propose a novel unsupervised method for discovering recurring patterns from a single view. A key contribution of our approach is the formulation and validation of a joint assignment optimization problem where multiple visual words and object instances of a potential recurring pattern are considered simultaneously. The optimization is achieved by a greedy randomized adaptive search procedure (GRASP) with moves specifically designed for fast convergence. We have systematically quantified the performance of our approach under stressed input conditions (missing features, geometric distortions). We demonstrate that our proposed algorithm outperforms state-of-the-art methods for recurring pattern discovery on a diverse set of 400+ real-world and synthesized test images.

Reference: text


Summary: the most important sentences generated by the tf-idf model

sentIndex sentText sentNum sentScore

1 GRASP Recurring Patterns from a Single View Jingchen Liu1 Yanxi Liu1,2 1 Computer Science and Engineering, 2 Electrical Engineering, The Pennsylvania State University, University Park, PA 16802, USA {jingchen, yanxi}@cse.psu.edu [sent-1, score-0.105]

2 Abstract: We propose a novel unsupervised method for discovering recurring patterns from a single view. [sent-3, score-1.024]

3 A key contribution of our approach is the formulation and validation of a joint assignment optimization problem where multiple visual words and object instances of a potential recurring pattern are considered simultaneously. [sent-4, score-1.281]

4 The optimization is achieved by a greedy randomized adaptive search procedure (GRASP) with moves specifically designed for fast convergence. [sent-5, score-0.133]
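
For readers unfamiliar with GRASP, the sketch below is a minimal, generic skeleton of the metaheuristic (greedy randomized construction followed by local search, keeping the best solution found). The construct, local_search, and score callables are hypothetical placeholders: the paper's actual moves operate on the word-object assignment and are not reproduced in this summary.

import random

def grasp(construct, local_search, score, iters=100, seed=0):
    """Generic GRASP loop: repeat (greedy randomized construction,
    then local search) and keep the best-scoring solution found."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(iters):
        sol = construct(rng)     # randomized greedy build (restricted candidate list)
        sol = local_search(sol)  # improving moves until a local optimum
        s = score(sol)
        if s > best_score:
            best, best_score = sol, s
    return best, best_score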

5 We have systematically quantified the performance of our approach under stressed input conditions (missing features, geometric distortions). [sent-6, score-0.079]

6 We demonstrate that our proposed algorithm outperforms state-of-the-art methods for recurring pattern discovery on a diverse set of 400+ real-world and synthesized test images. [sent-7, score-1.102]

7 Introduction: Similar yet non-identical objects, such as animals in a herd, cars on the street, faces in a crowd, or goods on a supermarket shelf, are ubiquitous. [sent-9, score-0.038]

8 There has been a surge of interest in unsupervised visual perception of such near-identical objects [1, 2, 3, 4, 5, 6, 7, 8], echoing an observation that much of our understanding of the world is based on the perception and recognition of shared or repeated structures [9]. [sent-10, score-0.244]

9 To capture the recurring nature within such patterns, we use the term recurring pattern to refer to the ensemble of multiple instances of a common visual object, or object for short, which may or may not correspond to a complete physical object. [sent-11, score-1.147]

10 As shown in Figure 1, each object of a recurring pattern is a geometric composition (green arcs) of visual words (distinct red iconic shapes), where partial matching among the objects is permitted. [sent-12, score-1.217]

11 The recognition of recurring patterns has applications in effective image segmentation [4], compression and super-resolution [2], retrieval [5] and organization of unlabeled data [7]. [sent-13, score-1.001]

12 More fundamentally, a recurring pattern is a domain-independent representation for semantically meaningful mid-level grouping. Figure 1: (a) a 6-instance recurring pattern; (b) an 8-instance recurring pattern. [sent-14, score-1.336]

13 Unsupervised discovery of recurring patterns in real images by our proposed algorithm, where partial matching and low visual word recall rates (75% for (a), 71% for (b)) are allowed. [sent-15, score-1.307]

14 Two classic approaches for recurring pattern detection are: (A) pairwise visual-word matching, which matches pairs of visual words across all objects [7]; and (B) pairwise object matching, which matches feature point correspondences between a pair of objects [12, 5, 4]. [sent-17, score-1.332]

15 (2) Visual word-pair matching also suffers from missing feature points (low visual word recall rate), as shown by our quantitative evaluations (Section 4). [sent-19, score-0.317]

16 (3) Whether it is better to match object-pairs or visual word-pairs is unknown in advance, and due to the lack of a global decision mechanism, current pairwise-matching systems do not afford flexible and adaptive switching between the two. [sent-20, score-0.139]

17 We are thus motivated to propose an alternative joint-optimization framework for recurring pattern discovery by matching along both visual word and object dimensions simultaneously (Fig. [sent-21, score-1.325]

18 Related Work: Recurring pattern discovery has been referred to in the literature as common visual pattern discovery [14, 5], co-recognition/segmentation of objects [15, 16, 4], and high-order structural semantics learning [7]. [sent-25, score-0.496]

19 [15, 17, 18] achieve unsupervised detection/segmentation of two objects in two separate images. [sent-26, score-0.07]

20 Yuan and Wu [14] use spatial random partitioning to detect object pair(s) from one or a pair of images; Cho et al. [sent-27, score-0.018]

21 formulate the same problem as correspondence association solved by MCMC exploration [16] and graph matching [3], respectively. [sent-28, score-0.143]

22 [5] adopts graph matching to detect multiple recurring patterns between two images. [sent-29, score-1.055]

23 To detect more than two recurring instances, Cho et al. [sent-30, score-0.925]

24 generalize feature correspondence association under a many-to-many constraint and perform multiple object matching using agglomerative clustering [19] and MCMC association [4]. [sent-31, score-0.191]

25 [7] use the approach of pairwise visual word-matching, while assuming that visual words can be detected on all recurring instances (i.e. [sent-34, score-1.161]

26 Our method differs from previous work in two significant ways: (1) it solves a simultaneous visual word-object assignment problem; and (2) it explicitly and effectively deals with missing/spurious feature points in recurring patterns (feature recall rate from an image can be lower than 100%). [sent-37, score-1.218]

27 Another line of related work is unsupervised category discovery, e.g. [sent-38, score-0.043]

28 Our Approach: We start with a formalization of the concept of a recurring pattern and its components (Fig. [sent-44, score-1.051]

29 The key technical steps are the selection and grouping of representative feature points into key visual words and the exploration of the structural consistency among their topology/geometry by using GRASP optimization to discover recurring patterns. [sent-47, score-1.248]

30 Formalization of Recurring Patterns: We define a recurring pattern to have at least two visual objects. [sent-50, score-1.061]

31 Likewise, each object of a recurring pattern is required to have at least two distinct visual words. [sent-51, score-1.082]

32 Thus, the smallest recurring pattern is conceptually a 4-tuple structure satisfying certain affinity constraints (Figure 2 (a)). [sent-52, score-1.083]

33 The visual word distinctiveness requirement forces each object of a recurring pattern to have a compact representation (no nested recurrence of visual words within each object), thus qualifying it to serve as a structural-primitive for recurring pattern discovery. [sent-53, score-2.456]

34 More importantly, this definition ensures the uniqueness of each recurring pattern while maximizing the number of object instances. [sent-54, score-1.015]

35 Mathematically, we construct a recurring pattern Ω as a 2D feature-assignment matrix where each row corresponds to a visual word and each column corresponds to a visual object (Figure 2 (b)); that is, ΩM,N(m, n) = fi, where fi corresponds to a feature point, m = 1, ..., M, [sent-55, score-1.395]

36 n = 1, ..., N, and M and N are the number of visual words and objects, respectively. [sent-61, score-0.131]

37 ΩM,N(m, n) = 0 indicates that the corresponding feature point is missing. [sent-62, score-0.031]
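
As a concrete illustration (with made-up feature indices, not data from the paper), the sketch below builds a small Ω as a NumPy array and computes the kind of visual-word recall rate quoted for Figure 1:

import numpy as np

# Rows = visual words (m), columns = object instances (n); entries hold
# 1-based feature indices, with 0 marking a missing feature point.
M, N = 3, 4
omega = np.zeros((M, N), dtype=int)
omega[0] = [1, 4, 7, 10]   # word m=1 detected in all four objects
omega[1] = [2, 5, 0, 11]   # word m=2 missing from object n=3
omega[2] = [3, 0, 9, 12]   # word m=3 missing from object n=2

recall = np.count_nonzero(omega) / omega.size  # visual word recall rate
print(f"recall = {recall:.2f}")                # 0.83 for this toy example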

38 Figure 2: (a) two potential objects of a smallest recurring pattern, n1, n2, each of which contains two visual words m1, m2; (b) the 2D feature-assignment matrix, where each row corresponds to a visual word and each column to a visual object. [sent-64, score-1.446]

39 Visual Word Extraction: Given a set of feature points F = {fi | i = 1, ..., K} [sent-67, score-0.058]

40 (e.g. SIFT), a visual word W is a subset of F [sent-72, score-0.049]

41 such that all feature points in W share strong appearance similarity. [sent-75, score-0.058]

42 Let vi be the normalized descriptor of fi, such that ||vi|| = 1. [sent-76, score-0.021]

43 An overview of the proposed method: (a) input image; (b) extracted and clustered feature points with the top 20 clusters color-coded; (c) GRASP optimization framework; (d) automatically discovered recurring pattern after a joint optimization process. [sent-79, score-1.118]

44 Based on Eqn. 1, we define a normalized affinity metric between features fi, fj as A(i, j) = (viT vj − avg{vpT vq}) / std{vpT vq}, p, q = 1, 2, ..., K. [sent-80, score-0.041]
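
In code, this affinity amounts to z-scoring the pairwise dot products of the unit-normalized descriptors; the sketch below follows that reading of the formula (reconstructed from a garbled scan, so treat it as an assumption rather than the authors' exact implementation):

import numpy as np

def affinity_matrix(V):
    """V: K x D array of feature descriptors (one row per feature point).
    Returns the K x K normalized affinity A."""
    V = V / np.linalg.norm(V, axis=1, keepdims=True)  # enforce ||vi|| = 1
    S = V @ V.T                                       # all pairwise vpT vq
    return (S - S.mean()) / S.std()                   # z-score normalization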

45 (2) Starting with an initial assignment of W = {i, j}, where A(i, j) is maximum among all feature pairs in F, we use a forward selection scheme where new feature points fi are sequentially included into W to maximize Eqn. 2. [sent-89, score-0.185]

46 When Eqn. 2 can no longer be increased, the growing of the current W stops and the extraction process then continues on F − W to find the next visual word. [sent-92, score-0.11]

47 Our visual word forward-selection method differs significantly from K-means, in that we only extract inlier subsets of F to form a vocabulary of key visual words for recurring patterns, while ignoring a considerable amount of background noise or outliers. [sent-93, score-1.012]

48 For efficiency, the affinity matrix A can be made sparse by setting A(i, j) = 0 for A(i, j) < τ. [sent-94, score-0.041]

49 In our experiments, we set τ = 2 to remove feature pairs whose distance exceeds 'two-sigma'. [sent-95, score-0.052]
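
A minimal sketch of this extraction loop follows, assuming Eqn. 2 is the mean intra-word affinity (an assumption; the summary does not reproduce the exact objective), with the τ-sparsification applied up front:

import numpy as np

def mean_affinity(A, W):
    """Hypothetical stand-in for Eqn. 2: mean pairwise affinity within W."""
    idx = np.array(sorted(W))
    k = len(idx)
    return A[np.ix_(idx, idx)].sum() / (k * (k - 1))  # diagonal is zero

def extract_visual_word(A, tau=2.0):
    """Greedy forward selection of one visual word from affinity matrix A."""
    A = np.where(A < tau, 0.0, A)   # sparsify: drop sub-threshold pairs
    np.fill_diagonal(A, 0.0)
    i, j = np.unravel_index(np.argmax(A), A.shape)
    W, best = {int(i), int(j)}, float(A[i, j])
    while True:
        cands = [f for f in range(len(A))
                 if f not in W and A[list(W), f].any()]
        if not cands:
            break
        f = max(cands, key=lambda c: mean_affinity(A, W | {c}))
        gain = mean_affinity(A, W | {f})
        if gain <= best:            # Eqn. 2 no longer increases: stop growing
            break
        W.add(f)
        best = gain
    return W                        # caller then continues on F - W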

50 Given the sparsity of A, typically 30 ∼ 200 valid visual words can be extracted from a single image, depending on the image content and resolution. [sent-96, score-0.177]

51 Ideally, different feature points from the same visual word W should be present in the corresponding relative locations of all N objects of a recurring pattern, i.e. [sent-97, score-0.992]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('recurring', 0.907), ('grasp', 0.163), ('word', 0.106), ('discovery', 0.088), ('pattern', 0.088), ('patterns', 0.074), ('yanxi', 0.07), ('visual', 0.066), ('words', 0.065), ('fi', 0.062), ('recurrence', 0.059), ('recur', 0.059), ('formalization', 0.056), ('ern', 0.052), ('exploration', 0.048), ('pat', 0.047), ('assignment', 0.044), ('unsupervised', 0.043), ('mcmc', 0.043), ('affinity', 0.041), ('association', 0.041), ('cho', 0.035), ('tkha', 0.035), ('stressed', 0.035), ('multaneously', 0.035), ('ingchen', 0.035), ('jingchen', 0.035), ('qualifying', 0.035), ('surge', 0.035), ('matching', 0.035), ('stances', 0.033), ('randomized', 0.031), ('feature', 0.031), ('recall', 0.031), ('highorder', 0.031), ('moves', 0.03), ('pairwise', 0.03), ('iof', 0.029), ('iconic', 0.029), ('gau', 0.029), ('obt', 0.028), ('instances', 0.027), ('arcs', 0.027), ('distinctiveness', 0.027), ('shelf', 0.027), ('adaptive', 0.027), ('perception', 0.027), ('points', 0.027), ('objects', 0.027), ('afford', 0.026), ('pennsylvania', 0.026), ('grouping', 0.025), ('continues', 0.025), ('sub', 0.025), ('quantified', 0.025), ('smallest', 0.024), ('greedy', 0.024), ('agglomerative', 0.024), ('corresponds', 0.023), ('tlh', 0.023), ('formal', 0.023), ('nested', 0.023), ('nco', 0.023), ('conceptually', 0.023), ('joint', 0.023), ('optimization', 0.021), ('missing', 0.021), ('fundamentally', 0.021), ('likewise', 0.021), ('exceeds', 0.021), ('distinct', 0.021), ('matches', 0.021), ('adopts', 0.021), ('gm', 0.021), ('potential', 0.021), ('vi', 0.021), ('inlier', 0.02), ('switching', 0.02), ('sis', 0.02), ('structural', 0.02), ('engineering', 0.02), ('organization', 0.02), ('uniqueness', 0.02), ('wa', 0.019), ('crowd', 0.019), ('treatment', 0.019), ('forces', 0.019), ('key', 0.019), ('systematically', 0.019), ('stops', 0.019), ('animals', 0.019), ('rate', 0.019), ('coded', 0.019), ('mathematically', 0.019), ('world', 0.019), ('correspondence', 0.019), ('classic', 0.019), ('distortions', 0.019), ('deals', 0.019), ('detect', 0.018)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9999997 183 cvpr-2013-GRASP Recurring Patterns from a Single View

Author: Jingchen Liu, Yanxi Liu

Abstract: We propose a novel unsupervised method for discovering recurring patterns from a single view. A key contribution of our approach is the formulation and validation of a joint assignment optimization problem where multiple visual words and object instances of a potential recurring pattern are considered simultaneously. The optimization is achieved by a greedy randomized adaptive search procedure (GRASP) with moves specifically designed for fast convergence. We have systematically quantified the performance of our approach under stressed input conditions (missing features, geometric distortions). We demonstrate that our proposed algorithm outperforms state-of-the-art methods for recurring pattern discovery on a diverse set of 400+ real-world and synthesized test images.

2 0.072591007 434 cvpr-2013-Topical Video Object Discovery from Key Frames by Modeling Word Co-occurrence Prior

Author: Gangqiang Zhao, Junsong Yuan, Gang Hua

Abstract: A topical video object refers to an object that is frequently highlighted in a video. It could be, e.g., the product logo and the leading actor/actress in a TV commercial. We propose a topic model that incorporates a word co-occurrence prior for efficient discovery of topical video objects from a set of key frames. Previous work using topic models, such as Latent Dirichlet Allocation (LDA), for video object discovery often takes a bag-of-visual-words representation, which ignores important co-occurrence information among the local features. We show that such data-driven co-occurrence information from bottom-up can conveniently be incorporated in LDA with a Gaussian Markov prior, which combines top-down probabilistic topic modeling with bottom-up priors in a unified model. Our experiments on challenging videos demonstrate that the proposed approach can discover different types of topical objects despite variations in scale, view-point, color and lighting changes, or even partial occlusions. The efficacy of the co-occurrence prior is clearly demonstrated when comparing with topic models without such priors.

3 0.069390908 456 cvpr-2013-Visual Place Recognition with Repetitive Structures

Author: Akihiko Torii, Josef Sivic, Tomáš Pajdla, Masatoshi Okutomi

Abstract: Repeated structures such as building facades, fences or road markings often represent a significant challenge for place recognition. Repeated structures are notoriously hard for establishing correspondences using multi-view geometry. Even more importantly, they violate the feature independence assumed in the bag-of-visual-words representation, which often leads to over-counting evidence and significant degradation of retrieval performance. In this work we show that repeated structures are not a nuisance but, when appropriately represented, they form an important distinguishing feature for many places. We describe a representation of repeated structures suitable for scalable retrieval. It is based on robust detection of repeated image structures and a simple modification of weights in the bag-of-visual-word model. Place recognition results are shown on datasets of street-level imagery from Pittsburgh and San Francisco demonstrating significant gains in recognition performance compared to the standard bag-of-visual-words baseline and more recently proposed burstiness weighting.

4 0.056974262 200 cvpr-2013-Harvesting Mid-level Visual Concepts from Large-Scale Internet Images

Author: Quannan Li, Jiajun Wu, Zhuowen Tu

Abstract: Obtaining effective mid-level representations has become an increasingly important task in computer vision. In this paper, we propose a fully automatic algorithm which harvests visual concepts from a large number of Internet images (more than a quarter of a million) using text-based queries. Existing approaches to visual concept learning from Internet images either rely on strong supervision with detailed manual annotations or learn image-level classifiers only. Here, we take advantage of having massive well-organized Google and Bing image data; visual concepts (around 14,000) are automatically exploited from images using word-based queries. Using the learned visual concepts, we show state-of-the-art performances on a variety of benchmark datasets, which demonstrate the effectiveness of the learned mid-level representations: being able to generalize well to general natural images. Our method shows significant improvement over the competing systems in image classification, including those with strong supervision.

5 0.056468762 8 cvpr-2013-A Fast Approximate AIB Algorithm for Distributional Word Clustering

Author: Lei Wang, Jianjia Zhang, Luping Zhou, Wanqing Li

Abstract: Distributional word clustering merges the words having similar probability distributions to attain reliable parameter estimation, compact classification models and even better classification performance. Agglomerative Information Bottleneck (AIB) is one of the typical word clustering algorithms and has been applied to both traditional text classification and recent image recognition. Although enjoying theoretical elegance, AIB has one main issue on its computational efficiency, especially when clustering a large number of words. Different from existing solutions to this issue, we analyze the characteristics of its objective function, the loss of mutual information, and show that by merely using the ratio of word-class joint probabilities of each word, good candidate word pairs for merging can be easily identified. Based on this finding, we propose a fast approximate AIB algorithm and show that it can significantly improve the computational efficiency of AIB while well maintaining or even slightly increasing its classification performance. Experimental study on both text and image classification benchmark data sets shows that our algorithm can achieve more than 100 times speedup on large real data sets over the state-of-the-art method.

6 0.042528447 165 cvpr-2013-Fast Energy Minimization Using Learned State Filters

7 0.041478373 450 cvpr-2013-Unsupervised Joint Object Discovery and Segmentation in Internet Images

8 0.041024439 234 cvpr-2013-Joint Spectral Correspondence for Disparate Image Matching

9 0.040747017 80 cvpr-2013-Category Modeling from Just a Single Labeling: Use Depth Information to Guide the Learning of 2D Models

10 0.037762471 5 cvpr-2013-A Bayesian Approach to Multimodal Visual Dictionary Learning

11 0.037668254 325 cvpr-2013-Part Discovery from Partial Correspondence

12 0.037127934 53 cvpr-2013-BFO Meets HOG: Feature Extraction Based on Histograms of Oriented p.d.f. Gradients for Image Classification

13 0.035117894 301 cvpr-2013-Multi-target Tracking by Rank-1 Tensor Approximation

14 0.03511022 20 cvpr-2013-A New Model and Simple Algorithms for Multi-label Mumford-Shah Problems

15 0.035016712 138 cvpr-2013-Efficient 2D-to-3D Correspondence Filtering for Scalable 3D Object Recognition

16 0.034846485 106 cvpr-2013-Deformable Graph Matching

17 0.032936007 382 cvpr-2013-Scene Text Recognition Using Part-Based Tree-Structured Character Detection

18 0.032135736 235 cvpr-2013-Jointly Aligning and Segmenting Multiple Web Photo Streams for the Inference of Collective Photo Storylines

19 0.031761691 214 cvpr-2013-Image Understanding from Experts' Eyes by Modeling Perceptual Skill of Diagnostic Reasoning Processes

20 0.031632446 311 cvpr-2013-Occlusion Patterns for Object Class Detection


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.081), (1, -0.011), (2, 0.003), (3, 0.006), (4, 0.025), (5, 0.001), (6, -0.013), (7, -0.018), (8, -0.017), (9, -0.002), (10, 0.005), (11, 0.011), (12, 0.003), (13, -0.014), (14, -0.013), (15, -0.049), (16, 0.007), (17, 0.014), (18, 0.051), (19, -0.021), (20, 0.047), (21, -0.011), (22, 0.012), (23, -0.025), (24, 0.025), (25, 0.002), (26, 0.011), (27, 0.026), (28, -0.024), (29, -0.035), (30, 0.013), (31, 0.019), (32, -0.031), (33, 0.026), (34, 0.04), (35, 0.031), (36, -0.009), (37, 0.045), (38, 0.012), (39, -0.057), (40, -0.039), (41, -0.03), (42, -0.005), (43, 0.025), (44, -0.025), (45, 0.011), (46, -0.016), (47, -0.039), (48, -0.027), (49, -0.04)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.91219068 183 cvpr-2013-GRASP Recurring Patterns from a Single View

Author: Jingchen Liu, Yanxi Liu

Abstract: We propose a novel unsupervised method for discovering recurring patterns from a single view. A key contribution of our approach is the formulation and validation of a joint assignment optimization problem where multiple visual words and object instances of a potential recurring pattern are considered simultaneously. The optimization is achieved by a greedy randomized adaptive search procedure (GRASP) with moves specifically designed for fast convergence. We have systematically quantified the performance of our approach under stressed input conditions (missing features, geometric distortions). We demonstrate that our proposed algorithm outperforms state-of-the-art methods for recurring pattern discovery on a diverse set of 400+ real-world and synthesized test images.

2 0.7776072 8 cvpr-2013-A Fast Approximate AIB Algorithm for Distributional Word Clustering

Author: Lei Wang, Jianjia Zhang, Luping Zhou, Wanqing Li

Abstract: Distributional word clustering merges the words having similar probability distributions to attain reliable parameter estimation, compact classification models and even better classification performance. Agglomerative Information Bottleneck (AIB) is one of the typical word clustering algorithms and has been applied to both traditional text classification and recent image recognition. Although enjoying theoretical elegance, AIB has one main issue on its computational efficiency, especially when clustering a large number of words. Different from existing solutions to this issue, we analyze the characteristics of its objective function, the loss of mutual information, and show that by merely using the ratio of word-class joint probabilities of each word, good candidate word pairs for merging can be easily identified. Based on this finding, we propose a fast approximate AIB algorithm and show that it can significantly improve the computational efficiency of AIB while well maintaining or even slightly increasing its classification performance. Experimental study on both text and image classification benchmark data sets shows that our algorithm can achieve more than 100 times speedup on large real data sets over the state-of-the-art method.

3 0.71314692 275 cvpr-2013-Lp-Norm IDF for Large Scale Image Search

Author: Liang Zheng, Shengjin Wang, Ziqiong Liu, Qi Tian

Abstract: The Inverse Document Frequency (IDF) is prevalently utilized in the Bag-of-Words based image search. The basic idea is to assign less weight to terms with high frequency, and vice versa. However, the estimation of visual word frequency is coarse and heuristic. Therefore, the effectiveness of the conventional IDF routine is marginal, and far from optimal. To tackle this problem, this paper introduces a novel IDF expression by the use of the Lp-norm pooling technique. Carefully designed, the proposed IDF takes into account the term frequency, document frequency, the complexity of images, as well as the codebook information. Optimizing the IDF function towards optimal balancing between TF and IDF weights yields the so-called Lp-norm IDF (pIDF). We show that the conventional IDF is a special case of our generalized version, and two novel IDFs, i.e. the average IDF and the max IDF, can also be derived from our formula. Further, by accounting for the term frequency in each image, the proposed Lp-norm IDF helps to alleviate the visual word burstiness phenomenon. Our method is evaluated through extensive experiments on three benchmark datasets (Oxford 5K, Paris 6K and Flickr 1M). We report a performance improvement of as large as 27.1% over the baseline approach. Moreover, since the Lp-norm IDF is computed offline, no extra computation or memory cost is introduced to the system at all.

4 0.67803347 434 cvpr-2013-Topical Video Object Discovery from Key Frames by Modeling Word Co-occurrence Prior

Author: Gangqiang Zhao, Junsong Yuan, Gang Hua

Abstract: A topical video object refers to an object that is frequently highlighted in a video. It could be, e.g., the product logo and the leading actor/actress in a TV commercial. We propose a topic model that incorporates a word co-occurrence prior for efficient discovery of topical video objects from a set of key frames. Previous work using topic models, such as Latent Dirichlet Allocation (LDA), for video object discovery often takes a bag-of-visual-words representation, which ignores important co-occurrence information among the local features. We show that such data-driven co-occurrence information from bottom-up can conveniently be incorporated in LDA with a Gaussian Markov prior, which combines top-down probabilistic topic modeling with bottom-up priors in a unified model. Our experiments on challenging videos demonstrate that the proposed approach can discover different types of topical objects despite variations in scale, view-point, color and lighting changes, or even partial occlusions. The efficacy of the co-occurrence prior is clearly demonstrated when comparing with topic models without such priors.

5 0.67043358 456 cvpr-2013-Visual Place Recognition with Repetitive Structures

Author: Akihiko Torii, Josef Sivic, Tomáš Pajdla, Masatoshi Okutomi

Abstract: Repeated structures such as building facades, fences or road markings often represent a significant challenge for place recognition. Repeated structures are notoriously hard for establishing correspondences using multi-view geometry. Even more importantly, they violate the feature independence assumed in the bag-of-visual-words representation, which often leads to over-counting evidence and significant degradation of retrieval performance. In this work we show that repeated structures are not a nuisance but, when appropriately represented, they form an important distinguishing feature for many places. We describe a representation of repeated structures suitable for scalable retrieval. It is based on robust detection of repeated image structures and a simple modification of weights in the bag-of-visual-word model. Place recognition results are shown on datasets of street-level imagery from Pittsburgh and San Francisco demonstrating significant gains in recognition performance compared to the standard bag-of-visual-words baseline and more recently proposed burstiness weighting.

6 0.66360235 53 cvpr-2013-BFO Meets HOG: Feature Extraction Based on Histograms of Oriented p.d.f. Gradients for Image Classification

7 0.65825945 200 cvpr-2013-Harvesting Mid-level Visual Concepts from Large-Scale Internet Images

8 0.58849806 382 cvpr-2013-Scene Text Recognition Using Part-Based Tree-Structured Character Detection

9 0.58380187 78 cvpr-2013-Capturing Layers in Image Collections with Componential Models: From the Layered Epitome to the Componential Counting Grid

10 0.58236426 417 cvpr-2013-Subcategory-Aware Object Classification

11 0.57573235 130 cvpr-2013-Discriminative Color Descriptors

12 0.56700295 210 cvpr-2013-Illumination Estimation Based on Bilayer Sparse Coding

13 0.55964214 178 cvpr-2013-From Local Similarity to Global Coding: An Application to Image Classification

14 0.55874932 11 cvpr-2013-A Genetic Algorithm-Based Solver for Very Large Jigsaw Puzzles

15 0.54641342 157 cvpr-2013-Exploring Implicit Image Statistics for Visual Representativeness Modeling

16 0.53339458 450 cvpr-2013-Unsupervised Joint Object Discovery and Segmentation in Internet Images

17 0.52860802 34 cvpr-2013-Adaptive Active Learning for Image Classification

18 0.52173311 129 cvpr-2013-Discriminative Brain Effective Connectivity Analysis for Alzheimer's Disease: A Kernel Learning Approach upon Sparse Gaussian Bayesian Network

19 0.51970536 28 cvpr-2013-A Thousand Frames in Just a Few Words: Lingual Description of Videos through Latent Topics and Sparse Object Stitching

20 0.51760709 442 cvpr-2013-Transfer Sparse Coding for Robust Image Representation


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(10, 0.1), (16, 0.023), (26, 0.031), (28, 0.015), (33, 0.247), (67, 0.067), (69, 0.052), (80, 0.292), (87, 0.049)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.83419394 272 cvpr-2013-Long-Term Occupancy Analysis Using Graph-Based Optimisation in Thermal Imagery

Author: Rikke Gade, Anders Jørgensen, Thomas B. Moeslund

Abstract: This paper presents a robust occupancy analysis system for thermal imaging. Reliable detection of people is very hard in crowded scenes, due to occlusions and segmentation problems. We therefore propose a framework that optimises the occupancy analysis over long periods by including information on the transition in occupancy, when people enter or leave the monitored area. In stable periods, with no activity close to the borders, people are detected and counted, which contributes to a weighted histogram. When activity close to the border is detected, local tracking is applied in order to identify a crossing. After a full sequence, the number of people during all periods is estimated using a probabilistic graph search optimisation. The system is tested on a total of 51,000 frames, captured in sports arenas. The mean error for a 30-minute period containing 3-13 people is 4.44%, which is half the error percentage obtained by detection only, and better than the results of comparable work. The framework is also tested on a publicly available dataset from an outdoor scene, which proves the generality of the method.

same-paper 2 0.80801702 183 cvpr-2013-GRASP Recurring Patterns from a Single View

Author: Jingchen Liu, Yanxi Liu

Abstract: We propose a novel unsupervised method for discovering recurring patterns from a single view. A key contribution of our approach is the formulation and validation of a joint assignment optimization problem where multiple visual words and object instances of a potential recurring pattern are considered simultaneously. The optimization is achieved by a greedy randomized adaptive search procedure (GRASP) with moves specifically designed for fast convergence. We have systematically quantified the performance of our approach under stressed input conditions (missing features, geometric distortions). We demonstrate that our proposed algorithm outperforms state-of-the-art methods for recurring pattern discovery on a diverse set of 400+ real-world and synthesized test images.

3 0.79280317 335 cvpr-2013-Poselet Conditioned Pictorial Structures

Author: Leonid Pishchulin, Mykhaylo Andriluka, Peter Gehler, Bernt Schiele

Abstract: In this paper we consider the challenging problem of articulated human pose estimation in still images. We observe that despite high variability of the body articulations, human motions and activities often simultaneously constrain the positions of multiple body parts. Modelling such higher order part dependencies seemingly comes at a cost of more expensive inference, which resulted in their limited use in state-of-the-art methods. In this paper we propose a model that incorporates higher order part dependencies while remaining efficient. We achieve this by defining a conditional model in which all body parts are connected a-priori, but which becomes a tractable tree-structured pictorial structures model once the image observations are available. In order to derive a set of conditioning variables we rely on the poselet-based features that have been shown to be effective for people detection but have so far found limited application for articulated human pose estimation. We demonstrate the effectiveness of our approach on three publicly available pose estimation benchmarks improving or being on-par with state of the art in each case.

4 0.77994877 153 cvpr-2013-Expanded Parts Model for Human Attribute and Action Recognition in Still Images

Author: Gaurav Sharma, Frédéric Jurie, Cordelia Schmid

Abstract: We propose a new model for recognizing human attributes (e.g. wearing a suit, sitting, short hair) and actions (e.g. running, riding a horse) in still images. The proposed model relies on a collection of part templates which are learnt discriminatively to explain specific scale-space locations in the images (in human centric coordinates). It avoids the limitations of highly structured models, which consist of a few (i.e. a mixture of) 'average' templates. To learn our model, we propose an algorithm which automatically mines out parts and learns corresponding discriminative templates with their respective locations from a large number of candidate parts. We validate the method on recent challenging datasets: (i) Willow 7 actions [7], (ii) 27 Human Attributes (HAT) [25], and (iii) Stanford 40 actions [37]. We obtain convincing qualitative and state-of-the-art quantitative results on the three datasets.

5 0.77766466 210 cvpr-2013-Illumination Estimation Based on Bilayer Sparse Coding

Author: Bing Li, Weihua Xiong, Weiming Hu, Houwen Peng

Abstract: Computational color constancy is a very important topic in computer vision and has attracted many researchers' attention. Recently, lots of research has shown the effects of using high level visual content cues for improving illumination estimation. However, nearly all the existing methods are essentially combinational strategies in which the image's content analysis is only used to guide the combination or selection from a variety of individual illumination estimation methods. In this paper, we propose a novel bilayer sparse coding model for illumination estimation that considers image similarity in terms of both low level color distribution and high level image scene content simultaneously. For this purpose, the image's scene content information is integrated with its color distribution to obtain an optimal illumination estimation model. The experimental results on real-world image sets show that our algorithm is superior to some prevailing illumination estimation methods, even better than some combinational methods.

6 0.76350582 273 cvpr-2013-Looking Beyond the Image: Unsupervised Learning for Object Saliency and Detection

7 0.73842812 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video

8 0.73213059 206 cvpr-2013-Human Pose Estimation Using Body Parts Dependent Joint Regressors

9 0.71654612 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation

10 0.71317029 60 cvpr-2013-Beyond Physical Connections: Tree Models in Human Pose Estimation

11 0.71048695 334 cvpr-2013-Pose from Flow and Flow from Pose

12 0.70614511 322 cvpr-2013-PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors

13 0.70542526 28 cvpr-2013-A Thousand Frames in Just a Few Words: Lingual Description of Videos through Latent Topics and Sparse Object Stitching

14 0.70467496 439 cvpr-2013-Tracking Human Pose by Tracking Symmetric Parts

15 0.70457667 89 cvpr-2013-Computationally Efficient Regression on a Dependency Graph for Human Pose Estimation

16 0.70438933 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities

17 0.70290798 207 cvpr-2013-Human Pose Estimation Using a Joint Pixel-wise and Part-wise Formulation

18 0.70288628 459 cvpr-2013-Watching Unlabeled Video Helps Learn New Human Actions from Very Few Labeled Snapshots

19 0.69932628 355 cvpr-2013-Representing Videos Using Mid-level Discriminative Patches

20 0.69850141 120 cvpr-2013-Detecting and Naming Actors in Movies Using Generative Appearance Models