iccv iccv2013 iccv2013-119 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jin Gao, Junliang Xing, Weiming Hu, Steve Maybank
Abstract: Visual tracking has witnessed growing methods in object representation, which is crucial to robust tracking. The dominant mechanism in object representation is using image features encoded in a vector as observations to perform tracking, without considering that an image is intrinsically a matrix, or a 2nd-order tensor. Thus approaches following this mechanism inevitably lose a lot of useful information, and therefore cannot fully exploit the spatial correlations within the 2D image ensembles. In this paper, we address an image as a 2nd-order tensor in its original form, and find a discriminative linear embedding space approximation to the original nonlinear submanifold embedded in the tensor space based on the graph embedding framework. We specially design two graphs for characterizing the intrinsic local geometrical structure of the tensor space, so as to retain more discriminant information when reducing the dimension along certain tensor dimensions. However, spatial correlations within a tensor are not limited to the elements along these dimensions. This means that some part of the discriminant information may not be encoded in the embedding space. We introduce a novel technique called semi-supervised improvement to iteratively adjust the embedding space to compensate for the loss of discriminant information, hence improving the performance of our tracker. Experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracker.
Reference: text
sentIndex sentText sentNum sentScore
1 Visual tracking has witnessed growing methods in object representation, which is crucial to robust tracking. [sent-2, score-0.188]
2 The dominant mechanism in object representation is using image features encoded in a vector as observations to perform tracking, without considering that an image is intrinsically a matrix, or a 2nd-order tensor. [sent-3, score-0.195]
3 Thus approaches following this mechanism inevitably lose a lot of useful information, and therefore cannot fully exploit the spatial correlations within the 2D image ensembles. [sent-4, score-0.184]
4 In this paper, we address an image as a 2nd-order tensor in its original form, and find a discriminative linear embedding space approximation to the original nonlinear submanifold embedded in the tensor space based on the graph embedding framework. [sent-5, score-2.269]
5 We specially design two graphs for characterizing the intrinsic local geometrical structure of the tensor space, so as to retain more discriminant information when reducing the dimension along certain tensor dimensions. [sent-6, score-2.128]
6 However, spatial correlations within a tensor are not limited to the elements along these dimensions. [sent-7, score-0.944]
7 This means that some part of the discriminant information may not be encoded in the embedding space. [sent-8, score-0.485]
8 We introduce a novel technique called semi-supervised improvement to iteratively adjust the embedding space to compensate for the loss of discriminant information, hence improving the performance of our tracker. [sent-9, score-0.604]
9 Introduction Robust visual tracking is an essential component of many practical computer vision applications such as surveillance, vehicle navigation, and human computer interface. [sent-12, score-0.129]
10 Despite much effort resulting in many novel trackers, tracking generic objects remains a challenging problem. [sent-13, score-0.129]
11 partial occlusions, illumination changes, cluttered and moving backgrounds) appearance variations (see a more detailed discussion in [31]). [sent-22, score-0.038]
12 Many tracking systems construct an adaptive appearance model based on the collected image patches in previous frames. [sent-23, score-0.167]
13 This model is used to find the most likely image patch in the current frame. [sent-24, score-0.052]
14 There are many object representation methods proposed for visual tracking. [sent-26, score-0.044]
15 Some tracking approaches [21, 35] adopt holistic gray-scale image-as-vector representation. [sent-27, score-0.129]
16 ℓ1 minimization based visual tracking approaches [17, 18, 3, 34, 33], which exploit the sparse representation of the image patch. [sent-29, score-0.173]
17 [8, 9] construct multiple basic appearance models by sparse principal component analysis (SPCA) of a set of feature templates (e. [sent-31, score-0.038]
18 These representation methods ignore that an image is intrinsically a matrix, or a 2nd-order tensor. [sent-34, score-0.152]
19 In [2, 10] only Haar-like features are used, but great improvements are achieved by novel appearance models. [sent-37, score-0.038]
20 [12, 15] only use HOGs but apply novel appearance models to achieve good results. [sent-39, score-0.038]
21 [1] robustly combine multiple patch votes with each image patch represented by only gray-scale histogram features. [sent-41, score-0.104]
22 These representation methods have their own advantages for their specifically designed appearance models. [sent-42, score-0.082]
23 However, a lot of useful information is missed when extracting features. [sent-43, score-0.075]
24 image-as-matrix representation) can retain much more useful information because the original image data structure is preserved. [sent-46, score-0.046]
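To make the contrast between the two representations concrete, the following is a minimal NumPy sketch, not taken from the paper. It compares a single projection applied to a vectorized patch with a two-sided projection applied to the patch kept as a matrix. The patch size, the reduced dimensions, and the random projection matrices `P`, `U`, `V` are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

m1, m2 = 32, 32   # patch height and width (assumed)
d1, d2 = 4, 4     # reduced size along rows and columns (assumed)

X = rng.standard_normal((m1, m2))   # stand-in for a gray-scale patch

# Image-as-vector: one projection matrix acts on the flattened patch.
P = rng.standard_normal((m1 * m2, d1 * d2))
y_vec = P.T @ X.reshape(-1)         # shape (d1 * d2,)

# Image-as-matrix: separate projections for the row mode and the column mode.
U = rng.standard_normal((m1, d1))
V = rng.standard_normal((m2, d2))
Y_mat = U.T @ X @ V                 # shape (d1, d2)

# The two-sided projection equals a vector projection whose matrix is
# constrained to the Kronecker form kron(U, V) (row-major flattening).
y_kron = np.kron(U, V).T @ X.reshape(-1)

params_vec = P.size                 # m1 * m2 * d1 * d2 free parameters
params_mat = U.size + V.size        # m1 * d1 + m2 * d2 free parameters
```

With these sizes the unconstrained vector projection has 16384 parameters and the two-sided one has 256. The matrix form is a structured special case that keeps the row and column layout of the patch explicit.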
25 [29, 7, 27]) on tensor based subspace learning, particularly for face recognition. [sent-49, score-0.051]
26 Also, many previous visual tracking approaches use the tensor concept. [sent-50, score-0.917]
27 [13, 25, 24]) conduct PCA in the mode-k flattened matrix; others (e. [sent-53, score-0.077]
28 [26, 11]) adopt covariance tracking technique [19] in the mode-k flattened matrix. [sent-55, score-0.241]
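As a rough illustration of what PCA in the mode-k flattened matrix can look like, here is a hedged NumPy sketch. It is not the procedure of any of the cited trackers, which this page does not describe. The function names `unfold` and `mode_k_basis`, the layout (height, width, frame) of the patch stack, and the mean-centering step are assumptions made for the example.

```python
import numpy as np

def unfold(T, k):
    """Mode-k flattening: mode k indexes the rows, all other modes the columns."""
    return np.moveaxis(T, k, 0).reshape(T.shape[k], -1)

def mode_k_basis(T, k, rank):
    """Leading left singular vectors of the mean-centered mode-k unfolding."""
    M = unfold(T, k)
    M = M - M.mean(axis=1, keepdims=True)
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :rank], s[:rank]

rng = np.random.default_rng(0)
patches = rng.standard_normal((16, 12, 20))   # (height, width, frames), assumed layout

U_rows, s_rows = mode_k_basis(patches, 0, rank=3)   # basis for the row mode
U_cols, s_cols = mode_k_basis(patches, 1, rank=3)   # basis for the column mode
```

Each basis spans a low-dimensional subspace for one mode only, which is why correlations that cut across modes may not be captured by such a decomposition.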
29 Additionally, the dimension reduction based subspace learning methods used in [13, 25, 24] ignore a very important problem proposed in [27]. [sent-57, score-0.186]
30 The problem is that correlations within a tensor are not limited to the elements along certain tensor dimensions. [sent-58, score-1.732]
31 Some part of the discriminant information may not be encoded in the first few dimensions of the derived subspace. [sent-59, score-0.247]
32 This may lead to subspace learning degradations and result in tracking distractions. [sent-60, score-0.215]
33 [27] propose to rearrange elements in the tensor to solve the subspace learning degradation problem, but the exhaustive element rearranging makes it unsuitable for real-time tracking. [sent-64, score-1.011]
34 Inspired by these findings, we propose a new discriminant tracking approach which adopts a 2nd-order tensor (image-as-matrix) representation. [sent-65, score-1.146]
35 Then, we embed the target and background tensor samples into two specially designed graphs, so that the object can be effectively separated from the background in the graph embedding framework for dimension reduction. [sent-67, score-1.201]
36 Note that this approach can be extended by using a higher-order tensor representation (e. [sent-68, score-0.832]
37 3rd-order tensor with a feature vector for each pixel, see [25, 24, 26, 11] for more details), although we only use gray-scale image-as-matrix representation. [sent-70, score-0.788]
38 Because the correlations within a 2nd-order tensor are not limited to the elements along particular columns and rows, the discriminative embedding space derived from dimension reduction may not encode enough of the discriminant information. [sent-71, score-1.489]
39 We improve the classification accuracy of our tensor representation based tracking approach by using the available unlabeled tensor samples. [sent-72, score-1.851]
40 By this improvement, we can adjust the discriminative embedding space so that most of the discriminant information is encoded in it. [sent-74, score-0.589]
41 At each iteration, a number of unlabeled tensor samples are selected and used to learn a new discriminative embedding space. [sent-76, score-1.207]
42 The learned embedding spaces from different iterations and the one trained using only the labeled samples are combined linearly to form a final adjusted embedding space which encodes most of the discriminant information. [sent-77, score-0.706]
43 That is to say, we make use of the unlabeled samples in an inductive fashion, which is very different from most semi-supervised tracking approaches (e. [sent-78, score-0.306]
44 [6, 10, 12]), in which all the unlabeled samples are used for training without selection. [sent-80, score-0.086]
45 The new semi-supervised improvement technique adopts a novel strategy to address the two questions: 1) how to select the unlabeled samples; 2) what class labels should be assigned to the selected unlabeled samples. [sent-81, score-0.324]
46 It is also very different from some margin improving techniques, where the unlabeled samples with the highest classification confidences are selected and the class labels that are predicted by the current classifier are assigned to them, as in Self-training [20] and ASSEMBLE [4]. [sent-82, score-0.25]
47 These techniques may increase the classification margin, but they do not provide any novel information to adjust the discriminative embedding space. [sent-83, score-0.342]
48 We plug the margin improving technique ASSEMBLE into our system and make a direct comparison with our method in Section 3. [sent-84, score-0.142]
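The selection rule attributed above to margin improving techniques can be sketched in a few lines. This is only an illustration in the spirit of self-training, not an implementation of ASSEMBLE and not the method proposed in the paper. The function name `self_training_step` and the convention that `scores` holds signed classifier outputs for the unlabeled samples are assumptions.

```python
import numpy as np

def self_training_step(scores, k):
    """Select the k most confident unlabeled samples and pseudo-label them.

    scores: signed classifier outputs for the unlabeled samples (assumed
            convention: positive means target, negative means background).
    Returns the indices of the selected samples and their assigned labels.
    """
    confidence = np.abs(scores)
    picked = np.argsort(-confidence, kind="stable")[:k]
    labels = np.where(scores[picked] >= 0, 1, -1)
    return picked, labels

scores = np.array([0.1, -2.0, 0.7, 1.5, -0.3])
picked, labels = self_training_step(scores, k=2)
```

Because the assigned labels simply repeat what the current classifier already predicts, the selected samples agree with the existing decision boundary, which is the limitation stated in sentence 47.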
49 We elaborate the important components of the proposed approach in this section, in particular the tensor based linear embedding and derivation of the proposed semi-supervised improvement technique. [sent-88, score-1.07]
50 Before all of these, we first review some terminology for tensor operations. [sent-89, score-0.854]
51 Terminology for tensor operations A tensor is a higher order generalization of a vector (1st-order tensor) and a matrix (2nd-order tensor). [sent-92, score-1.576]
52 A tensor is a multilinear mapping over a set of vector spaces. [sent-93, score-0.823]
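53 An nth-order tensor is denoted as $\mathcal{A} \in \mathbb{R}^{m_1 \times m_2 \times \cdots \times m_n}$, and its elements are represented by $a_{i_1, \ldots, i_n}$. [sent-97, score-0.788]
54 The inner product of two nth-order tensors is defined as $\langle \mathcal{A}, \mathcal{B} \rangle = \sum_{i_1, \ldots, i_n} a_{i_1, \ldots, i_n} b_{i_1, \ldots, i_n}$, [sent-101, score-0.062]
55 the norm of $\mathcal{A}$ is $\|\mathcal{A}\| = \sqrt{\langle \mathcal{A}, \mathcal{A} \rangle}$, and the distance between $\mathcal{A}$ and $\mathcal{B}$ is $\|\mathcal{A} - \mathcal{B}\|$. [sent-131, score-0.066]
56 In the 2nd-order tensor case, the norm is called the Frobenius norm and written as $\|A\|_F$. [sent-138, score-0.139]
57 The mode-k vectors are the column vectors of the matrix $A_{(k)}$ that results from mode-k flattening the tensor $\mathcal{A}$. [sent-145, score-0.044]
58 The inverse operation of mode-k flattening is mode-k [sent-153, score-0.112]
59 folding, which restores the original tensor $\mathcal{A}$ from $A_{(k)}$. [sent-154, score-0.829]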
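The flattening, folding, inner product, and norm operations named in this terminology can be sketched with NumPy as follows. This is a minimal illustration under one particular column-ordering convention for the flattened matrix (conventions differ between authors). The function names `unfold`, `fold`, `inner`, and `norm` are chosen for the example and do not come from the paper.

```python
import numpy as np

def unfold(T, k):
    """Mode-k flattening: mode k indexes the rows, all other modes the columns."""
    return np.moveaxis(T, k, 0).reshape(T.shape[k], -1)

def fold(M, k, shape):
    """Mode-k folding: inverse of unfold for a tensor of the given shape."""
    rest = [s for i, s in enumerate(shape) if i != k]
    return np.moveaxis(M.reshape([shape[k]] + rest), 0, k)

def inner(A, B):
    """Inner product: sum of element-wise products of two same-shaped tensors."""
    return float(np.sum(A * B))

def norm(A):
    """Tensor norm induced by the inner product (Frobenius norm for matrices)."""
    return float(np.sqrt(inner(A, A)))

A = np.arange(24, dtype=float).reshape(2, 3, 4)
B = np.ones((2, 3, 4))

A1 = unfold(A, 1)                 # shape (3, 8)
A_back = fold(A1, 1, A.shape)     # recovers A
distance = norm(A - B)            # distance between A and B
```

Folding needs the target shape as an argument because the flattened matrix alone does not record how its columns were split across the remaining modes.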
60 Figure 1: Block diagram of the proposed tracker. [sent-161, score-0.028]
61 The mode-k product of a tensor with a matrix can be computed by multiplying the matrix with $A_{(k)}$, followed by mode-k folding. [sent-188, score-0.044]
62 Note that for tensors and matrices of the appropriate sizes, $\mathcal{A} \times_m U \times_n V = \mathcal{A} \times_n V \times_m U$ and $(\mathcal{A} \times_n U) \times_n V = \mathcal{A} \times_n (VU)$. [sent-189, score-0.062]
63 More details of the tensor algebra are given in [23]. [sent-190, score-0.816]
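The mode-k product and the two identities quoted for it can be checked numerically. The sketch below assumes the usual definition (multiply the mode-k flattened matrix on the left, then fold back). The helper names and the random test tensors are illustrative and not taken from the paper.

```python
import numpy as np

def unfold(T, k):
    """Mode-k flattening: mode k indexes the rows, all other modes the columns."""
    return np.moveaxis(T, k, 0).reshape(T.shape[k], -1)

def fold(M, k, shape):
    """Mode-k folding: inverse of unfold for a tensor of the given shape."""
    rest = [s for i, s in enumerate(shape) if i != k]
    return np.moveaxis(M.reshape([shape[k]] + rest), 0, k)

def mode_k_product(T, M, k):
    """Multiply every mode-k vector of T by M, i.e. fold(M @ unfold(T, k))."""
    new_shape = list(T.shape)
    new_shape[k] = M.shape[0]
    return fold(M @ unfold(T, k), k, new_shape)

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4, 5))
U = rng.standard_normal((2, 3))   # acts on mode 0
V = rng.standard_normal((6, 4))   # acts on mode 1
W = rng.standard_normal((7, 2))   # acts on mode 0 after U

# Products along different modes commute.
lhs = mode_k_product(mode_k_product(A, U, 0), V, 1)
rhs = mode_k_product(mode_k_product(A, V, 1), U, 0)

# Two products along the same mode merge into one matrix product.
same_mode = mode_k_product(mode_k_product(A, U, 0), W, 0)
merged = mode_k_product(A, W @ U, 0)
```

For a 2nd-order tensor X these products reduce to ordinary matrix algebra: a mode-0 product with U is `U @ X`, and a mode-1 product with V is `X @ V.T`.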
64 Tensor based linear embedding Previous work has demonstrated that the image variations of many objects can be modeled by low dimensional linear spaces. [sent-193, score-0.238]
65 However, the typical algorithms either only consider an image as a high dimensional vector, or can not fully detect the intrinsic local geometrical and discriminative structure of the collected image patches in the tensor form. [sent-194, score-0.94]
66 Then, a particular question arises: how to find an effective linear embedding space approximation to the original nonlinear submanifold embedded in the tensor space. [sent-195, score-1.199]
67 Graph embedding for dimension reduction [28, 22] provides us an innovation to this question in the sense of local isometry. [sent-196, score-0.381]
68 General case: We express the training sample set in the tensor form as $\{X_i \in \mathbb{R}^{m_1 \times m_2 \times \cdots \times m_n}, i = 1, 2, \ldots\}$. [sent-197, score-0.788]
69 We build two graphs: an intrinsic graph $G$ and a penalty graph $G^p$ to model the local geometrical and discriminative structure of $\mathcal{M}$. [sent-206, score-0.142]
70 Let $W$ and $W^p$ be the edge weight matrices of $G$ and $G^p$. [sent-208, score-0.083]
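The excerpt does not give the paper's actual edge weights, so the following is only a generic sketch of how an intrinsic graph and a penalty graph are often built in the graph embedding literature. The rule used here (binary weights, k nearest same-class neighbours for the intrinsic graph, k nearest other-class neighbours for the penalty graph, Frobenius distance between matrix samples) is an assumption and should not be read as the authors' design. The function name `build_graphs` and the synthetic data are likewise hypothetical.

```python
import numpy as np

def build_graphs(samples, labels, k=2):
    """Return binary edge weight matrices (W, Wp) for labeled matrix samples.

    W  links each sample to its k nearest neighbours of the same class.
    Wp links each sample to its k nearest neighbours of a different class.
    Both rules are assumed for illustration, not taken from the paper.
    """
    n = len(samples)
    flat = samples.reshape(n, -1)
    # Pairwise Frobenius distances between the matrix-valued samples.
    dist = np.linalg.norm(flat[:, None, :] - flat[None, :, :], axis=2)
    W = np.zeros((n, n))
    Wp = np.zeros((n, n))
    for i in range(n):
        same = np.flatnonzero((labels == labels[i]) & (np.arange(n) != i))
        diff = np.flatnonzero(labels != labels[i])
        for j in same[np.argsort(dist[i, same])[:k]]:
            W[i, j] = W[j, i] = 1.0
        for j in diff[np.argsort(dist[i, diff])[:k]]:
            Wp[i, j] = Wp[j, i] = 1.0
    return W, Wp

rng = np.random.default_rng(0)
target = rng.standard_normal((5, 8, 8)) + 3.0       # synthetic target patches
background = rng.standard_normal((5, 8, 8)) - 3.0   # synthetic background patches
samples = np.concatenate([target, background])
labels = np.array([1] * 5 + [-1] * 5)

W, Wp = build_graphs(samples, labels, k=2)
```

In a graph embedding setting, a projection is then typically sought that keeps samples linked in the intrinsic graph close together while pushing apart samples linked in the penalty graph.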
wordName wordTfidf (topN-words)
[('tensor', 0.788), ('embedding', 0.238), ('discriminant', 0.188), ('tracking', 0.129), ('mn', 0.118), ('unlabeled', 0.102), ('correlations', 0.089), ('mk', 0.089), ('submanifold', 0.082), ('flattening', 0.077), ('flattened', 0.077), ('hogs', 0.074), ('geometrical', 0.072), ('adjust', 0.067), ('assemble', 0.066), ('terminology', 0.066), ('tensors', 0.062), ('encoded', 0.059), ('intrinsically', 0.055), ('specially', 0.053), ('ignore', 0.053), ('patch', 0.052), ('graphs', 0.051), ('subspace', 0.051), ('retain', 0.046), ('dimension', 0.045), ('beled', 0.044), ('bned', 0.044), ('ceodm', 0.044), ('distractions', 0.044), ('junliang', 0.044), ('lxing', 0.044), ('ofuseful', 0.044), ('selftraining', 0.044), ('spca', 0.044), ('uths', 0.044), ('vuv', 0.044), ('weiming', 0.044), ('representation', 0.044), ('improvement', 0.044), ('intrinsic', 0.043), ('samples', 0.042), ('margin', 0.042), ('adopts', 0.041), ('dcs', 0.041), ('haarlike', 0.041), ('alk', 0.041), ('bky', 0.041), ('restores', 0.041), ('elements', 0.04), ('folding', 0.039), ('tue', 0.039), ('cda', 0.039), ('deis', 0.039), ('ina', 0.039), ('appearance', 0.038), ('reduction', 0.037), ('discriminative', 0.037), ('mechanism', 0.037), ('rearrange', 0.037), ('rearranging', 0.037), ('tdheer', 0.037), ('embedded', 0.036), ('sor', 0.035), ('multilinear', 0.035), ('tse', 0.035), ('rob', 0.035), ('degradations', 0.035), ('onfd', 0.035), ('graph', 0.035), ('technique', 0.035), ('nlpr', 0.034), ('norm', 0.033), ('inductive', 0.033), ('innovation', 0.033), ('plug', 0.033), ('hea', 0.033), ('confidences', 0.032), ('improving', 0.032), ('cas', 0.031), ('nno', 0.031), ('lot', 0.031), ('aisn', 0.03), ('hue', 0.03), ('witnessed', 0.03), ('crucial', 0.029), ('unsuitable', 0.029), ('degradation', 0.029), ('adam', 0.029), ('diagram', 0.028), ('algebra', 0.028), ('question', 0.028), ('nonlinear', 0.027), ('along', 0.027), ('inevitably', 0.027), ('kwon', 0.027), ('navigation', 0.027), ('grabner', 0.027), ('characterizing', 0.027), 
('extrinsic', 0.027)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 119 iccv-2013-Discriminant Tracking Using Tensor Representation with Semi-supervised Improvement
2 0.40299323 362 iccv-2013-Robust Tucker Tensor Decomposition for Effective Image Representation
Author: Miao Zhang, Chris Ding
Abstract: Many tensor based algorithms have been proposed for the study of high dimensional data in a large variety of computer vision and machine learning applications. However, most of the existing tensor analysis approaches are based on Frobenius norm, which makes them sensitive to outliers, because they minimize the sum of squared errors and enlarge the influence of both outliers and large feature noises. In this paper, we propose a robust Tucker tensor decomposition model (RTD) to suppress the influence of outliers, which uses L1-norm loss function. Yet, the optimization on L1-norm based tensor analysis is much harder than standard tensor decomposition. In this paper, we propose a simple and efficient algorithm to solve our RTD model. Moreover, tensor factorization-based image storage needs much less space than PCA based methods. We carry out extensive experiments to evaluate the proposed algorithm, and verify the robustness against image occlusions. Both numerical and visual results show that our RTD model is consistently better against the existence of outliers than previous tensor and PCA methods.
3 0.15046419 184 iccv-2013-Global Fusion of Relative Motions for Robust, Accurate and Scalable Structure from Motion
Author: Pierre Moulon, Pascal Monasse, Renaud Marlet
Abstract: Multi-view structure from motion (SfM) estimates the position and orientation of pictures in a common 3D coordinate frame. When views are treated incrementally, this external calibration can be subject to drift, contrary to global methods that distribute residual errors evenly. We propose a new global calibration approach based on the fusion of relative motions between image pairs. We improve an existing method for robustly computing global rotations. We present an efficient a contrario trifocal tensor estimation method, from which stable and precise translation directions can be extracted. We also define an efficient translation registration method that recovers accurate camera positions. These components are combined into an original SfM pipeline. Our experiments show that, on most datasets, it outperforms in accuracy other existing incremental and global pipelines. It also achieves strikingly good running times: it is about 20 times faster than the other global method we could compare to, and as fast as the best incremental method. More importantly, it features better scalability properties.
4 0.093428701 134 iccv-2013-Efficient Higher-Order Clustering on the Grassmann Manifold
Author: Suraj Jain, Venu Madhav Govindu
Abstract: The higher-order clustering problem arises when data is drawn from multiple subspaces or when observations fit a higher-order parametric model. Most solutions to this problem either decompose higher-order similarity measures for use in spectral clustering or explicitly use low-rank matrix representations. In this paper we present our approach of Sparse Grassmann Clustering (SGC) that combines attributes of both categories. While we decompose the higher-order similarity tensor, we cluster data by directly finding a low dimensional representation without explicitly building a similarity matrix. By exploiting recent advances in online estimation on the Grassmann manifold (GROUSE) we develop an efficient and accurate algorithm that works with individual columns of similarities or partial observations thereof. Since it avoids the storage and decomposition of large similarity matrices, our method is efficient, scalable and has low memory requirements even for large-scale data. We demonstrate the performance of our SGC method on a variety of segmentation problems including planar segmentation of Kinect depth maps and motion segmentation of the Hopkins 155 dataset for which we achieve performance comparable to the state-of-the-art.
5 0.09098848 116 iccv-2013-Directed Acyclic Graph Kernels for Action Recognition
Author: Ling Wang, Hichem Sahbi
Abstract: One of the trends of action recognition consists in extracting and comparing mid-level features which encode visual and motion aspects of objects into scenes. However, when scenes contain high-level semantic actions with many interacting parts, these mid-level features are not sufficient to capture high level structures as well as high order causal relationships between moving objects resulting into a clear drop in performances. In this paper, we address this issue and we propose an alternative action recognition method based on a novel graph kernel. In the main contributions of this work, we first describe actions in videos using directed acyclic graphs (DAGs), that naturally encode pairwise interactions between moving object parts, and then we compare these DAGs by analyzing the spectrum of their sub-patterns that capture complex higher order interactions. This extraction and comparison process is computationally tractable, resulting from the acyclic property of DAGs, and it also defines a positive semi-definite kernel. When plugging the latter into support vector machines, we obtain an action recognition algorithm that overtakes related work, including graph-based methods, on a standard evaluation dataset.
6 0.082578972 318 iccv-2013-PixelTrack: A Fast Adaptive Algorithm for Tracking Non-rigid Objects
7 0.079698123 257 iccv-2013-Log-Euclidean Kernels for Sparse Representation and Dictionary Learning
8 0.078239053 100 iccv-2013-Curvature-Aware Regularization on Riemannian Submanifolds
9 0.076150075 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
10 0.07237263 209 iccv-2013-Image Guided Depth Upsampling Using Anisotropic Total Generalized Variation
11 0.072131701 212 iccv-2013-Image Set Classification Using Holistic Multiple Order Statistics Features and Localized Multi-kernel Metric Learning
12 0.071297921 307 iccv-2013-Parallel Transport of Deformations in Shape Space of Elastic Surfaces
13 0.070497945 366 iccv-2013-STAR3D: Simultaneous Tracking and Reconstruction of 3D Objects Using RGB-D Data
14 0.06932345 55 iccv-2013-Automatic Kronecker Product Model Based Detection of Repeated Patterns in 2D Urban Images
15 0.067976907 6 iccv-2013-A Convex Optimization Framework for Active Learning
16 0.066437095 17 iccv-2013-A Global Linear Method for Camera Pose Registration
17 0.0631129 298 iccv-2013-Online Robust Non-negative Dictionary Learning for Visual Tracking
18 0.061394382 425 iccv-2013-Tracking via Robust Multi-task Multi-view Joint Sparse Representation
19 0.060744751 347 iccv-2013-Recursive Estimation of the Stein Center of SPD Matrices and Its Applications
20 0.059889626 120 iccv-2013-Discriminative Label Propagation for Multi-object Tracking with Sporadic Appearance Features
topicId topicWeight
[(0, 0.132), (1, -0.009), (2, -0.037), (3, -0.013), (4, -0.055), (5, 0.011), (6, -0.024), (7, 0.069), (8, 0.017), (9, 0.024), (10, -0.029), (11, -0.069), (12, -0.03), (13, 0.018), (14, 0.077), (15, 0.013), (16, 0.039), (17, 0.002), (18, -0.007), (19, 0.014), (20, 0.001), (21, 0.056), (22, -0.048), (23, -0.038), (24, 0.007), (25, 0.071), (26, 0.039), (27, 0.068), (28, -0.091), (29, 0.091), (30, -0.042), (31, -0.061), (32, -0.08), (33, 0.031), (34, 0.038), (35, 0.126), (36, 0.115), (37, 0.081), (38, -0.138), (39, 0.142), (40, -0.1), (41, 0.048), (42, 0.037), (43, 0.065), (44, 0.073), (45, -0.039), (46, -0.205), (47, 0.246), (48, 0.147), (49, -0.157)]
simIndex simValue paperId paperTitle
same-paper 1 0.94869834 119 iccv-2013-Discriminant Tracking Using Tensor Representation with Semi-supervised Improvement
2 0.86995161 362 iccv-2013-Robust Tucker Tensor Decomposition for Effective Image Representation
3 0.61983591 347 iccv-2013-Recursive Estimation of the Stein Center of SPD Matrices and Its Applications
Author: Hesamoddin Salehian, Guang Cheng, Baba C. Vemuri, Jeffrey Ho
Abstract: Symmetric positive-definite (SPD) matrices are ubiquitous in Computer Vision, Machine Learning and Medical Image Analysis. Finding the center/average of a population of such matrices is a common theme in many algorithms such as clustering, segmentation, principal geodesic analysis, etc. The center of a population of such matrices can be defined using a variety of distance/divergence measures as the minimizer of the sum of squared distances/divergences from the unknown center to the members of the population. It is well known that the computation of the Karcher mean for the space of SPD matrices, which is a negatively-curved Riemannian manifold, is computationally expensive. Recently, the LogDet divergence-based center was shown to be a computationally attractive alternative. However, the LogDet-based mean of more than two matrices cannot be computed in closed form, which makes it computationally less attractive for large populations. In this paper we present a novel recursive estimator for center based on the Stein distance, which is the square root of the LogDet divergence, that is significantly faster than the batch mode computation of this center. The key theoretical contribution is a closed-form solution for the weighted Stein center of two SPD matrices, which is used in the recursive computation of the Stein center for a population of SPD matrices. Additionally, we show experimental evidence of the convergence of our recursive Stein center estimator to the batch mode Stein center. We present applications of our recursive estimator to K-means clustering and image indexing depicting significant time gains over corresponding algorithms that use the batch mode computations. For the latter application, we develop novel hashing functions using the Stein distance and apply it to publicly available data sets, and experimental results have shown favorable comparisons to other competing methods.
∗This research was funded in part by the NIH grant NS066340 to BCV.
4 0.51435399 257 iccv-2013-Log-Euclidean Kernels for Sparse Representation and Dictionary Learning
Author: Peihua Li, Qilong Wang, Wangmeng Zuo, Lei Zhang
Abstract: The symmetric positive definite (SPD) matrices have been widely used in image and vision problems. Recently there are growing interests in studying sparse representation (SR) of SPD matrices, motivated by the great success of SR for vector data. Though the space of SPD matrices is well-known to form a Lie group that is a Riemannian manifold, existing work fails to take full advantage of its geometric structure. This paper attempts to tackle this problem by proposing a kernel based method for SR and dictionary learning (DL) of SPD matrices. We disclose that the space of SPD matrices, with the operations of logarithmic multiplication and scalar logarithmic multiplication defined in the Log-Euclidean framework, is a complete inner product space. We can thus develop a broad family of kernels that satisfies Mercer’s condition. These kernels characterize the geodesic distance and can be computed efficiently. We also consider the geometric structure in the DL process by updating atom matrices in the Riemannian space instead of in the Euclidean space. The proposed method is evaluated with various vision problems and shows notable performance gains over state-of-the-arts.
5 0.47230822 184 iccv-2013-Global Fusion of Relative Motions for Robust, Accurate and Scalable Structure from Motion
6 0.45678341 25 iccv-2013-A Novel Earth Mover's Distance Methodology for Image Matching with Gaussian Mixture Models
7 0.44363296 55 iccv-2013-Automatic Kronecker Product Model Based Detection of Repeated Patterns in 2D Urban Images
8 0.40136054 138 iccv-2013-Efficient and Robust Large-Scale Rotation Averaging
9 0.37842155 17 iccv-2013-A Global Linear Method for Camera Pose Registration
10 0.37011918 134 iccv-2013-Efficient Higher-Order Clustering on the Grassmann Manifold
12 0.3497785 60 iccv-2013-Bayesian Robust Matrix Factorization for Image and Video Processing
13 0.33171421 357 iccv-2013-Robust Matrix Factorization with Unknown Noise
15 0.30459538 413 iccv-2013-Target-Driven Moire Pattern Synthesis by Phase Modulation
16 0.2989876 425 iccv-2013-Tracking via Robust Multi-task Multi-view Joint Sparse Representation
17 0.29587191 168 iccv-2013-Finding the Best from the Second Bests - Inhibiting Subjective Bias in Evaluation of Visual Tracking Algorithms
18 0.29573286 364 iccv-2013-SGTD: Structure Gradient and Texture Decorrelating Regularization for Image Decomposition
19 0.29169241 303 iccv-2013-Orderless Tracking through Model-Averaged Posterior Estimation
20 0.28933752 295 iccv-2013-On One-Shot Similarity Kernels: Explicit Feature Maps and Properties
topicId topicWeight
[(2, 0.063), (7, 0.017), (26, 0.077), (31, 0.039), (35, 0.33), (40, 0.019), (42, 0.132), (64, 0.054), (73, 0.019), (89, 0.154)]
simIndex simValue paperId paperTitle
1 0.85122538 90 iccv-2013-Content-Aware Rotation
Author: Kaiming He, Huiwen Chang, Jian Sun
Abstract: We present an image editing tool called Content-Aware Rotation. Casually shot photos can appear tilted, and are often corrected by rotation and cropping. This trivial solution may remove desired content and hurt image integrity. Instead of doing rigid rotation, we propose a warping method that creates the perception of rotation and avoids cropping. Human vision studies suggest that the perception of rotation is mainly due to horizontal/vertical lines. We design an optimization-based method that preserves the rotation of horizontal/vertical lines, maintains the completeness of the image content, and reduces the warping distortion. An efficient algorithm is developed to address the challenging optimization. We demonstrate our content-aware rotation method on a variety of practical cases.
same-paper 2 0.77155459 119 iccv-2013-Discriminant Tracking Using Tensor Representation with Semi-supervised Improvement
Author: Jin Gao, Junliang Xing, Weiming Hu, Steve Maybank
Abstract: Visual tracking has witnessed growing methods in object representation, which is crucial to robust tracking. The dominant mechanism in object representation is using image features encoded in a vector as observations to perform tracking, without considering that an image is intrinsically a matrix, or a 2nd-order tensor. Thus approaches following this mechanism inevitably lose a lot of useful information, and therefore cannot fully exploit the spatial correlations within the 2D image ensembles. In this paper, we address an image as a 2nd-order tensor in its original form, and find a discriminative linear embedding space approximation to the original nonlinear submanifold embedded in the tensor space based on the graph embedding framework. We specially design two graphs for characterizing the intrinsic local geometrical structure of the tensor space, so as to retain more discriminant information when reducing the dimension along certain tensor dimensions. However, spatial correlations within a tensor are not limited to the elements along these dimensions. This means that some part of the discriminant information may not be encoded in the embedding space. We introduce a novel technique called semi-supervised improvement to iteratively adjust the embedding space to compensate for the loss of discriminant information, hence improving the performance of our tracker. Experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracker.
3 0.7485249 104 iccv-2013-Decomposing Bag of Words Histograms
Author: Ankit Gandhi, Karteek Alahari, C.V. Jawahar
Abstract: We aim to decompose a global histogram representation of an image into histograms of its associated objects and regions. This task is formulated as an optimization problem, given a set of linear classifiers, which can effectively discriminate the object categories present in the image. Our decomposition bypasses harder problems associated with accurately localizing and segmenting objects. We evaluate our method on a wide variety of composite histograms, and also compare it with MRF-based solutions. In addition to merely measuring the accuracy of decomposition, we also show the utility of the estimated object and background histograms for the task of image classification on the PASCAL VOC 2007 dataset.
4 0.71173185 403 iccv-2013-Strong Appearance and Expressive Spatial Models for Human Pose Estimation
Author: Leonid Pishchulin, Mykhaylo Andriluka, Peter Gehler, Bernt Schiele
Abstract: Typical approaches to articulated pose estimation combine spatial modelling of the human body with appearance modelling of body parts. This paper aims to push the state-of-the-art in articulated pose estimation in two ways. First we explore various types of appearance representations aiming to substantially improve the body part hypotheses. And second, we draw on and combine several recently proposed powerful ideas such as more flexible spatial models as well as image-conditioned spatial models. In a series of experiments we draw several important conclusions: (1) we show that the proposed appearance representations are complementary; (2) we demonstrate that even a basic tree-structure spatial human body model achieves state-of-the-art performance when augmented with the proper appearance representation; and (3) we show that the combination of the best performing appearance model with a flexible image-conditioned spatial model achieves the best result, significantly improving over the state of the art, on the “Leeds Sports Poses” and “Parse” benchmarks.
5 0.64945507 36 iccv-2013-Accurate and Robust 3D Facial Capture Using a Single RGBD Camera
Author: Yen-Lin Chen, Hsiang-Tao Wu, Fuhao Shi, Xin Tong, Jinxiang Chai
Abstract: This paper presents an automatic and robust approach that accurately captures high-quality 3D facial performances using a single RGBD camera. The key of our approach is to combine the power of automatic facial feature detection and image-based 3D nonrigid registration techniques for 3D facial reconstruction. In particular, we develop a robust and accurate image-based nonrigid registration algorithm that incrementally deforms a 3D template mesh model to best match observed depth image data and important facial features detected from single RGBD images. The whole process is fully automatic and robust because it is based on single frame facial registration framework. The system is flexible because it does not require any strong 3D facial priors such as blendshape models. We demonstrate the power of our approach by capturing a wide range of 3D facial expressions using a single RGBD camera and achieve state-of-the-art accuracy by comparing against alternative methods.
6 0.63767606 316 iccv-2013-Pictorial Human Spaces: How Well Do Humans Perceive a 3D Articulated Pose?
7 0.63523138 21 iccv-2013-A Method of Perceptual-Based Shape Decomposition
8 0.59593189 362 iccv-2013-Robust Tucker Tensor Decomposition for Effective Image Representation
9 0.59272176 204 iccv-2013-Human Attribute Recognition by Rich Appearance Dictionary
10 0.5896979 340 iccv-2013-Real-Time Articulated Hand Pose Estimation Using Semi-supervised Transductive Regression Forests
11 0.58966112 11 iccv-2013-A Fully Hierarchical Approach for Finding Correspondences in Non-rigid Shapes
12 0.58931601 379 iccv-2013-Semantic Segmentation without Annotating Segments
13 0.58862281 65 iccv-2013-Breaking the Chain: Liberation from the Temporal Markov Assumption for Tracking Human Poses
14 0.58799791 321 iccv-2013-Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model
15 0.58671403 171 iccv-2013-Fix Structured Learning of 2013 ICCV paper k2opt.pdf
16 0.58601058 383 iccv-2013-Semi-supervised Learning for Large Scale Image Cosegmentation
17 0.58489931 329 iccv-2013-Progressive Multigrid Eigensolvers for Multiscale Spectral Segmentation
18 0.58342773 181 iccv-2013-Frustratingly Easy NBNN Domain Adaptation
19 0.5823105 30 iccv-2013-A Simple Model for Intrinsic Image Decomposition with Depth Cues
20 0.5816673 62 iccv-2013-Bird Part Localization Using Exemplar-Based Models with Enforced Pose and Subcategory Consistency