iccv iccv2013 iccv2013-362 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Miao Zhang, Chris Ding
Abstract: Many tensor based algorithms have been proposed for the study of high dimensional data in a large variety of computer vision and machine learning applications. However, most of the existing tensor analysis approaches are based on Frobenius norm, which makes them sensitive to outliers, because they minimize the sum of squared errors and enlarge the influence of both outliers and large feature noises. In this paper, we propose a robust Tucker tensor decomposition model (RTD) to suppress the influence of outliers, which uses L1-norm loss function. Yet, the optimization on L1-norm based tensor analysis is much harder than standard tensor decomposition. In this paper, we propose a simple and efficient algorithm to solve our RTD model. Moreover, tensor factorization-based image storage needs much less space than PCA based methods. We carry out extensive experiments to evaluate the proposed algorithm, and verify the robustness against image occlusions. Both numerical and visual results show that our RTD model is consistently better against the existence of outliers than previous tensor and PCA methods.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Many tensor based algorithms have been proposed for the study of high dimensional data in a large variety of computer vision and machine learning applications. [sent-5, score-0.518]
2 However, most of the existing tensor analysis approaches are based on Frobenius norm, which makes them sensitive to outliers, because they minimize the sum of squared errors and enlarge the influence of both outliers and large feature noises. [sent-6, score-0.561]
3 In this paper, we propose a robust Tucker tensor decomposition model (RTD) to suppress the influence of outliers, which uses L1-norm loss function. [sent-7, score-0.645]
4 Yet, the optimization on L1-norm based tensor analysis is much harder than standard tensor decomposition. [sent-8, score-1.014]
5 Moreover, tensor factorization-based image storage needs much less space than PCA based methods. [sent-10, score-0.664]
6 Both numerical and visual results show that our RTD model is consistently better against the existence of outliers than previous tensor and PCA methods. [sent-12, score-0.557]
7 Introduction Image or video storage and denoising problems are two important research topics in the computer vision area, especially with the growth of online social media, which provides vast numbers of images and videos every day. [sent-14, score-0.217]
8 In a typical image storage problem, an image is represented as a long 1-d feature vector, and this long vector denotes one data point in a high dimensional space. [sent-15, score-0.202]
9 The 1-d vector denotation of an image makes it convenient for subspace learning, such as principal component analysis (PCA) [19] and linear discriminant analysis (LDA) [2] used in the face recognition area. [sent-17, score-0.146]
10 Recently, other subspace learning algorithms applied to 1-d vector data have been studied, such as locality preserving projections. [sent-18, score-0.046]
11 However, the 1-d vector denotation strategy as a whole ignores the neighborhood feature information within one image, while 2-d matrix denotation retains the important spatial relationship between features within one image. [sent-21, score-0.164]
12 Therefore, many tensor decomposition techniques have been studied in computer vision applications. [sent-22, score-0.627]
13 For example, Shashua and Levine [16] adopted rank-one decomposition to represent images, which was described in detail in [18]. [sent-23, score-0.137]
14 The method in [23] projected the original images onto a two-dimensional space. [sent-27, score-0.048]
15 Ding and Ye proposed a two dimensional singular value decomposition (2DSVD) [6], which computes principal eigenvectors of row-row and column-column covariance matrices. [sent-28, score-0.301]
16 Other tensor decomposition methods are also proposed and some of them are proven to be equivalent to 2DSVD and GLRAM in [11]. [sent-29, score-0.627]
17 High-order singular value decomposition (HOSVD) [14] was proposed for higher-dimensional tensors by Vasilescu and Terzopoulos [20]. [sent-30, score-0.721]
18 In the above tensor analysis algorithms, an image is denoted by a 2-d matrix, i.e., a second-order tensor, which retains the neighborhood information within the image, and a set of images can then be denoted by a third-order tensor. [sent-31, score-1.046]
19 In this paper, we propose a robust Tucker tensor decomposition (RTD) model to deal with images occluded by noisy information, and also propose a simple yet computationally efficient algorithm to solve the L1-norm based Tucker tensor decomposition optimization. [sent-39, score-1.378]
20 We also carry out extensive experiments in face recognition, and verify the robustness of the proposed method to image occlusions. [sent-41, score-0.047]
21 Robust Tucker Tensor Decomposition (RTD) Standard Tucker tensor decomposition [14] uses a reconstructed tensor Y to approximate the original tensor X, where Y_{ijk} = Σ_{p,q,r} U_{ip} V_{jq} W_{kr} S_{pqr}. (1) [sent-44, score-1.68]
22 We adopt the shorthand notation of Eq. (2), which simplifies the tensor construction expressions in the next sections. [sent-68, score-0.49]
23 Y = U ⊗1 V ⊗2 W ⊗3 S (2) Tucker tensor decomposition has the following cost function [18]: min_{U,V,W,S} ||X − U ⊗1 V ⊗2 W ⊗3 S||_F^2, s.t. U^T U = I, V^T V = I, W^T W = I. (3) [sent-69, score-0.627]
24 It is well-known that the solution to the above optimization is given by high-order singular value decomposition (HOSVD) [14], which will be introduced in the algorithm part. [sent-78, score-0.203]
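For concreteness, the following is a minimal numpy sketch of the truncated HOSVD referred to above. The unfolding convention, the function names and the example ranks are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n matricization of tensor T."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(X, ranks):
    """Truncated HOSVD: each factor is the leading left singular vectors of the
    corresponding unfolding; the core is X projected onto the factors."""
    factors = []
    for mode, r in enumerate(ranks):
        U_full, _, _ = np.linalg.svd(unfold(X, mode), full_matrices=False)
        factors.append(U_full[:, :r])                 # orthonormal n_mode x r basis
    S = X
    for mode, U in enumerate(factors):                # S = X x_1 U^T x_2 V^T x_3 W^T
        S = np.moveaxis(np.tensordot(U.T, S, axes=(1, mode)), 0, mode)
    return factors, S

def reconstruct(factors, S):
    """Y = U ⊗1 V ⊗2 W ⊗3 S in the paper's shorthand (Eq. (2))."""
    Y = S
    for mode, U in enumerate(factors):
        Y = np.moveaxis(np.tensordot(U, Y, axes=(1, mode)), 0, mode)
    return Y

# AT&T-sized example: a 56x46x400 tensor truncated to illustrative ranks
X = np.random.rand(56, 46, 400)
factors, S = hosvd(X, ranks=(20, 20, 40))
Y = reconstruct(factors, S)
print(np.linalg.norm(X - Y) / np.linalg.norm(X))      # relative Frobenius error
```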
25 As we can see, the standard Tucker tensor decomposition uses the Frobenius norm to decompose the original tensor. [sent-79, score-0.55]
26 The Frobenius norm is known to be sensitive to outliers and feature noise, because it sums the squared errors. [sent-80, score-0.116]
27 In contrast, the L1-norm sums the absolute values of the errors, which reduces the influence of outliers compared to the Frobenius norm. [sent-81, score-0.067]
28 Therefore, a more outlier-robust version of Tucker tensor decomposition can be formulated using the L1-norm. [sent-82, score-0.645]
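A quick numeric illustration of this point (the values below are purely illustrative): a single large residual dominates a squared-error sum but contributes only linearly to an absolute-error sum.

```python
import numpy as np

residuals = np.array([0.1, -0.2, 0.15, 5.0])   # last entry plays the role of an outlier
print(np.sum(residuals ** 2))    # ~25.07: the outlier alone contributes 25 of it
print(np.sum(np.abs(residuals))) # 5.45: the outlier's influence is not amplified
```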
29 Therefore, the robust Tucker tensor decomposition (RTD) is formulated as min_{U,V,W,S} ||X − U ⊗1 V ⊗2 W ⊗3 S||_1, s.t. U^T U = I, V^T V = I, W^T W = I. [sent-90, score-0.645]
30 Figure 1 and Figure 2 illustrate the reconstruction effect on the AT&T data set under two different occlusion strategies, which will be explained in detail in the experiment part. [sent-107, score-0.142]
31 In both figures, the images in the second row are reconstructed by RTD and those in the fourth row are reconstructed by Tucker tensor decomposition. [sent-108, score-0.773]
32 In both noise and corruption cases, Robust Tucker decomposition gives clearly better reconstruction. [sent-109, score-0.169]
33 Efficient Algorithm for Robust Tucker Tensor Decomposition The standard Tucker decomposition can be efficiently solved using the HOSVD algorithm [14]. [sent-111, score-0.171]
34 In this paper, we propose an efficient algorithm to solve robust Tucker tensor decomposition. [sent-112, score-0.542]
35 One subproblem (Eq. (7)) has a simple exact solution; the other is the standard Tucker tensor decomposition of Eq. (10). [sent-117, score-0.661]
36 We first rewrite the objective function of robust Tucker tensor decomposition equivalently as min_{U,V,W,S,E} ||E||_1, s.t. X = U ⊗1 V ⊗2 W ⊗3 S + E, U^T U = I, V^T V = I, W^T W = I. [sent-121, score-0.645]
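The two-subproblem structure mentioned above is what a standard augmented Lagrangian treatment of this constrained form produces. The exact multiplier scaling used by the authors is not fully visible in the extracted text, so the following is a generic sketch rather than the paper's own equation:

```latex
\mathcal{L}(U,V,W,S,E,A) = \|E\|_1
  + \langle A,\; X - U \otimes_1 V \otimes_2 W \otimes_3 S - E \rangle
  + \tfrac{\mu}{2}\,\bigl\|X - U \otimes_1 V \otimes_2 W \otimes_3 S - E\bigr\|_F^2,
\qquad U^{\top}U = V^{\top}V = W^{\top}W = I.
```

Minimizing over (U, V, W, S) with E fixed gives the quadratic Tucker subproblem of Eq. (10); minimizing over E with the factors fixed reduces to elementwise soft-thresholding, the simple exact solution referred to for Eq. (7).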
37 Samples of occluded images and reconstructed images on the AT&T face data. [sent-139, score-0.195]
38 First row is the input occluded images; Second row is from RTD; Third row is from L1PCA; Fourth row is from Tucker decomposition; Fifth row is from PCA. [sent-140, score-0.197]
39 In this step, we solve U, V , W and S together while fixing E. [sent-160, score-0.063]
40 min_{U,V,W,S} (μ/2) ||Q − U ⊗1 V ⊗2 W ⊗3 S||_F^2, s.t. U^T U = I, V^T V = I, W^T W = I, (10) where Q = X − E + μA. (11) This is exactly the usual Tucker tensor decomposition. [sent-164, score-0.49]
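A schematic of the alternating scheme just described is sketched below; it reuses the hosvd and reconstruct helpers from the earlier sketch. The soft-threshold level, the penalty growth factor rho, and the scaled-multiplier form of Q are generic ALM choices assumed here, not details taken verbatim from the paper.

```python
import numpy as np
# hosvd() and reconstruct() are the helper functions from the HOSVD sketch above

def soft_threshold(T, tau):
    """Elementwise minimizer of tau*|e| + 0.5*(e - t)^2 -- the kind of simple
    closed-form step referred to for Eq. (7)."""
    return np.sign(T) * np.maximum(np.abs(T) - tau, 0.0)

def rtd_alm(X, ranks, mu=1.0, rho=1.1, n_iter=100):
    """Schematic ALM loop for RTD: alternate a Tucker fit on Q with an L1 step on E."""
    E = np.zeros_like(X)      # sparse outlier/occlusion tensor
    A = np.zeros_like(X)      # Lagrange multiplier
    for _ in range(n_iter):
        # Step 1: fix E and solve the Tucker subproblem (Eqs. (10)-(11)).
        # Scaled-multiplier form A/mu assumed; the extracted text reads Q = X - E + mu*A.
        Q = X - E + A / mu
        factors, S = hosvd(Q, ranks)
        Y = reconstruct(factors, S)
        # Step 2: fix the factors and update E by elementwise soft-thresholding.
        E = soft_threshold(X - Y + A / mu, 1.0 / mu)
        # Step 3: dual ascent on the multiplier, then increase the penalty.
        A = A + mu * (X - Y - E)
        mu *= rho
    return factors, S, E
```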
41 Samples of type 2 (mixed) occluded images and reconstructed images using different methods on the AT&T data set. [sent-168, score-0.194]
42 The first row is from input occluded images; the second row is from RTDreconstructed images; the third row is from L1 PCA; the fourth row is from Tucker tensor; and the fifth row is from PCA. [sent-169, score-0.239]
43 The cross corruptions can only be removed by RTD. [sent-170, score-0.045]
44 U is given by the P eigenvectors with largest eigenvalues of F, where Fii? [sent-172, score-0.058]
45 V is given by the Q eigenvectors with largest eigenvalues of G, where Gjj? [sent-183, score-0.058]
46 W is given by the R eigenvectors with largest eigenvalues of H, where Hkk? [sent-194, score-0.058]
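The exact entries of F, G and H are cut off in the extraction above. As an assumption, the sketch below uses the common HOOI-style choice, in which each matrix is the covariance of the corresponding unfolding of Q after projecting the other two modes onto their current factors; only the "leading eigenvectors" part is taken from the text.

```python
import numpy as np

def update_factor(Q, factors, mode, rank):
    """Top-`rank` eigenvectors of F = B B^T, where B is the mode-`mode` unfolding
    of Q projected onto the other current factors (HOOI-style; assumed, since the
    paper's exact formulas for F, G, H are truncated here)."""
    B = Q
    for m, U in enumerate(factors):
        if m == mode or U is None:
            continue
        B = np.moveaxis(np.tensordot(U.T, B, axes=(1, m)), 0, m)
    Bmat = np.moveaxis(B, mode, 0).reshape(B.shape[mode], -1)
    F = Bmat @ Bmat.T
    _, eigvecs = np.linalg.eigh(F)          # eigenvalues in ascending order
    return eigvecs[:, -rank:]               # eigenvectors of the largest eigenvalues

# One alternating sweep on a toy tensor
Q = np.random.rand(56, 46, 400)
U = update_factor(Q, [None, None, None], mode=0, rank=10)
V = update_factor(Q, [U, None, None], mode=1, rank=10)
W = update_factor(Q, [U, V, None], mode=2, rank=10)
```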
47 The converged solutions from different initializations are very close to each other [15], and there are no visible differences in the reconstructed images. [sent-215, score-0.073]
48 The following are examples from the AT&T dataset, whose tensor size is 56×46×400. [sent-225, score-0.49]
49 Efficient Algorithm for L1-PCA In standard computer vision problems, each image is converted to a vector and a set of images is represented by a matrix. [sent-242, score-0.054]
50 The advantage of the tensor approach is that each image retains its 2D form in the tensor representation, and thus tensor analysis retains more information about image collections. [sent-244, score-1.562]
51 We need to compare the tensor approaches with matrix approaches. [sent-245, score-0.49]
52 Denote the singular value decomposition (SVD) of Q as Q = FΣG^T. (26) Only the k largest singular values and the associated singular vectors are needed. [sent-286, score-0.335]
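A minimal numpy illustration of the rank-k truncation in Eq. (26); the matrix size is only an AT&T-shaped placeholder.

```python
import numpy as np

def truncated_svd(Q, k):
    """Keep only the k largest singular values and the associated singular vectors."""
    F, sigma, Gt = np.linalg.svd(Q, full_matrices=False)
    return F[:, :k], sigma[:k], Gt[:k, :]

Q = np.random.rand(56 * 46, 400)        # vectorized AT&T-sized images as columns
Fk, sk, Gkt = truncated_svd(Q, k=40)
Qk = Fk @ np.diag(sk) @ Gkt             # rank-k approximation F_k Sigma_k G_k^T
```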
53 Experiments In this section, three benchmark face databases, AT&T, YALE and CMU PIE, are used to evaluate the effectiveness of our proposed RTD tensor factorization approach. [sent-289, score-0.545]
54 AT&T: The AT&T face data contains 400 upright face images of 40 individuals, collected by AT&T Laboratories Cambridge. [sent-326, score-0.08]
55 CMU PIE: CMU PIE is a face database of 41,368 images of 68 people, collected by the Carnegie Mellon Robotics Institute between October and December 2000. [sent-394, score-0.05]
56 We randomly select 10 images from each class with different combinations of pose, face expression and illumination condition. [sent-396, score-0.121]
57 Corrupted Images For evaluation purposes, we generate occluded images from the above three image data sets. [sent-399, score-0.072]
58 One added advantage of this approach is that we can compare the reconstructed images with the original uncorrupted images to assess the effectiveness of removing the corruption (occlusion). [sent-400, score-0.19]
59 We use two types of occlusion added to the original input images to evaluate the effectiveness of the proposed RTD tensor method against outliers. [sent-401, score-0.584]
60 First, square block occlusions with different sizes are added. [sent-402, score-0.069]
61 The occlusion is generated as follows: given the occlusion size d, we randomly pick the d×d block position for each image, and set the pixels in this d×d block area to zero. [sent-403, score-0.199]
62 Second, mixed occlusions with 3 different corrupting methods are added to the original images. [sent-406, score-0.105]
63 The first corruption method is called cross occlusion, and the cross has a specified length l and width w. [sent-407, score-0.122]
64 For each class, we randomly select m images to add cross occlusions. [sent-408, score-0.136]
65 We also randomly select the position of the cross, and set the pixels in the cross to the average pixel value of the whole data set. [sent-409, score-0.095]
66 To make the occlusions realistic and diversified, for each class, on the basis of cross occlusions, we randomly select m images to add the square block occlusions introduced above. [sent-410, score-0.196]
67 Similarly, for each class, we randomly select m images to add rectangular occlusions. [sent-412, score-0.091]
68 We randomly set the size of each rectangle within a permitted range [a, b], and within each rectangle, some of the pixels are set to 0 and the rest are set to 1. [sent-413, score-0.085]
69 The first row in Figure 2 demonstrates this mixed occlusion method. [sent-414, score-0.147]
70 For the AT&T data set, an 8×8 occlusion is added to every image of each class as the first type of occlusion. [sent-416, score-0.139]
71 For the second type of occlusion, within each class of images, we first randomly select m = 2 images to add the cross, and for each selected image the length of the cross is l = 22 and the width is w = 3. [sent-417, score-0.186]
72 Second, we randomly select m = 2 images to add the square block. [sent-418, score-0.091]
73 Third, we randomly select m = 2 images to add the rectangle, and for each added rectangle, the size is random within a permitted range of [a, b] = [4, 10]. [sent-419, score-0.135]
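A small sketch of the first (square-block) occlusion type described above, assuming grayscale images stored as 2-d arrays; the random-number handling and the placeholder data are illustrative.

```python
import numpy as np

def add_block_occlusion(img, d, rng):
    """Zero out a randomly placed d x d block, as in the first occlusion type."""
    h, w = img.shape
    top = rng.integers(0, h - d + 1)
    left = rng.integers(0, w - d + 1)
    out = img.copy()
    out[top:top + d, left:left + d] = 0.0
    return out

rng = np.random.default_rng(0)
images = np.random.rand(400, 56, 46)               # placeholder AT&T-sized stack
occluded = np.stack([add_block_occlusion(im, 8, rng) for im in images])
```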
74 Experiment Results In this section, we compare the performance of our RTD method with the standard Tucker tensor method, the L1-norm PCA method (L1PCA) and the standard PCA method in terms of storage space, noise reduction effect and classification accuracy. [sent-424, score-0.756]
75 One of the biggest advantages of our proposed RTD method is that it saves image storage space, because for Tucker tensor decomposition methods, to reconstruct the images we only need to store U, V and W; the core tensor S can be calculated using U, V, W. [sent-425, score-1.291]
76 The sizes of U, V, W are n_i × P, n_j × Q, n_k × R, respectively. [sent-426, score-0.173]
77 So the storage space for our L1-norm tensor method is n_i × P + n_j × Q + n_k × R. For PCA based methods, U and V need to be stored, and the sizes of U and V are p × k and k × n respectively, where p = n_i × n_j and n = n_k. [sent-427, score-0.451]
78 So the storage space for PCA based methods would be n_i × n_j × k + k × n_k. The parameters we used in our experiments for each data set are given in Table 8. [sent-428, score-0.323]
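The storage counts above can be checked with a few lines of arithmetic. The PCA figure quoted later (119,040 on AT&T) corresponds to k = 40 components; the tensor-side ranks below are only inferred from the quoted 19,672, since Table 8 is not reproduced in this extraction.

```python
def tucker_storage(dims, ranks):
    """Entries stored for U, V, W: n_i*P + n_j*Q + n_k*R."""
    return sum(n * r for n, r in zip(dims, ranks))

def pca_storage(dims, k):
    """Entries stored for U (p x k) and V (k x n), with p = n_i*n_j and n = n_k."""
    ni, nj, nk = dims
    return ni * nj * k + k * nk

dims = (56, 46, 400)                      # AT&T tensor size
print(pca_storage(dims, 40))              # 119040
print(tucker_storage(dims, (36, 36, 40))) # 19672 (ranks inferred, not from Table 8)
```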
79 Accordingly, the storage space needed by each method on every data set can be calculated; the results are given in Tables 2, 3 and 4. [sent-429, score-0.174]
80 Then X + O are the input data to tensor decompositions and PCA. [sent-433, score-0.49]
81 For occluded data, we take the original images as the approximation of the true noise-free images, and consider ? [sent-441, score-0.072]
82 The noise-free error for each method is listed in Table 2, 3, 4 for the first type of occlusion and Table 5, 6, 7 for the second type of occlusion. [sent-452, score-0.15]
83 We can see that (1) the noise-free errors for RTD and L1PCA are always smaller than those for Tucker decomposition and PCA; this shows the effectiveness of the L1 norm for removing corruptions. [sent-453, score-0.188]
84 (2) The noise-free errors for RTD are always smaller than those for L1PCA; this demonstrates the advantage of the tensor decomposition approach. [sent-454, score-0.137]
85 A byproduct of image denoising is improved classification accuracy. [sent-455, score-0.047]
86 Here we perform classification as a demonstration and evaluation of the denoising effectiveness of the proposed RTD. [sent-456, score-0.049]
87 Classification accuracies on occluded image data are listed in Tables 2, 3, 4 for the first type of occlusion and Tables 5, 6, 7 for the second type of occlusion. [sent-458, score-0.179]
88 For each class, we randomly split the images into two parts, and each part is in turn used as the training set with the remaining part as the testing set. [sent-460, score-0.048]
89 The reported accuracy is the average of 100 such cross-validation runs. [sent-461, score-0.045]
90 Reconstruction Images and Discussion Figure 1 and Figure 2 show sample occluded images and the corresponding reconstructed images from different methods. [sent-464, score-0.165]
91 As we can see, the reconstructed images from our RTD method reduce the occlusion more successfully than those from the other methods; this is also shown by the noise-free errors in Tables 2, 3, 4, 5, 6 and 7, where the errors of our method are smaller than those of the other methods. [sent-465, score-0.208]
92 Our method also needs far less storage space than PCA based methods: for example, the storage for the PCA based method is 119,040 on the AT&T data set, while for our RTD method the storage is only 19,672; that is, PCA based methods need about 6 times more storage than tensor methods on the AT&T data set. [sent-466, score-1.186]
93 Classification accuracies on the reconstructed images from the RTD method are higher in most cases, which demonstrates the effectiveness of our method. [sent-467, score-0.118]
94 Conclusion In this paper, we propose an L1-norm based robust Tucker tensor decomposition (RTD) method, which is effective for correcting corrupted images. [sent-469, score-0.701]
95 Our method requires far less storage space than PCA based methods. [sent-470, score-0.174]
96 Both numerical and visual results are consistently better than those of standard PCA, L1PCA and standard Tucker tensor decomposition for images with outliers or noisy features. [sent-473, score-0.782]
97 2-dimensional singular value decomposition for 2d maps and images. [sent-525, score-0.203]
98 Robust l1 norm factorization in the presence of outliers and missing data by alternative convex programming. [sent-564, score-0.074]
99 on the global convergence of hosvd and parafac algorithms. [sent-589, score-0.118]
100 A review of fast l1-minimization algorithms for robust face recognition. [sent-628, score-0.048]
wordName wordTfidf (topN-words)
[('rtd', 0.53), ('tensor', 0.49), ('tucker', 0.419), ('storage', 0.174), ('pca', 0.141), ('decomposition', 0.137), ('hosvd', 0.118), ('uv', 0.114), ('alm', 0.112), ('errorclass', 0.098), ('acc', 0.087), ('ijk', 0.086), ('utu', 0.081), ('pie', 0.08), ('eijk', 0.079), ('reconstructed', 0.073), ('occlusion', 0.069), ('nj', 0.069), ('singular', 0.066), ('denotation', 0.059), ('qijkqi', 0.059), ('arlington', 0.058), ('corrupted', 0.056), ('kkt', 0.056), ('cmu', 0.053), ('yijk', 0.052), ('occluded', 0.052), ('mixed', 0.049), ('outliers', 0.048), ('yale', 0.048), ('retains', 0.046), ('nk', 0.045), ('cross', 0.045), ('glram', 0.039), ('mein', 0.039), ('nedderman', 0.039), ('pijk', 0.039), ('principal', 0.037), ('jj', 0.037), ('kk', 0.037), ('frobenius', 0.036), ('occlusions', 0.036), ('vtv', 0.035), ('xijk', 0.035), ('vasilescu', 0.035), ('ni', 0.035), ('solve', 0.034), ('standard', 0.034), ('ding', 0.034), ('rectangle', 0.033), ('eigenvectors', 0.033), ('block', 0.033), ('corruption', 0.032), ('wwt', 0.03), ('miao', 0.03), ('uut', 0.03), ('face', 0.03), ('row', 0.029), ('fixing', 0.029), ('type', 0.029), ('dimensional', 0.028), ('randomly', 0.028), ('multilinear', 0.028), ('shashua', 0.028), ('lagrangian', 0.027), ('norm', 0.026), ('pij', 0.026), ('lagrange', 0.026), ('locality', 0.026), ('effectiveness', 0.025), ('eigenvalues', 0.025), ('sizes', 0.024), ('subgradient', 0.024), ('classification', 0.024), ('denoising', 0.023), ('squared', 0.023), ('fifth', 0.023), ('error', 0.023), ('texas', 0.022), ('select', 0.022), ('class', 0.021), ('add', 0.021), ('noises', 0.021), ('drive', 0.02), ('images', 0.02), ('subspace', 0.02), ('added', 0.02), ('fourth', 0.019), ('numerical', 0.019), ('resized', 0.019), ('sums', 0.019), ('multiplier', 0.019), ('rank', 0.019), ('tx', 0.018), ('robust', 0.018), ('verify', 0.017), ('ye', 0.017), ('uso', 0.017), ('lathauwer', 0.017), ('wrde', 0.017)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 362 iccv-2013-Robust Tucker Tensor Decomposition for Effective Image Representation
Author: Miao Zhang, Chris Ding
Abstract: Many tensor based algorithms have been proposed for the study of high dimensional data in a large variety of computer vision and machine learning applications. However, most of the existing tensor analysis approaches are based on Frobenius norm, which makes them sensitive to outliers, because they minimize the sum of squared errors and enlarge the influence of both outliers and large feature noises. In this paper, we propose a robust Tucker tensor decomposition model (RTD) to suppress the influence of outliers, which uses L1-norm loss function. Yet, the optimization on L1-norm based tensor analysis is much harder than standard tensor decomposition. In this paper, we propose a simple and efficient algorithm to solve our RTD model. Moreover, tensor factorization-based image storage needs much less space than PCA based methods. We carry out extensive experiments to evaluate the proposed algorithm, and verify the robustness against image occlusions. Both numerical and visual results show that our RTD model is consistently better against the existence of outliers than previous tensor and PCA methods.
2 0.40299323 119 iccv-2013-Discriminant Tracking Using Tensor Representation with Semi-supervised Improvement
Author: Jin Gao, Junliang Xing, Weiming Hu, Steve Maybank
Abstract: Visual tracking has witnessed growing methods in object representation, which is crucial to robust tracking. The dominant mechanism in object representation is using image features encoded in a vector as observations to perform tracking, without considering that an image is intrinsically a matrix, or a 2nd-order tensor. Thus approaches following this mechanism inevitably lose a lot of useful information, and therefore cannot fully exploit the spatial correlations within the 2D image ensembles. In this paper, we address an image as a 2nd-order tensor in its original form, and find a discriminative linear embedding space approximation to the original nonlinear submanifold embedded in the tensor space based on the graph embedding framework. We specially design two graphs for characterizing the intrinsic local geometrical structure of the tensor space, so as to retain more discriminant information when reducing the dimension along certain tensor dimensions. However, spatial correlations within a tensor are not limited to the elements along these dimensions. This means that some part of the discriminant information may not be encoded in the embedding space. We introduce a novel technique called semi-supervised improvement to iteratively adjust the embedding space to compensate for the loss of discriminant information, hence improving the performance of our tracker. Experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracker.
3 0.093827397 184 iccv-2013-Global Fusion of Relative Motions for Robust, Accurate and Scalable Structure from Motion
Author: Pierre Moulon, Pascal Monasse, Renaud Marlet
Abstract: Multi-view structure from motion (SfM) estimates the position and orientation of pictures in a common 3D coordinate frame. When views are treated incrementally, this external calibration can be subject to drift, contrary to global methods that distribute residual errors evenly. We propose a new global calibration approach based on the fusion of relative motions between image pairs. We improve an existing method for robustly computing global rotations. We present an efficient a contrario trifocal tensor estimation method, from which stable and precise translation directions can be extracted. We also define an efficient translation registration method that recovers accurate camera positions. These components are combined into an original SfM pipeline. Our experiments show that, on most datasets, it outperforms in accuracy other existing incremental and global pipelines. It also achieves strikingly good running times: it is about 20 times faster than the other global method we could compare to, and as fast as the best incremental method. More importantly, it features better scalability properties.
4 0.081637092 434 iccv-2013-Unifying Nuclear Norm and Bilinear Factorization Approaches for Low-Rank Matrix Decomposition
Author: Ricardo Cabral, Fernando De_La_Torre, João P. Costeira, Alexandre Bernardino
Abstract: Low rank models have been widely used for the representation of shape, appearance or motion in computer vision problems. Traditional approaches to fit low rank models make use of an explicit bilinear factorization. These approaches benefit from fast numerical methods for optimization and easy kernelization. However, they suffer from serious local minima problems depending on the loss function and the amount/type of missing data. Recently, these low-rank models have alternatively been formulated as convex problems using the nuclear norm regularizer; unlike factorization methods, their numerical solvers are slow and it is unclear how to kernelize them or to impose a rank a priori. This paper proposes a unified approach to bilinear factorization and nuclear norm regularization, that inherits the benefits of both. We analyze the conditions under which these approaches are equivalent. Moreover, based on this analysis, we propose a new optimization algorithm and a “rank continuation” strategy that outperform state-of-the-art approaches for Robust PCA, Structure from Motion and Photometric Stereo with outliers and missing data.
5 0.068934202 134 iccv-2013-Efficient Higher-Order Clustering on the Grassmann Manifold
Author: Suraj Jain, Venu Madhav Govindu
Abstract: The higher-order clustering problem arises when data is drawn from multiple subspaces or when observations fit a higher-order parametric model. Most solutions to this problem either decompose higher-order similarity measures for use in spectral clustering or explicitly use low-rank matrix representations. In this paper we present our approach of Sparse Grassmann Clustering (SGC) that combines attributes of both categories. While we decompose the higher-order similarity tensor, we cluster data by directly finding a low dimensional representation without explicitly building a similarity matrix. By exploiting recent advances in online estimation on the Grassmann manifold (GROUSE) we develop an efficient and accurate algorithm that works with individual columns of similarities or partial observations thereof. Since it avoids the storage and decomposition of large similarity matrices, our method is efficient, scalable and has low memory requirements even for large-scale data. We demonstrate the performance of our SGC method on a variety of segmentation problems including planar segmentation of Kinect depth maps and motion segmentation of the Hopkins 155 dataset for which we achieve performance comparable to the state-of-the-art.
6 0.063129403 310 iccv-2013-Partial Sum Minimization of Singular Values in RPCA for Low-Level Vision
7 0.061933331 354 iccv-2013-Robust Dictionary Learning by Error Source Decomposition
8 0.059747811 357 iccv-2013-Robust Matrix Factorization with Unknown Noise
9 0.05961597 280 iccv-2013-Multi-view 3D Reconstruction from Uncalibrated Radially-Symmetric Cameras
10 0.056366019 94 iccv-2013-Correntropy Induced L2 Graph for Robust Subspace Clustering
11 0.052659675 11 iccv-2013-A Fully Hierarchical Approach for Finding Correspondences in Non-rigid Shapes
12 0.051697969 392 iccv-2013-Similarity Metric Learning for Face Recognition
13 0.051035572 21 iccv-2013-A Method of Perceptual-Based Shape Decomposition
14 0.050156284 116 iccv-2013-Directed Acyclic Graph Kernels for Action Recognition
15 0.049681973 257 iccv-2013-Log-Euclidean Kernels for Sparse Representation and Dictionary Learning
16 0.049365979 26 iccv-2013-A Practical Transfer Learning Algorithm for Face Verification
17 0.049182706 17 iccv-2013-A Global Linear Method for Camera Pose Registration
18 0.048207898 190 iccv-2013-Handling Occlusions with Franken-Classifiers
19 0.047586195 212 iccv-2013-Image Set Classification Using Holistic Multiple Order Statistics Features and Localized Multi-kernel Metric Learning
20 0.046780847 209 iccv-2013-Image Guided Depth Upsampling Using Anisotropic Total Generalized Variation
topicId topicWeight
[(0, 0.104), (1, -0.013), (2, -0.053), (3, -0.025), (4, -0.067), (5, 0.014), (6, 0.02), (7, 0.038), (8, 0.035), (9, -0.009), (10, -0.009), (11, -0.02), (12, -0.038), (13, 0.006), (14, 0.043), (15, 0.014), (16, 0.024), (17, 0.022), (18, 0.02), (19, 0.044), (20, 0.002), (21, 0.032), (22, -0.053), (23, -0.081), (24, 0.022), (25, 0.057), (26, 0.061), (27, 0.029), (28, -0.069), (29, 0.071), (30, 0.002), (31, -0.066), (32, -0.086), (33, 0.042), (34, 0.053), (35, 0.093), (36, 0.102), (37, 0.111), (38, -0.095), (39, 0.128), (40, -0.122), (41, 0.047), (42, 0.032), (43, 0.133), (44, 0.077), (45, 0.049), (46, -0.198), (47, 0.226), (48, 0.122), (49, -0.155)]
simIndex simValue paperId paperTitle
same-paper 1 0.93131608 362 iccv-2013-Robust Tucker Tensor Decomposition for Effective Image Representation
Author: Miao Zhang, Chris Ding
Abstract: Many tensor based algorithms have been proposed for the study of high dimensional data in a large variety of computer vision and machine learning applications. However, most of the existing tensor analysis approaches are based on Frobenius norm, which makes them sensitive to outliers, because they minimize the sum of squared errors and enlarge the influence of both outliers and large feature noises. In this paper, we propose a robust Tucker tensor decomposition model (RTD) to suppress the influence of outliers, which uses L1-norm loss function. Yet, the optimization on L1-norm based tensor analysis is much harder than standard tensor decomposition. In this paper, we propose a simple and efficient algorithm to solve our RTD model. Moreover, tensor factorization-based image storage needs much less space than PCA based methods. We carry out extensive experiments to evaluate the proposed algorithm, and verify the robustness against image occlusions. Both numerical and visual results show that our RTD model is consistently better against the existence of outliers than previous tensor and PCA methods.
2 0.86520129 119 iccv-2013-Discriminant Tracking Using Tensor Representation with Semi-supervised Improvement
Author: Jin Gao, Junliang Xing, Weiming Hu, Steve Maybank
Abstract: Visual tracking has witnessed growing methods in object representation, which is crucial to robust tracking. The dominant mechanism in object representation is using image features encoded in a vector as observations to perform tracking, without considering that an image is intrinsically a matrix, or a 2nd-order tensor. Thus approaches following this mechanism inevitably lose a lot of useful information, and therefore cannot fully exploit the spatial correlations within the 2D image ensembles. In this paper, we address an image as a 2nd-order tensor in its original form, and find a discriminative linear embedding space approximation to the original nonlinear submanifold embedded in the tensor space based on the graph embedding framework. We specially design two graphs for characterizing the intrinsic local geometrical structure of the tensor space, so as to retain more discriminant information when reducing the dimension along certain tensor dimensions. However, spatial correlations within a tensor are not limited to the elements along these dimensions. This means that some part of the discriminant information may not be encoded in the embedding space. We introduce a novel technique called semi-supervised improvement to iteratively adjust the embedding space to compensate for the loss of discriminant information, hence improving the performance of our tracker. Experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracker.
3 0.55568391 347 iccv-2013-Recursive Estimation of the Stein Center of SPD Matrices and Its Applications
Author: Hesamoddin Salehian, Guang Cheng, Baba C. Vemuri, Jeffrey Ho
Abstract: Symmetric positive-definite (SPD) matrices are ubiquitous in Computer Vision, Machine Learning and Medical Image Analysis. Finding the center/average of a population of such matrices is a common theme in many algorithms such as clustering, segmentation, principal geodesic analysis, etc. The center of a population of such matrices can be defined using a variety of distance/divergence measures as the minimizer of the sum of squared distances/divergences from the unknown center to the members of the population. It is well known that the computation of the Karcher mean for the space of SPD matrices which is a negatively curved Riemannian manifold is computationally expensive. Recently, the LogDet divergence-based center was shown to be a computationally attractive alternative. However, the LogDet-based mean of more than two matrices can not be computed in closed form, which makes it computationally less attractive for large populations. In this paper we present a novel recursive estimator for center based on the Stein distance which is the square root of the LogDet divergence that is significantly faster than the batch mode computation of this center. The key theoretical contribution is a closed-form solution for the weighted Stein center of two SPD matrices, which is used in the recursive computation of the Stein center for a population of SPD matrices. Additionally, we show experimental evidence of the convergence of our recursive Stein center estimator to the batch mode Stein center. We present applications of our recursive estimator to K-means clustering and image indexing depicting significant time gains over corresponding algorithms that use the batch mode computations. For the latter application, we develop novel hashing functions using the Stein distance and apply it to publicly available data sets, and experimental results have shown favorable comparisons to other competing methods.
4 0.46705875 55 iccv-2013-Automatic Kronecker Product Model Based Detection of Repeated Patterns in 2D Urban Images
Author: Juan Liu, Emmanouil Psarakis, Ioannis Stamos
Abstract: Repeated patterns (such as windows, tiles, balconies and doors) are prominent and significant features in urban scenes. Therefore, detection of these repeated patterns becomes very important for city scene analysis. This paper attacks the problem of repeated patterns detection in a precise, efficient and automatic way, by combining traditional feature extraction followed by a Kronecker product low-rank modeling approach. Our method is tailored for 2D images of building façades. We have developed algorithms for automatic selection of a representative texture within façade images using vanishing points and Harris corners. After rectifying the input images, we describe novel algorithms that extract repeated patterns by using Kronecker product based modeling that is based on a solid theoretical foundation. Our approach is unique and has not ever been used for façade analysis. We have tested our algorithms in a large set of images.
5 0.4600963 357 iccv-2013-Robust Matrix Factorization with Unknown Noise
Author: Deyu Meng, Fernando De_La_Torre
Abstract: Many problems in computer vision can be posed as recovering a low-dimensional subspace from high-dimensional visual data. Factorization approaches to low-rank subspace estimation minimize a loss function between an observed measurement matrix and a bilinear factorization. Most popular loss functions include the L2 and L1 losses. L2 is optimal for Gaussian noise, while L1 is for Laplacian distributed noise. However, real data is often corrupted by an unknown noise distribution, which is unlikely to be purely Gaussian or Laplacian. To address this problem, this paper proposes a low-rank matrix factorization problem with a Mixture of Gaussians (MoG) noise model. The MoG model is a universal approximator for any continuous distribution, and hence is able to model a wider range of noise distributions. The parameters of the MoG model can be estimated with a maximum likelihood method, while the subspace is computed with standard approaches. We illustrate the benefits of our approach in extensive synthetic and real-world experiments including structure from motion, face modeling and background subtraction.
6 0.45493722 434 iccv-2013-Unifying Nuclear Norm and Bilinear Factorization Approaches for Low-Rank Matrix Decomposition
7 0.4536725 184 iccv-2013-Global Fusion of Relative Motions for Robust, Accurate and Scalable Structure from Motion
8 0.44588023 257 iccv-2013-Log-Euclidean Kernels for Sparse Representation and Dictionary Learning
9 0.43084759 60 iccv-2013-Bayesian Robust Matrix Factorization for Image and Video Processing
10 0.4136492 310 iccv-2013-Partial Sum Minimization of Singular Values in RPCA for Low-Level Vision
11 0.39999726 138 iccv-2013-Efficient and Robust Large-Scale Rotation Averaging
12 0.3987281 25 iccv-2013-A Novel Earth Mover's Distance Methodology for Image Matching with Gaussian Mixture Models
13 0.37866157 364 iccv-2013-SGTD: Structure Gradient and Texture Decorrelating Regularization for Image Decomposition
14 0.36026588 17 iccv-2013-A Global Linear Method for Camera Pose Registration
15 0.34285715 15 iccv-2013-A Generalized Low-Rank Appearance Model for Spatio-temporally Correlated Rain Streaks
16 0.33643121 134 iccv-2013-Efficient Higher-Order Clustering on the Grassmann Manifold
17 0.31068963 141 iccv-2013-Enhanced Continuous Tabu Search for Parameter Estimation in Multiview Geometry
18 0.31038058 98 iccv-2013-Cross-Field Joint Image Restoration via Scale Map
20 0.30194241 182 iccv-2013-GOSUS: Grassmannian Online Subspace Updates with Structured-Sparsity
topicId topicWeight
[(2, 0.065), (7, 0.021), (11, 0.224), (26, 0.044), (27, 0.031), (31, 0.061), (34, 0.011), (35, 0.02), (42, 0.134), (48, 0.022), (64, 0.034), (73, 0.049), (78, 0.032), (89, 0.116), (98, 0.036)]
simIndex simValue paperId paperTitle
same-paper 1 0.75246048 362 iccv-2013-Robust Tucker Tensor Decomposition for Effective Image Representation
Author: Miao Zhang, Chris Ding
Abstract: Many tensor based algorithms have been proposed for the study of high dimensional data in a large variety of computer vision and machine learning applications. However, most of the existing tensor analysis approaches are based on Frobenius norm, which makes them sensitive to outliers, because they minimize the sum of squared errors and enlarge the influence of both outliers and large feature noises. In this paper, we propose a robust Tucker tensor decomposition model (RTD) to suppress the influence of outliers, which uses L1-norm loss function. Yet, the optimization on L1-norm based tensor analysis is much harder than standard tensor decomposition. In this paper, we propose a simple and efficient algorithm to solve our RTD model. Moreover, tensor factorization-based image storage needs much less space than PCA based methods. We carry out extensive experiments to evaluate the proposed algorithm, and verify the robustness against image occlusions. Both numerical and visual results show that our RTD model is consistently better against the existence of outliers than previous tensor and PCA methods.
2 0.69815624 220 iccv-2013-Joint Deep Learning for Pedestrian Detection
Author: Wanli Ouyang, Xiaogang Wang
Abstract: Feature extraction, deformation handling, occlusion handling, and classification are four important components in pedestrian detection. Existing methods learn or design these components either individually or sequentially. The interaction among these components is not yet well explored. This paper proposes that they should be jointly learned in order to maximize their strengths through cooperation. We formulate these four components into a joint deep learning framework and propose a new deep network architecture. By establishing automatic, mutual interaction among components, the deep model achieves a 9% reduction in the average miss rate compared with the current best-performing pedestrian detection approaches on the largest Caltech benchmark dataset.
3 0.67792225 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
Author: Junliang Xing, Jin Gao, Bing Li, Weiming Hu, Shuicheng Yan
Abstract: Recently, sparse representation has been introduced for robust object tracking. By representing the object sparsely, i.e., using only a few templates via ℓ1-norm minimization, these so-called ℓ1-trackers exhibit promising tracking results. In this work, we address the object template building and updating problem in these ℓ1-tracking approaches, which has not been fully studied. We propose to perform template updating, in a new perspective, as an online incremental dictionary learning problem, which is efficiently solved through an online optimization procedure. To guarantee the robustness and adaptability of the tracking algorithm, we also propose to build a multi-lifespan dictionary model. By building target dictionaries of different lifespans, effective object observations can be obtained to deal with the well-known drifting problem in tracking and thus improve the tracking accuracy. We derive effective observation models both generatively and discriminatively based on the online multi-lifespan dictionary learning model and deploy them to the Bayesian sequential estimation framework to perform tracking. The proposed approach has been extensively evaluated on ten challenging video sequences. Experimental results demonstrate the effectiveness of the online learned templates, as well as the state-of-the-art tracking performance of the proposed approach.
4 0.66555649 427 iccv-2013-Transfer Feature Learning with Joint Distribution Adaptation
Author: Mingsheng Long, Jianmin Wang, Guiguang Ding, Jiaguang Sun, Philip S. Yu
Abstract: Transfer learning is established as an effective technology in computer vision for leveraging rich labeled data in the source domain to build an accurate classifier for the target domain. However, most prior methods have not simultaneously reduced the difference in both the marginal distribution and conditional distribution between domains. In this paper, we put forward a novel transfer learning approach, referred to as Joint Distribution Adaptation (JDA). Specifically, JDA aims to jointly adapt both the marginal distribution and conditional distribution in a principled dimensionality reduction procedure, and construct new feature representation that is effective and robust for substantial distribution difference. Extensive experiments verify that JDA can significantly outperform several state-of-the-art methods on four types of cross-domain image classification problems.
5 0.6539166 277 iccv-2013-Multi-channel Correlation Filters
Author: Hamed Kiani Galoogahi, Terence Sim, Simon Lucey
Abstract: Modern descriptors like HOG and SIFT are now commonly used in vision for pattern detection within image and video. From a signal processing perspective, this detection process can be efficiently posed as a correlation/convolution between a multi-channel image and a multi-channel detector/filter which results in a single-channel response map indicating where the pattern (e.g. object) has occurred. In this paper, we propose a novel framework for learning a multi-channel detector/filter efficiently in the frequency domain, both in terms of training time and memory footprint, which we refer to as a multi-channel correlation filter. To demonstrate the effectiveness of our strategy, we evaluate it across a number of visual detection/localization tasks where we: (i) exhibit superior performance to current state of the art correlation filters, and (ii) superior computational and memory efficiencies compared to state of the art spatial detectors.
6 0.6512236 328 iccv-2013-Probabilistic Elastic Part Model for Unsupervised Face Detector Adaptation
7 0.65074742 370 iccv-2013-Saliency Detection in Large Point Sets
8 0.64954293 259 iccv-2013-Manifold Based Face Synthesis from Sparse Samples
9 0.64929903 52 iccv-2013-Attribute Adaptation for Personalized Image Search
10 0.64913183 14 iccv-2013-A Generalized Iterated Shrinkage Algorithm for Non-convex Sparse Coding
11 0.64893711 80 iccv-2013-Collaborative Active Learning of a Kernel Machine Ensemble for Recognition
12 0.64885664 44 iccv-2013-Adapting Classification Cascades to New Domains
13 0.64803845 94 iccv-2013-Correntropy Induced L2 Graph for Robust Subspace Clustering
14 0.64782673 26 iccv-2013-A Practical Transfer Learning Algorithm for Face Verification
15 0.64671493 398 iccv-2013-Sparse Variation Dictionary Learning for Face Recognition with a Single Training Sample per Person
16 0.64647675 290 iccv-2013-New Graph Structured Sparsity Model for Multi-label Image Annotations
17 0.64573479 123 iccv-2013-Domain Adaptive Classification
18 0.64505839 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization
19 0.64487469 392 iccv-2013-Similarity Metric Learning for Face Recognition
20 0.64383161 54 iccv-2013-Attribute Pivots for Guiding Relevance Feedback in Image Search