nips nips2013 nips2013-179 knowledge-graph by maker-knowledge-mining

179 nips-2013-Low-Rank Matrix and Tensor Completion via Adaptive Sampling


Source: pdf

Author: Akshay Krishnamurthy, Aarti Singh

Abstract: We study low rank matrix and tensor completion and propose novel algorithms that employ adaptive sampling schemes to obtain strong performance guarantees. Our algorithms exploit adaptivity to identify entries that are highly informative for learning the column space of the matrix (tensor) and consequently, our results hold even when the row space is highly coherent, in contrast with previous analyses. In the absence of noise, we show that one can exactly recover an n × n matrix of rank r from merely Ω(nr^{3/2} log(r)) matrix entries. We also show that one can recover an order T tensor using Ω(nr^{T-1/2} T^2 log(r)) entries. For noisy recovery, our algorithm consistently estimates a low rank matrix corrupted with noise using Ω(nr^{3/2} polylog(n)) entries. We complement our study with simulations that verify our theory and demonstrate the scalability of our algorithms.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 We study low rank matrix and tensor completion and propose novel algorithms that employ adaptive sampling schemes to obtain strong performance guarantees. [sent-5, score-1.256]

2 Our algorithms exploit adaptivity to identify entries that are highly informative for learning the column space of the matrix (tensor) and consequently, our results hold even when the row space is highly coherent, in contrast with previous analyses. [sent-6, score-0.514]

3 In the absence of noise, we show that one can exactly recover an n × n matrix of rank r from merely Ω(nr^{3/2} log(r)) matrix entries. [sent-7, score-0.433]

4 We also show that one can recover an order T tensor using Ω(nr^{T-1/2} T^2 log(r)) entries. [sent-8, score-0.377]

5 For noisy recovery, our algorithm consistently estimates a low rank matrix corrupted with noise using Ω(nr^{3/2} polylog(n)) entries. [sent-9, score-0.328]

6 These empirical observations have led to a number of theoretical studies characterizing the performance gains offered by adaptive sensing over conventional, passive approaches. [sent-13, score-0.344]

7 In this work, we continue in that direction and study the role of adaptive data acquisition in low rank matrix and tensor completion problems. [sent-14, score-1.187]

8 Our study is motivated not only by prior theoretical results in favor of adaptive sensing but also by several applications where adaptive sensing is feasible. [sent-15, score-0.316]

9 In particular, the operator can obtain full columns of the matrix by measuring from one host to all others, a sampling strategy we will exploit in this paper. [sent-19, score-0.333]

10 There are typically two types of measurements: low-throughput assays provide highly reliable measurements of single entries in this matrix while high-throughput microarrays provide expression levels of all genes of interest across operating conditions, thus revealing entire columns. [sent-21, score-0.208]

11 The completion problem can be seen as a strategy for learning the expression matrix from both low- and high-throughput data while minimizing the total measurement cost. [sent-22, score-0.599]

12 1 Contributions We develop algorithms with theoretical guarantees for three low-rank completion problems. [sent-24, score-0.437]

13 The algorithms find a small subset of columns of the matrix (tensor) that can be used to reconstruct or approximate the matrix (tensor). [sent-25, score-0.426]

14 We exploit adaptivity to focus on highly informative columns, and this enables us to do away with the usual incoherence assumptions on the row-space while achieving competitive (or in some cases better) sample complexity bounds. [sent-26, score-0.35]

15 In the matrix case, we show that Ω(nr^{3/2} log r) adaptively chosen samples are sufficient for exact recovery, improving on the best known bound of Ω(nr^2 log^2 n) in the passive setting [21]. [sent-29, score-0.447]

16 This also gives the first guarantee for matrix completion with coherent row space. [sent-30, score-0.729]

17 In the tensor case, we establish that Ω(nr^{T-1/2} T^2 log r) adaptively chosen samples are sufficient for recovering an n × ··· × n order-T tensor of rank r. [sent-32, score-0.541]

18 We complement this with a necessary condition for tensor completion under random sampling, showing that our adaptive strategy is competitive with any passive algorithm. [sent-36, score-1.103]

19 These are the first sample complexity upper and lower bounds for exact tensor completion. [sent-37, score-0.412]

20 In the noisy matrix completion setting, we modify the adaptive column subset selection algorithm of Deshpande et al. [sent-39, score-0.915]

21 As before, the algorithm does not require an incoherent row space but we are no longer able to process the matrix sequentially. [sent-41, score-0.304]

22 Along the way, we improve on existing results for subspace detection from missing data, the problem of testing if a partially observed vector lies in a known subspace. [sent-43, score-0.258]

23 2 Related Work The matrix completion problem has received considerable attention in recent years. [sent-44, score-0.599]

24 Here µ0 and µ1 are parameters characterizing the incoherence of the row and column spaces of the matrix, which we will define shortly. [sent-46, score-0.469]

25 Candes and Tao [7] proved that under random sampling Ω(n_1 r μ_0 log(n_2)) samples are necessary, showing that nuclear norm minimization is near-optimal. [sent-47, score-0.218]

26 The noisy matrix completion problem has also received considerable attention [5, 17, 20]. [sent-48, score-0.656]

27 One challenge stems from the NP-hardness of computing most tensor decompositions, pushing researchers to study alternative structure-inducing norms in lieu of the nuclear norm [11, 22]. [sent-51, score-0.482]

28 Both papers derive algorithms for tensor completion, but neither provide sample complexity bounds for the noiseless case. [sent-52, score-0.464]

29 As a low rank matrix is sparse in its unknown eigenbasis, the completion problem is coupled with learning this basis, which poses a new challenge for adaptive sampling procedures. [sent-55, score-0.879]

30 While the sample complexity bounds match ours, the analysis for the Nystrom method has focused on positive-semidefinite kernel matrices and requires incoherence of both the row and column spaces. [sent-61, score-0.504]

31 2 It is also worth briefly mentioning the vast body of literature on column subset selection, the task of approximating a matrix by projecting it onto a few of its columns. [sent-63, score-0.319]

32 While the best algorithms, namely volume sampling [14] and sampling according to statistical leverages [3], do not seem to be readily applicable to the missing data setting, some algorithms are. [sent-64, score-0.22]

33 Indeed our procedure for noisy matrix completion is an adaptation of an existing column subset selection procedure [10]. [sent-65, score-0.813]

34 Our techniques are also closely related to ideas employed for subspace detection – testing whether a vector lies in a known subspace – and subspace tracking – learning a time-evolving low-dimensional subspace from vectors lying close to that subspace. [sent-66, score-0.789]

35 [2] prove guarantees for subspace detection with known subspace and a partially observed vector, and we will improve on their result en route to establishing our guarantees. [sent-68, score-0.396]

36 Let M ∈ R^{n_1 × n_2} be a rank r matrix with singular value decomposition UΣV^T. [sent-71, score-0.305]

37 Let M ∈ R^{n_1 × ··· × n_T} denote an order T tensor with canonical decomposition M = Σ_{k=1}^r a_k^{(1)} ⊗ a_k^{(2)} ⊗ ··· ⊗ a_k^{(T)}. [sent-79, score-0.539]

38 We represent a d-dimensional subspace U ⊂ R^n as a set of orthonormal basis vectors U = {u_i}_{i=1}^d and in some cases as an n × d matrix whose columns are the basis vectors. [sent-87, score-0.48]

39 Note that if U is an orthobasis for a subspace, U_Ω is a |Ω| × d matrix with columns u_{iΩ} where u_i ∈ U, rather than a set of orthonormal basis vectors. [sent-92, score-0.368]

40 These definitions extend to the tensor setting with slight modifications. [sent-94, score-0.377]

41 We use the vec operation to unfold a tensor into a single vector and define the inner product ⟨x, y⟩ = vec(x)^T vec(y). [sent-95, score-0.478]

42 For a subspace U ⊂ ⊗_i R^{n_i}, we write it as a (∏_i n_i) × d matrix whose columns are vec(u_i), u_i ∈ U. [sent-96, score-0.557]

43 As in recent work on matrix completion [7, 21], we will require a certain amount of incoherence between the column space associated with M (M) and the standard basis. [sent-98, score-0.995]

44 The coherence of an r-dimensional subspace U ⊂ R^n is: μ(U) ≜ (n/r) max_{1≤j≤n} ||P_U e_j||^2 (2), where e_j denotes the jth standard basis element. [sent-100, score-0.241]
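
A minimal NumPy sketch of this definition (our own illustration, not code from the paper): when U is represented by an orthonormal basis matrix, ||P_U e_j||^2 is just the squared norm of the jth row of that matrix, so μ(U) can be read off the row norms.

```python
import numpy as np

def coherence(U):
    """Coherence mu(U) = (n/r) * max_j ||P_U e_j||^2 for an n x r orthonormal basis matrix U."""
    n, r = U.shape
    row_norms_sq = np.sum(U ** 2, axis=1)  # ||P_U e_j||^2 = ||U^T e_j||^2 when U has orthonormal columns
    return (n / r) * row_norms_sq.max()

n, r = 500, 5
U_random, _ = np.linalg.qr(np.random.randn(n, r))  # generic subspace: mu close to a small constant
U_spiky = np.eye(n)[:, :r]                          # spanned by standard basis vectors: mu = n/r
print(coherence(U_random), coherence(U_spiky))
```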

45 In previous analyses of matrix completion, the incoherence assumption is that both the row and column spaces of the matrix have coherences upper bounded by µ0 . [sent-101, score-0.793]

46 When both spaces are incoherent, each entry of the matrix reveals roughly the same amount of information, so there is little to be gained from adaptive sampling, which typically involves looking for highly informative measurements. [sent-102, score-0.264]

47 Thus the power of adaptivity for these problems should center around relaxing the incoherence assumption, which is the direction we take in this paper. [sent-103, score-0.315]

48 Unfortunately, even under adaptive sampling, it is impossible to identify a rank one matrix that is zero in all but one entry without observing the entire matrix, implying that we cannot completely eliminate the assumption. [sent-104, score-0.449]

49 Instead, we will retain incoherence on the column space, but remove the restrictions on the row space. [sent-105, score-0.469]

50 Exact Completion Problems In the matrix case, our sequential algorithm builds up the column space of the matrix by selecting a few columns to observe in their entirety. [sent-124, score-0.616]

51 In particular, we maintain a candidate column space Ũ and test whether a column c_i lives in Ũ or not, choosing to completely observe c_i and add it to Ũ if it does not. [sent-125, score-0.516]

52 [2] observed that we can perform this test with a subsampled version of c_i, meaning that we can recover the column space using few samples. [sent-127, score-0.248]
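
A hedged sketch of how such a test can be implemented (our reading of the projection test in the spirit of [2]; the paper's precise sampling sizes and thresholds are omitted): restrict the orthobasis to the observed indices Ω, project the observed entries of the column onto it, and check whether the residual energy is essentially zero.

```python
import numpy as np

def residual_energy(U, c_omega, omega):
    """||c_Omega - P_{U_Omega} c_Omega||^2 for an n x k orthobasis U and entries c_omega
    observed at row indices omega (assumes |omega| >= k)."""
    U_omega = U[omega, :]
    coef, *_ = np.linalg.lstsq(U_omega, c_omega, rcond=None)
    return float(np.sum((c_omega - U_omega @ coef) ** 2))

def appears_in_subspace(U, c_omega, omega, tol=1e-10):
    # A (near-)zero residual on the sample suggests the column lies in span(U); otherwise the
    # sequential algorithm observes the column in full, orthogonalizes it, and adds it to the basis.
    return residual_energy(U, c_omega, omega) <= tol * np.sum(c_omega ** 2)
```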

53 At the outer level of the recursion, the algorithm maintains a candidate subspace U for the mode-T subtensors M_i. [sent-130, score-0.368]

54 When the subtensor itself is just a column, we observe the column in its entirety. [sent-133, score-0.207]

55 Our first main result characterizes the performance of the tensor completion algorithm. [sent-135, score-0.814]

56 Let M = Σ_{j=1}^r ⊗_{t=1}^T a_j^{(t)} be a rank r order-T tensor with subspaces A^{(t)} = span({a_j^{(t)}}_{j=1}^r). [sent-138, score-0.486]

57 For an n × ··· × n tensor of order T, the algorithm succeeds with high probability using Ω(nr^{T-1/2} μ_0^{T-1} T^2 log(Tr/δ)) samples, exhibiting a linear dependence on the tensor dimensions. [sent-147, score-0.834]

58 In comparison, the only guarantee we are aware of shows that Ω((∏_{t=2}^T n_t) r) samples are sufficient for consistent estimation of a noisy tensor, exhibiting a much worse dependence on the tensor dimensions [23]. [sent-148, score-0.687]

59 In the noiseless scenario, one can unfold the tensor into an n_1 × (∏_{t=2}^T n_t) matrix and apply any matrix completion algorithm. [sent-149, score-1.362]

60 Unfortunately, without exploiting the additional tensor structure, this approach will scale with ∏_{t=2}^T n_t, which is similarly much worse than our guarantee. [sent-150, score-0.506]
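
For concreteness, a small sketch (our own, using NumPy's row-major reshape) of the mode-1 unfolding this baseline relies on: a rank-1 tensor a ⊗ b ⊗ c unfolds to the rank-1 matrix a (b ⊗ c)^T, so low tensor rank yields a low rank unfolding, but the resulting matrix has ∏_{t≥2} n_t columns.

```python
import numpy as np

def unfold_mode1(T):
    """Mode-1 unfolding: an n_1 x (n_2 * ... * n_T) matrix, flattening modes 2..T in row-major order."""
    return T.reshape(T.shape[0], -1)

a, b, c = np.random.randn(4), np.random.randn(5), np.random.randn(6)
M = np.einsum('i,j,k->ijk', a, b, c)            # rank-1 order-3 tensor a (x) b (x) c
print(unfold_mode1(M).shape)                     # (4, 30)
print(np.linalg.matrix_rank(unfold_mode1(M)))    # 1
```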

61 The most obvious specialization of Theorem 2 is to the matrix completion problem: Corollary 3. [sent-152, score-0.599]

62 The Nystrom method can also be applied to the matrix completion problem, albeit under nonuniform sampling. [sent-161, score-0.599]

63 Gittens showed that if one samples O(r log r) columns, then one can exactly reconstruct a rank r matrix [12]. [sent-163, score-0.366]

64 This result requires incoherence of both row and column spaces, so it is more restrictive than ours. [sent-164, score-0.469]

65 Almost all previous results for exact matrix completion require incoherence of both row and column spaces. [sent-165, score-1.068]

66 They show that sampling the matrix according to statistical leverages of the rows and columns can eliminate the need for incoherence assumptions. [sent-168, score-0.651]

67 Specifically, when the matrix has incoherent column space, they show that by first estimating the leverages of the columns, sampling the matrix according to this distribution, and then solving the nuclear norm minimization program, one can recover the matrix with Ω(nrμ_0 log^2 n) samples. [sent-169, score-0.927]

68 The key ingredient in our proofs is a result pertaining to subspace detection, the task of testing if a subsampled vector lies in a subspace. [sent-173, score-0.211]

69 In the matrix case, this improved dependence on incoherence parameters brings our sample complexity down from nr^2 μ_0^2 log r to nr^{3/2} μ_0 log r. [sent-184, score-0.618]

70 1 Lower Bounds for Uniform Sampling We adapt the proof strategy of Candes and Tao [7] to the tensor completion problem and establish the following lower bound for uniform sampling: Theorem 5 (Passive Lower Bound). [sent-187, score-0.814]

71 There exist n_1 × ··· × n_T order-T tensors M ≠ M' of rank r with coherence parameter ≤ μ_0 such that P_Ω(M) = P_Ω(M') with probability at least δ. [sent-193, score-0.228]

72 This gives a necessary condition on the number of samples required for tensor completion. [sent-196, score-0.421]

73 Comparing with Theorem 2 shows that our procedure outperforms any passive procedure in its dependence on the tensor dimensions. [sent-199, score-0.612]

74 If m < r(Σ_t n_t), this system is underdetermined, showing that Ω((Σ_t n_t) r) observations are necessary for exact recovery, even under adaptive sampling. [sent-210, score-0.391]

75 Thus, our algorithm enjoys sample complexity with optimal dependence on matrix dimensions. [sent-211, score-0.277]

76 5 Noisy Matrix Completion Our algorithm for noisy matrix completion is an adaptation of the column subset selection (CSS) algorithm analyzed by Deshpande et al. [sent-212, score-0.813]

77 The algorithm builds a candidate column space in rounds; at each round it samples additional columns with probability proportional to their projection on the orthogonal complement of the candidate column space. [sent-214, score-0.622]

78 To concretely describe the algorithm, suppose that at the beginning of the lth round we have a candidate subspace Ul . [sent-215, score-0.265]

79 Then in the lth round, we draw s additional columns according to the distribution where the probability of drawing the ith column is proportional to ||P_{U_l^⊥} c_i||^2. [sent-216, score-0.297]

80 Observing these s columns in full and then adding them to the subspace U_l gives the candidate subspace U_{l+1} for the next round. [sent-218, score-0.508]
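
A hedged sketch of one such round (our own simplification): for clarity the residual norms below are computed from fully observed columns, whereas the paper's noisy algorithm estimates these column norms from the observed entries only.

```python
import numpy as np

def css_round(M, U, s, rng):
    """Draw s column indices with probability proportional to ||P_{U_perp} c_i||^2,
    where U is an n1 x k orthonormal basis for the current candidate subspace."""
    residual = M - U @ (U.T @ M)            # project every column onto the orthogonal complement of span(U)
    energy = np.sum(residual ** 2, axis=0)  # residual energy per column
    probs = energy / energy.sum()
    return rng.choice(M.shape[1], size=s, replace=True, p=probs)

rng = np.random.default_rng(0)
A = np.random.randn(200, 5) @ np.random.randn(5, 300)   # rank-5 signal (illustrative sizes)
M = A + 0.1 * np.random.randn(200, 300)                  # plus Gaussian noise
U = np.zeros((200, 0))                                    # empty candidate subspace at round 1
cols = css_round(M, U, s=10, rng=rng)
U, _ = np.linalg.qr(M[:, cols])                           # observe drawn columns in full, re-orthogonalize
```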

81 We assume that the matrix M ∈ R^{n_1 × n_2} can be decomposed as a rank r matrix A plus a random Gaussian matrix R whose entries are independently drawn from N(0, σ^2). [sent-223, score-0.641]

82 As before, the incoherence assumption is crucial in guaranteeing that one can estimate the column norms, and consequently sampling probabilities, from missing data. [sent-225, score-0.506]

83 Let Ω be the set of all observations over the course of the algorithm, let U_L be the subspace obtained after L = log(n_1 n_2) rounds, and let M̂ be the matrix whose columns are ĉ_i = U_L (U_{LΩ}^T U_{LΩ})^{-1} U_{LΩ}^T c_{Ωi}. [sent-227, score-0.569]
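
This reconstruction step admits a direct least-squares implementation; a minimal sketch (our own illustration), assuming U_{LΩ}^T U_{LΩ} is invertible, which is what incoherence of the learned subspace is meant to ensure:

```python
import numpy as np

def reconstruct_column(U, c_omega, omega):
    """c_hat = U (U_Omega^T U_Omega)^{-1} U_Omega^T c_Omega for an n x k basis U and
    entries c_omega observed at row indices omega (requires |omega| >= k)."""
    U_omega = U[omega, :]
    coef = np.linalg.solve(U_omega.T @ U_omega, U_omega.T @ c_omega)
    return U @ coef
```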

84 Existing results for noisy matrix completion require that the energy of the matrix is well spread out across both the rows and the columns (i. [sent-232, score-0.92]

85 when the matrix is zero except for on one column and constant across that column. [sent-237, score-0.319]

86 Figure 1: Probability of success curves for our noiseless matrix completion algorithm (top) and SVT (middle). [sent-238, score-0.362]

87 Top: Success probability as a function of: Left: p, the fraction of samples per column, Center: np, total samples per column, and Right: np log2 n, expected samples per column for passive completion. [sent-239, score-0.543]

88 Bottom: Success probability of our noiseless algorithm for different values of r as a function of p, the fraction of samples per column (left), p/r^{3/2} (middle) and p/r (right). [sent-240, score-0.286]

89 Thinking of n_1 = n_2 = n and the incoherence parameter as a constant, our results imply consistent estimation as long as σ^2 = ω(n/…). [sent-242, score-0.239]

90 6 Simulations We verify Corollary 3’s linear dependence on n in Figure 1, where we empirically compute the success probability of the algorithm for varying values of n and p = m/n, the fraction of entries observed per column. [sent-250, score-0.21]

91 Finally, in Figure 1(c), we rescale instead by n/ log2 n, which corresponds to the passive sample complexity bound [21]. [sent-255, score-0.259]

92 Empirically, the fact that these curves do not line up demonstrates that our algorithm requires fewer than log2 n samples per column, outperforming the passive bound. [sent-256, score-0.281]

93 As is apparent from the plots, SVT does not enjoy a linear dependence on n; indeed Figure 1(f) confirms the logarithmic dependency that we expect for passive matrix completion, and establishes that our algorithm has empirically better performance. [sent-258, score-0.397]

94 Figure 2: Reconstruction error as a function of row space incoherence for our noisy algorithm (CSS) and the semidefinite program of [20]. [sent-259, score-0.402]

95 Indeed, the curves line up in Figure 1(i), demonstrating that empirically, the number of samples needed per column is linear in r rather than the r^{3/2} dependence in our theorem. [sent-284, score-0.363]

96 To confirm the computational improvement over existing methods, we ran our matrix completion algorithm on large-scale matrices, recording the running time and error in Table 1. [sent-285, score-0.599]

97 As an example, recovering a 10000 × 10000 matrix of rank 100 takes close to 2 hours with the SVT, while it takes less than 5 minutes with our algorithm. [sent-288, score-0.305]

98 7 Conclusions and Open Problems In this work, we demonstrate how sequential active algorithms can offer significant improvements in time and measurement overhead over passive algorithms for matrix and tensor completion. [sent-292, score-0.727]

99 Can one generalize the nuclear norm minimization program for matrix completion to the tensor completion setting while providing theoretical guarantees on sample complexity? [sent-299, score-1.551]

100 Tensor completion and low-n-rank tensor recovery via convex optimization. [sent-339, score-0.868]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('completion', 0.437), ('tensor', 0.377), ('incoherence', 0.239), ('ul', 0.203), ('subspace', 0.179), ('matrix', 0.162), ('column', 0.157), ('passive', 0.155), ('nt', 0.129), ('mi', 0.128), ('rank', 0.109), ('svt', 0.106), ('pu', 0.105), ('balzano', 0.105), ('subtensor', 0.105), ('subtensors', 0.105), ('columns', 0.102), ('adaptive', 0.102), ('polylog', 0.095), ('nystrom', 0.085), ('emmanuel', 0.085), ('qt', 0.081), ('ak', 0.081), ('dependence', 0.08), ('nrt', 0.079), ('adaptivity', 0.076), ('row', 0.073), ('rescale', 0.069), ('deshpande', 0.069), ('incoherent', 0.069), ('nuclear', 0.069), ('sampling', 0.069), ('ui', 0.067), ('coherence', 0.062), ('ci', 0.059), ('vec', 0.058), ('noisy', 0.057), ('tensors', 0.057), ('coherent', 0.057), ('sensing', 0.056), ('recovery', 0.054), ('spikiness', 0.052), ('noiseless', 0.052), ('log', 0.051), ('success', 0.051), ('benjamin', 0.051), ('curves', 0.049), ('candidate', 0.048), ('ni', 0.047), ('negahban', 0.047), ('css', 0.046), ('latencies', 0.046), ('entries', 0.046), ('samples', 0.044), ('recht', 0.044), ('mt', 0.043), ('ryota', 0.043), ('unfold', 0.043), ('hisashi', 0.043), ('kohei', 0.043), ('krishnamurthy', 0.043), ('candes', 0.042), ('arxiv', 0.042), ('subsampling', 0.041), ('leverages', 0.041), ('missing', 0.041), ('isit', 0.04), ('compressive', 0.039), ('observing', 0.038), ('lth', 0.038), ('thinking', 0.038), ('conjecture', 0.038), ('eliminate', 0.038), ('detection', 0.038), ('orthonormal', 0.037), ('laura', 0.036), ('lives', 0.036), ('cand', 0.036), ('outer', 0.036), ('norm', 0.036), ('rounds', 0.036), ('adaptively', 0.035), ('tracking', 0.035), ('complexity', 0.035), ('recovering', 0.034), ('orthogonal', 0.034), ('entrywise', 0.034), ('hayashi', 0.034), ('tomioka', 0.034), ('singular', 0.034), ('program', 0.033), ('sequential', 0.033), ('per', 0.033), ('replacement', 0.033), ('preprint', 0.033), ('complement', 0.032), ('fix', 0.032), ('subsampled', 0.032), ('observations', 0.031), ('nr', 0.031)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000005 179 nips-2013-Low-Rank Matrix and Tensor Completion via Adaptive Sampling

Author: Akshay Krishnamurthy, Aarti Singh

Abstract: We study low rank matrix and tensor completion and propose novel algorithms that employ adaptive sampling schemes to obtain strong performance guarantees. Our algorithms exploit adaptivity to identify entries that are highly informative for learning the column space of the matrix (tensor) and consequently, our results hold even when the row space is highly coherent, in contrast with previous analyses. In the absence of noise, we show that one can exactly recover an n × n matrix of rank r from merely Ω(nr^{3/2} log(r)) matrix entries. We also show that one can recover an order T tensor using Ω(nr^{T-1/2} T^2 log(r)) entries. For noisy recovery, our algorithm consistently estimates a low rank matrix corrupted with noise using Ω(nr^{3/2} polylog(n)) entries. We complement our study with simulations that verify our theory and demonstrate the scalability of our algorithms.

2 0.38015366 11 nips-2013-A New Convex Relaxation for Tensor Completion

Author: Bernardino Romera-Paredes, Massimiliano Pontil

Abstract: We study the problem of learning a tensor from a set of linear measurements. A prominent methodology for this problem is based on a generalization of trace norm regularization, which has been used extensively for learning low rank matrices, to the tensor setting. In this paper, we highlight some limitations of this approach and propose an alternative convex relaxation on the Euclidean ball. We then describe a technique to solve the associated regularization problem, which builds upon the alternating direction method of multipliers. Experiments on one synthetic dataset and two real datasets indicate that the proposed method improves significantly over tensor trace norm regularization in terms of estimation error, while remaining computationally tractable. 1

3 0.37445924 113 nips-2013-Exact and Stable Recovery of Pairwise Interaction Tensors

Author: Shouyuan Chen, Michael R. Lyu, Irwin King, Zenglin Xu

Abstract: Tensor completion from incomplete observations is a problem of significant practical interest. However, it is unlikely that there exists an efficient algorithm with provable guarantee to recover a general tensor from a limited number of observations. In this paper, we study the recovery algorithm for pairwise interaction tensors, which has recently gained considerable attention for modeling multiple attribute data due to its simplicity and effectiveness. Specifically, in the absence of noise, we show that one can exactly recover a pairwise interaction tensor by solving a constrained convex program which minimizes the weighted sum of nuclear norms of matrices from O(nr log2 (n)) observations. For the noisy cases, we also prove error bounds for a constrained convex program for recovering the tensors. Our experiments on the synthetic dataset demonstrate that the recovery performance of our algorithm agrees well with the theory. In addition, we apply our algorithm on a temporal collaborative filtering task and obtain state-of-the-art results. 1

4 0.27843812 155 nips-2013-Learning Hidden Markov Models from Non-sequence Data via Tensor Decomposition

Author: Tzu-Kuo Huang, Jeff Schneider

Abstract: Learning dynamic models from observed data has been a central issue in many scientific studies or engineering tasks. The usual setting is that data are collected sequentially from trajectories of some dynamical system operation. In quite a few modern scientific modeling tasks, however, it turns out that reliable sequential data are rather difficult to gather, whereas out-of-order snapshots are much easier to obtain. Examples include the modeling of galaxies, chronic diseases such Alzheimer’s, or certain biological processes. Existing methods for learning dynamic model from non-sequence data are mostly based on Expectation-Maximization, which involves non-convex optimization and is thus hard to analyze. Inspired by recent advances in spectral learning methods, we propose to study this problem from a different perspective: moment matching and spectral decomposition. Under that framework, we identify reasonable assumptions on the generative process of non-sequence data, and propose learning algorithms based on the tensor decomposition method [2] to provably recover firstorder Markov models and hidden Markov models. To the best of our knowledge, this is the first formal guarantee on learning from non-sequence data. Preliminary simulation results confirm our theoretical findings. 1

5 0.27724063 295 nips-2013-Simultaneous Rectification and Alignment via Robust Recovery of Low-rank Tensors

Author: Xiaoqin Zhang, Di Wang, Zhengyuan Zhou, Yi Ma

Abstract: In this work, we propose a general method for recovering low-rank three-order tensors, in which the data can be deformed by some unknown transformation and corrupted by arbitrary sparse errors. Since the unfolding matrices of a tensor are interdependent, we introduce auxiliary variables and relax the hard equality constraints by the augmented Lagrange multiplier method. To improve the computational efficiency, we introduce a proximal gradient step to the alternating direction minimization method. We have provided proof for the convergence of the linearized version of the problem which is the inner loop of the overall algorithm. Both simulations and experiments show that our methods are more efficient and effective than previous work. The proposed method can be easily applied to simultaneously rectify and align multiple images or videos frames. In this context, the state-of-the-art algorithms “RASL” and “TILT” can be viewed as two special cases of our work, and yet each only performs part of the function of our method.

6 0.24114418 307 nips-2013-Speedup Matrix Completion with Side Information: Application to Multi-Label Learning

7 0.23697039 185 nips-2013-Matrix Completion From any Given Set of Observations

8 0.23376565 74 nips-2013-Convex Tensor Decomposition via Structured Schatten Norm Regularization

9 0.18104354 203 nips-2013-Multilinear Dynamical Systems for Tensor Time Series

10 0.17525838 91 nips-2013-Dirty Statistical Models

11 0.16489007 108 nips-2013-Error-Minimizing Estimates and Universal Entry-Wise Error Bounds for Low-Rank Matrix Completion

12 0.162066 306 nips-2013-Speeding up Permutation Testing in Neuroimaging

13 0.14881259 281 nips-2013-Robust Low Rank Kernel Embeddings of Multivariate Distributions

14 0.14601754 259 nips-2013-Provable Subspace Clustering: When LRR meets SSC

15 0.13685653 206 nips-2013-Near-Optimal Entrywise Sampling for Data Matrices

16 0.13247848 137 nips-2013-High-Dimensional Gaussian Process Bandits

17 0.13136445 254 nips-2013-Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms

18 0.12468899 286 nips-2013-Robust learning of low-dimensional dynamics from large neural ensembles

19 0.12091494 116 nips-2013-Fantope Projection and Selection: A near-optimal convex relaxation of sparse PCA

20 0.11468464 222 nips-2013-On the Linear Convergence of the Proximal Gradient Method for Trace Norm Regularization


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.263), (1, 0.139), (2, 0.219), (3, 0.388), (4, -0.026), (5, -0.314), (6, 0.034), (7, -0.062), (8, 0.037), (9, 0.027), (10, 0.011), (11, -0.002), (12, 0.004), (13, 0.081), (14, -0.009), (15, -0.052), (16, -0.111), (17, -0.035), (18, 0.024), (19, -0.0), (20, -0.025), (21, -0.041), (22, -0.046), (23, -0.081), (24, 0.013), (25, 0.062), (26, -0.038), (27, -0.09), (28, 0.023), (29, -0.041), (30, 0.016), (31, -0.022), (32, -0.009), (33, -0.009), (34, 0.061), (35, 0.028), (36, 0.061), (37, -0.0), (38, -0.01), (39, -0.026), (40, 0.001), (41, 0.014), (42, -0.004), (43, 0.028), (44, 0.072), (45, 0.04), (46, -0.017), (47, -0.057), (48, -0.017), (49, -0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94693398 179 nips-2013-Low-Rank Matrix and Tensor Completion via Adaptive Sampling

Author: Akshay Krishnamurthy, Aarti Singh

Abstract: We study low rank matrix and tensor completion and propose novel algorithms that employ adaptive sampling schemes to obtain strong performance guarantees. Our algorithms exploit adaptivity to identify entries that are highly informative for learning the column space of the matrix (tensor) and consequently, our results hold even when the row space is highly coherent, in contrast with previous analyses. In the absence of noise, we show that one can exactly recover an n × n matrix of rank r from merely Ω(nr^{3/2} log(r)) matrix entries. We also show that one can recover an order T tensor using Ω(nr^{T-1/2} T^2 log(r)) entries. For noisy recovery, our algorithm consistently estimates a low rank matrix corrupted with noise using Ω(nr^{3/2} polylog(n)) entries. We complement our study with simulations that verify our theory and demonstrate the scalability of our algorithms.

2 0.94518369 113 nips-2013-Exact and Stable Recovery of Pairwise Interaction Tensors

Author: Shouyuan Chen, Michael R. Lyu, Irwin King, Zenglin Xu

Abstract: Tensor completion from incomplete observations is a problem of significant practical interest. However, it is unlikely that there exists an efficient algorithm with provable guarantee to recover a general tensor from a limited number of observations. In this paper, we study the recovery algorithm for pairwise interaction tensors, which has recently gained considerable attention for modeling multiple attribute data due to its simplicity and effectiveness. Specifically, in the absence of noise, we show that one can exactly recover a pairwise interaction tensor by solving a constrained convex program which minimizes the weighted sum of nuclear norms of matrices from O(nr log2 (n)) observations. For the noisy cases, we also prove error bounds for a constrained convex program for recovering the tensors. Our experiments on the synthetic dataset demonstrate that the recovery performance of our algorithm agrees well with the theory. In addition, we apply our algorithm on a temporal collaborative filtering task and obtain state-of-the-art results. 1

3 0.86839193 11 nips-2013-A New Convex Relaxation for Tensor Completion

Author: Bernardino Romera-Paredes, Massimiliano Pontil

Abstract: We study the problem of learning a tensor from a set of linear measurements. A prominent methodology for this problem is based on a generalization of trace norm regularization, which has been used extensively for learning low rank matrices, to the tensor setting. In this paper, we highlight some limitations of this approach and propose an alternative convex relaxation on the Euclidean ball. We then describe a technique to solve the associated regularization problem, which builds upon the alternating direction method of multipliers. Experiments on one synthetic dataset and two real datasets indicate that the proposed method improves significantly over tensor trace norm regularization in terms of estimation error, while remaining computationally tractable. 1

4 0.84618306 295 nips-2013-Simultaneous Rectification and Alignment via Robust Recovery of Low-rank Tensors

Author: Xiaoqin Zhang, Di Wang, Zhengyuan Zhou, Yi Ma

Abstract: In this work, we propose a general method for recovering low-rank three-order tensors, in which the data can be deformed by some unknown transformation and corrupted by arbitrary sparse errors. Since the unfolding matrices of a tensor are interdependent, we introduce auxiliary variables and relax the hard equality constraints by the augmented Lagrange multiplier method. To improve the computational efficiency, we introduce a proximal gradient step to the alternating direction minimization method. We have provided proof for the convergence of the linearized version of the problem which is the inner loop of the overall algorithm. Both simulations and experiments show that our methods are more efficient and effective than previous work. The proposed method can be easily applied to simultaneously rectify and align multiple images or videos frames. In this context, the state-of-the-art algorithms “RASL” and “TILT” can be viewed as two special cases of our work, and yet each only performs part of the function of our method.

5 0.78072298 74 nips-2013-Convex Tensor Decomposition via Structured Schatten Norm Regularization

Author: Ryota Tomioka, Taiji Suzuki

Abstract: We study a new class of structured Schatten norms for tensors that includes two recently proposed norms (“overlapped” and “latent”) for convex-optimizationbased tensor decomposition. We analyze the performance of “latent” approach for tensor decomposition, which was empirically found to perform better than the “overlapped” approach in some settings. We show theoretically that this is indeed the case. In particular, when the unknown true tensor is low-rank in a specific unknown mode, this approach performs as well as knowing the mode with the smallest rank. Along the way, we show a novel duality result for structured Schatten norms, which is also interesting in the general context of structured sparsity. We confirm through numerical simulations that our theory can precisely predict the scaling behaviour of the mean squared error. 1

6 0.74246597 307 nips-2013-Speedup Matrix Completion with Side Information: Application to Multi-Label Learning

7 0.72957188 254 nips-2013-Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms

8 0.72103119 203 nips-2013-Multilinear Dynamical Systems for Tensor Time Series

9 0.71899915 185 nips-2013-Matrix Completion From any Given Set of Observations

10 0.68774509 342 nips-2013-Unsupervised Spectral Learning of Finite State Transducers

11 0.67928588 155 nips-2013-Learning Hidden Markov Models from Non-sequence Data via Tensor Decomposition

12 0.58864808 206 nips-2013-Near-Optimal Entrywise Sampling for Data Matrices

13 0.58435965 108 nips-2013-Error-Minimizing Estimates and Universal Entry-Wise Error Bounds for Low-Rank Matrix Completion

14 0.56406051 352 nips-2013-What do row and column marginals reveal about your dataset?

15 0.54434061 306 nips-2013-Speeding up Permutation Testing in Neuroimaging

16 0.54348582 186 nips-2013-Matrix factorization with binary components

17 0.54196703 222 nips-2013-On the Linear Convergence of the Proximal Gradient Method for Trace Norm Regularization

18 0.51254642 247 nips-2013-Phase Retrieval using Alternating Minimization

19 0.50986862 259 nips-2013-Provable Subspace Clustering: When LRR meets SSC

20 0.50416636 73 nips-2013-Convex Relaxations for Permutation Problems


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.012), (16, 0.045), (33, 0.187), (34, 0.088), (41, 0.041), (49, 0.019), (56, 0.164), (70, 0.025), (84, 0.151), (85, 0.06), (89, 0.06), (93, 0.043), (95, 0.039)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.93630284 178 nips-2013-Locally Adaptive Bayesian Multivariate Time Series

Author: Daniele Durante, Bruno Scarpa, David Dunson

Abstract: In modeling multivariate time series, it is important to allow time-varying smoothness in the mean and covariance process. In particular, there may be certain time intervals exhibiting rapid changes and others in which changes are slow. If such locally adaptive smoothness is not accounted for, one can obtain misleading inferences and predictions, with over-smoothing across erratic time intervals and under-smoothing across times exhibiting slow variation. This can lead to miscalibration of predictive intervals, which can be substantially too narrow or wide depending on the time. We propose a continuous multivariate stochastic process for time series having locally varying smoothness in both the mean and covariance matrix. This process is constructed utilizing latent dictionary functions in time, which are given nested Gaussian process priors and linearly related to the observed data through a sparse mapping. Using a differential equation representation, we bypass usual computational bottlenecks in obtaining MCMC and online algorithms for approximate Bayesian inference. The performance is assessed in simulations and illustrated in a financial application. 1 1.1

2 0.90723449 82 nips-2013-Decision Jungles: Compact and Rich Models for Classification

Author: Jamie Shotton, Toby Sharp, Pushmeet Kohli, Sebastian Nowozin, John Winn, Antonio Criminisi

Abstract: Randomized decision trees and forests have a rich history in machine learning and have seen considerable success in application, perhaps particularly so for computer vision. However, they face a fundamental limitation: given enough data, the number of nodes in decision trees will grow exponentially with depth. For certain applications, for example on mobile or embedded processors, memory is a limited resource, and so the exponential growth of trees limits their depth, and thus their potential accuracy. This paper proposes decision jungles, revisiting the idea of ensembles of rooted decision directed acyclic graphs (DAGs), and shows these to be compact and powerful discriminative models for classification. Unlike conventional decision trees that only allow one path to every node, a DAG in a decision jungle allows multiple paths from the root to each leaf. We present and compare two new node merging algorithms that jointly optimize both the features and the structure of the DAGs efficiently. During training, node splitting and node merging are driven by the minimization of exactly the same objective function, here the weighted sum of entropies at the leaves. Results on varied datasets show that, compared to decision forests and several other baselines, decision jungles require dramatically less memory while considerably improving generalization. 1

same-paper 3 0.90396953 179 nips-2013-Low-Rank Matrix and Tensor Completion via Adaptive Sampling

Author: Akshay Krishnamurthy, Aarti Singh

Abstract: We study low rank matrix and tensor completion and propose novel algorithms that employ adaptive sampling schemes to obtain strong performance guarantees. Our algorithms exploit adaptivity to identify entries that are highly informative for learning the column space of the matrix (tensor) and consequently, our results hold even when the row space is highly coherent, in contrast with previous analyses. In the absence of noise, we show that one can exactly recover an n × n matrix of rank r from merely Ω(nr^{3/2} log(r)) matrix entries. We also show that one can recover an order T tensor using Ω(nr^{T-1/2} T^2 log(r)) entries. For noisy recovery, our algorithm consistently estimates a low rank matrix corrupted with noise using Ω(nr^{3/2} polylog(n)) entries. We complement our study with simulations that verify our theory and demonstrate the scalability of our algorithms.

4 0.87396097 142 nips-2013-Information-theoretic lower bounds for distributed statistical estimation with communication constraints

Author: Yuchen Zhang, John Duchi, Michael Jordan, Martin J. Wainwright

Abstract: We establish lower bounds on minimax risks for distributed statistical estimation under a communication budget. Such lower bounds reveal the minimum amount of communication required by any procedure to achieve the centralized minimax-optimal rates for statistical estimation. We study two classes of protocols: one in which machines send messages independently, and a second allowing for interactive communication. We establish lower bounds for several problems, including various types of location models, as well as for parameter estimation in regression models. 1

5 0.87138891 233 nips-2013-Online Robust PCA via Stochastic Optimization

Author: Jiashi Feng, Huan Xu, Shuicheng Yan

Abstract: Robust PCA methods are typically based on batch optimization and have to load all the samples into memory during optimization. This prevents them from efficiently processing big data. In this paper, we develop an Online Robust PCA (OR-PCA) that processes one sample per time instance and hence its memory cost is independent of the number of samples, significantly enhancing the computation and storage efficiency. The proposed OR-PCA is based on stochastic optimization of an equivalent reformulation of the batch RPCA. Indeed, we show that OR-PCA provides a sequence of subspace estimations converging to the optimum of its batch counterpart and hence is provably robust to sparse corruption. Moreover, OR-PCA can naturally be applied for tracking dynamic subspace. Comprehensive simulations on subspace recovering and tracking demonstrate the robustness and efficiency advantages of the OR-PCA over online PCA and batch RPCA methods. 1

6 0.86579859 74 nips-2013-Convex Tensor Decomposition via Structured Schatten Norm Regularization

7 0.86510307 311 nips-2013-Stochastic Convex Optimization with Multiple Objectives

8 0.86473238 355 nips-2013-Which Space Partitioning Tree to Use for Search?

9 0.8637892 207 nips-2013-Near-optimal Anomaly Detection in Graphs using Lovasz Extended Scan Statistic

10 0.86014515 113 nips-2013-Exact and Stable Recovery of Pairwise Interaction Tensors

11 0.85992378 33 nips-2013-An Approximate, Efficient LP Solver for LP Rounding

12 0.85948169 333 nips-2013-Trading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent

13 0.85924435 344 nips-2013-Using multiple samples to learn mixture models

14 0.8591758 149 nips-2013-Latent Structured Active Learning

15 0.85858601 68 nips-2013-Confidence Intervals and Hypothesis Testing for High-Dimensional Statistical Models

16 0.85853833 45 nips-2013-BIG & QUIC: Sparse Inverse Covariance Estimation for a Million Variables

17 0.85825855 228 nips-2013-Online Learning of Dynamic Parameters in Social Networks

18 0.85825062 249 nips-2013-Polar Operators for Structured Sparse Estimation

19 0.85815334 177 nips-2013-Local Privacy and Minimax Bounds: Sharp Rates for Probability Estimation

20 0.85804039 116 nips-2013-Fantope Projection and Selection: A near-optimal convex relaxation of sparse PCA