jmlr jmlr2013 jmlr2013-36 knowledge-graph by maker-knowledge-mining

36 jmlr-2013-Distributions of Angles in Random Packing on Spheres


Source: pdf

Author: Tony Cai, Jianqing Fan, Tiefeng Jiang

Abstract: This paper studies the asymptotic behaviors of the pairwise angles among n randomly and uniformly distributed unit vectors in R p as the number of points n → ∞, while the dimension p is either fixed or growing with n. For both settings, we derive the limiting empirical distribution of the random angles and the limiting distributions of the extreme angles. The results reveal interesting differences in the two settings and provide a precise characterization of the folklore that “all high-dimensional random vectors are almost always nearly orthogonal to each other”. Applications to statistics and machine learning and connections with some open problems in physics and mathematics are also discussed. Keywords: random angle, uniform distribution on sphere, empirical law, maximum of random variables, minimum of random variables, extreme-value distribution, packing on sphere

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 For both settings, we derive the limiting empirical distribution of the random angles and the limiting distributions of the extreme angles. [sent-7, score-0.994]

2 Keywords: random angle, uniform distribution on sphere, empirical law, maximum of random variables, minimum of random variables, extreme-value distribution, packing on sphere 1. [sent-10, score-0.566]

3 For example, Hammersley (1950), Lord (1954), Alagar (1976) and García-Pelayo (2005) studied the distribution of the Euclidean distance between two random points on the unit sphere S^{p−1}. [sent-13, score-0.292]

4 Williams (2001) showed that, when the underlying geometric object is a sphere or an ellipsoid, the distribution has a strong connection to the neutron transport theory. [sent-14, score-0.299]

5 In this paper we consider the empirical law and extreme laws of the pairwise angles among a large number of random unit vectors. [sent-19, score-0.902]

6 More specifically, let X1 , · · · , Xn be random points independently chosen with the uniform distribution on S p−1 , the unit sphere in R p . [sent-20, score-0.292]
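
As a concrete illustration, here is a minimal sketch of our own (not code from the paper): uniform points on S^{p−1} can be generated by normalizing i.i.d. standard Gaussian vectors, and the pairwise angles Θij are the arccosines of the inner products. The helper name pairwise_angles is ours and is reused in the later sketches.

```python
# A minimal sketch (ours, not from the paper): sample n points uniformly on
# the unit sphere S^{p-1} by normalizing i.i.d. standard Gaussian vectors,
# then compute all pairwise angles Theta_ij = arccos(rho_ij).
import numpy as np

def pairwise_angles(n, p, rng=None):
    rng = np.random.default_rng(rng)
    X = rng.standard_normal((n, p))
    X /= np.linalg.norm(X, axis=1, keepdims=True)  # rows uniform on S^{p-1}
    G = np.clip(X @ X.T, -1.0, 1.0)                # cosines rho_ij = <X_i, X_j>
    iu = np.triu_indices(n, k=1)                   # pairs 1 <= i < j <= n
    return np.arccos(G[iu])                        # angles Theta_ij in [0, pi]

angles = pairwise_angles(n=100, p=5, rng=0)
print(angles.min(), angles.max())                  # Theta_min and Theta_max
```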

7 In the case of a fixed dimension, the global behavior of the angles Θij is captured by its empirical distribution µn = (n(n−1)/2)^{−1} ∑_{1≤i<j≤n} δ_{Θij}, n ≥ 2. (1) [sent-23, score-0.537]

8 In many applications it is of significant interest to consider the extreme angles Θmin and Θmax, defined by Θmin = min{Θij; 1 ≤ i < j ≤ n} (3) and Θmax = max{Θij; 1 ≤ i < j ≤ n} (4). [sent-25, score-0.550]

9 We will study both the empirical distribution of the angles Θij, 1 ≤ i < j ≤ n, and the distributions of the extreme angles Θmin and Θmax as the number of points n → ∞, while the dimension p is either fixed or growing with n. [sent-26, score-1.21]

10 The distribution of the minimum angle of n points randomly distributed on the p-dimensional unit sphere has important implications in statistics and machine learning. [sent-27, score-0.374]

11 The present paper systematically investigates the asymptotic behaviors of the random angles {Θi j ; 1 ≤ i < j ≤ n}. [sent-35, score-0.525]

12 It is shown that, when the dimension p is fixed, as n → ∞, the empirical distribution µn converges to a distribution with the density function given by h(θ) = Γ(p/2) / (√π Γ((p−1)/2)) · (sin θ)^{p−2}, θ ∈ [0, π]. [sent-36, score-0.369]

13 On the other hand, when the dimension p grows with n, it is shown that the limiting normalized empirical distribution µn,p of the random angles Θij, 1 ≤ i < j ≤ n, is Gaussian. [sent-37, score-0.746]

14 When the dimension is high, most of the angles are concentrated around π/2. [sent-38, score-0.5]

15 In addition to the empirical law of the angles Θi j , we also consider the extreme laws of the random angles in both the fixed and growing dimension settings. [sent-42, score-1.333]

16 Furthermore, the limiting distribution of the sum of the two extreme angles Θmin + Θmax is also established. [sent-44, score-0.754]

17 The distributions of the minimum and maximum angles as well as the empirical distributions of all pairwise angles have important applications in statistics. [sent-46, score-0.996]

18 The extreme laws of the random angles considered in this paper are also related to the study of the coherence of a random matrix, which is defined to be the largest magnitude of the Pearson correlation coefficients between the columns of the random matrix. [sent-56, score-0.901]

19 Section 2 studies the limiting empirical and extreme laws of the angles Θij in the setting of fixed dimension p as the number of points n goes to ∞. [sent-60, score-0.869]

20 (When the Dimension p Is Fixed) In this section we consider the limiting empirical distribution of the angles Θij, 1 ≤ i < j ≤ n, when the number of random points n → ∞ while the dimension p is fixed. [sent-66, score-0.746]

21 Throughout the paper, we let X1 , X2 , · · · , Xn be independent random points with the uniform distribution on the unit sphere S p−1 for some fixed p ≥ 2. [sent-68, score-0.292]

22 We begin with the limiting empirical distribution of the random angles. [sent-69, score-0.279]

23 Theorem 1 (Empirical Law for Fixed p) Let the empirical distribution µn of the angles Θi j , 1 ≤ i < j ≤ n, be defined as in (1). [sent-70, score-0.537]

24 Then, as n → ∞, with probability one, µn converges weakly to the distribution with density h(θ) = Γ(p/2) / (√π Γ((p−1)/2)) · (sin θ)^{p−2}, θ ∈ [0, π]. [sent-71, score-0.437]
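
A hedged numerical check of Theorem 1 (our sketch, reusing pairwise_angles from above): for fixed p and large n, the histogram of the simulated angles should track the density h(θ).

```python
# A hedged check of Theorem 1 (ours; reuses pairwise_angles from above):
# the angle histogram should track h(theta) = Gamma(p/2)/(sqrt(pi)
# * Gamma((p-1)/2)) * (sin theta)^(p-2) for fixed p and large n.
import numpy as np
from scipy.special import gammaln

def h_density(theta, p):
    log_c = gammaln(p / 2) - gammaln((p - 1) / 2) - 0.5 * np.log(np.pi)
    return np.exp(log_c) * np.sin(theta) ** (p - 2)

p = 5
angles = pairwise_angles(n=400, p=p, rng=1)
hist, edges = np.histogram(angles, bins=50, range=(0, np.pi), density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(hist - h_density(mid, p))))    # small for large n
```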

25 Theorem 1 says that the average of these angles asymptotically has the same density as that of Θ12 . [sent-74, score-0.466]

26 Theorem 1 implies that most of the angles in the total of n(n−1)/2 angles are concentrated around π/2. [sent-76, score-0.875]

27 In fact, in the extreme case when p → ∞, almost all of the n(n−1)/2 angles go to π/2 at the rate √p. [sent-78, score-0.55]

28 Figure 1 plots the function h_p(θ) = (p−2)^{−1/2} h(π/2 − θ/√(p−2)) = Γ(p/2) / (√π Γ((p−1)/2)) · (p−2)^{−1/2} · (cos(θ/√(p−2)))^{p−2}, θ ∈ [−(π/2)√(p−2), (π/2)√(p−2)], (6) which is the asymptotic density of the normalized empirical distribution µn,p defined in (2) when the dimension p is fixed. [sent-81, score-0.503]

29 This can also be seen from the asymptotic approximation h_p(θ) ∝ exp{(p − 2) log cos(θ/√(p−2))} ≈ e^{−θ²/2}. (7) [sent-85, score-0.39]
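
The approximation in (7) can be checked numerically; the sketch below (ours) compares h_p against the standard normal density e^{−θ²/2}/√(2π), which is the normalized form of the right-hand side.

```python
# Numerical check of the approximation in (7) (our sketch): after normalizing,
# h_p(theta) should approach the standard normal density as p grows.
import numpy as np
from scipy.special import gammaln

def h_p(theta, p):
    log_c = gammaln(p / 2) - gammaln((p - 1) / 2) - 0.5 * np.log(np.pi)
    s = np.sqrt(p - 2)
    return np.exp(log_c) * np.cos(theta / s) ** (p - 2) / s

theta = np.linspace(-3, 3, 7)
gauss = np.exp(-theta ** 2 / 2) / np.sqrt(2 * np.pi)
for p in (10, 100, 1000):
    print(p, np.max(np.abs(h_p(theta, p) - gauss)))  # shrinks as p grows
```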

30 We now consider the limiting distribution of the extreme angles Θmin and Θmax. [sent-86, score-0.754]

31 Then, both n^{2/(p−1)} Θmin and n^{2/(p−1)} (π − Θmax) converge weakly, as n → ∞, to the distribution given by F(x) = 1 − e^{−K x^{p−1}} if x ≥ 0 and F(x) = 0 if x < 0, (8) where K = Γ(p/2) / (4√π Γ((p+1)/2)). (9) [sent-88, score-0.269]

32 The above theorem says that the smallest angle Θmin is close to zero, and the largest angle Θmax is close to π as n grows. [sent-89, score-0.283]
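
This extreme law can be probed by simulation; the sketch below (ours, reusing pairwise_angles from above) compares the empirical CDF of n^{2/(p−1)} Θmin at its median against the limit F from (8)-(9).

```python
# A simulation sketch (ours) of the fixed-p extreme law: n^{2/(p-1)}*Theta_min
# should roughly follow F(x) = 1 - exp(-K x^{p-1}) with K as in (9).
# Reuses pairwise_angles from above.
import numpy as np
from scipy.special import gammaln

p, n, reps = 4, 200, 500
rng = np.random.default_rng(2)
K = np.exp(gammaln(p / 2) - gammaln((p + 1) / 2)) / (4 * np.sqrt(np.pi))
stats = np.array([n ** (2 / (p - 1)) * pairwise_angles(n, p, rng).min()
                  for _ in range(reps)])
x = np.quantile(stats, 0.5)
print((stats <= x).mean(), 1 - np.exp(-K * x ** (p - 1)))  # should be close
```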

33 By a classical extreme-value result (see Resnick, 2007), it is easy to check that n²Wn converges weakly to Exp(1) with probability density function e^{−x} I(x ≥ 0). [sent-110, score-0.361]

34 (Figure residue: panel titles "(a) Dist of min angles" and "(b) Sum of min and max angles"; axis tick values omitted.) [sent-155, score-1.169]

35 (a) A realization of the empirical distribution µn ; (b) The average distribution of 200 realizations of µn ; (c) the distribution of Θmin and its asymptotic distribution exp(−x/(2π))/(2π); (d) the distribution of Θmin + Θmax ; the vertical line indicating the location π. [sent-168, score-0.569]

36 Then, n^{2/(p−1)} (Θmax + Θmin − π) converges weakly to the distribution of X − Y, where X and Y are i.i.d. [sent-181, score-0.355]
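
The sentence above is truncated at "i.i.d."; the sketch below assumes X and Y are i.i.d. with the distribution F from (8)-(9), our assumption, consistent with Theorem 2, and checks only the symmetric location of X − Y around zero.

```python
# A simulation sketch (ours) of the statistic n^{2/(p-1)}*(Theta_max +
# Theta_min - pi), compared against X - Y with X, Y i.i.d. from F in (8)-(9)
# (assumed distribution). Reuses pairwise_angles from above.
import numpy as np
from scipy.special import gammaln

p, n, reps = 4, 200, 500
rng = np.random.default_rng(7)
K = np.exp(gammaln(p / 2) - gammaln((p + 1) / 2)) / (4 * np.sqrt(np.pi))
sim = []
for _ in range(reps):
    a = pairwise_angles(n, p, rng)
    sim.append(n ** (2 / (p - 1)) * (a.max() + a.min() - np.pi))
u = rng.random((2, reps))
xy = (-np.log(1 - u) / K) ** (1 / (p - 1))         # inverse-CDF draws from F
print(np.median(sim), np.median(xy[0] - xy[1]))    # both near zero
```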

37 (Figure residue: panel titles "(a) Dists of min and max angles" and "(b) Sum of min and max angles", with an empirical distribution overlay; axis tick values omitted.) [sent-199, score-1.383]

38 (a) A realization of the normalized empirical distribution µn,p given by (2); (b) The average distribution of 200 realizations of µn,p ; (c) the distribution of Θmin and its asymptotic distribution; (d) the distribution of Θmin + Θmax ; the vertical line indicating the location π. [sent-215, score-0.493]

39 This is clearly different from the limiting distribution given in Theorem 1 when the dimension p is fixed. [sent-227, score-0.25]

40 (Figure residue: panel titles "(a) Dists of min and max angles" and "(b) Sum of min and max angles", with an empirical distribution overlay; axis tick values omitted.) [sent-242, score-1.383]

41 (a) A realization of the normalized empirical distribution µn,p given by (2); (b) The average distribution of 200 realizations of µn,p ; (c) the distribution of Θmin and its asymptotic distribution; (d) the distribution of Θmin + Θmax ; the vertical line indicating the location π. [sent-259, score-0.493]

42 Then, with probability one, µn,p converges weakly to N(0, 1) as n → ∞. [sent-262, score-0.316]

43 The theorem implies that most of the n(n−1)/2 random angles go to π/2 very quickly. [sent-265, score-0.505]

44 Take any γp → 0 such that pγp² → ∞, and denote by Nn,p the number of the angles Θij that are within γp of π/2, that is, |π/2 − Θij| ≤ γp. [sent-266, score-0.421]
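
A quick sketch of our own illustrating this concentration: with γp → 0 but pγp² → ∞, the fraction of angles within γp of π/2 approaches one.

```python
# A quick sketch (ours) of the concentration claim: the fraction of angles
# within gamma_p of pi/2 tends to one when gamma_p -> 0 but
# p * gamma_p^2 -> infinity. Reuses pairwise_angles from above.
import numpy as np

n = 100
for p in (50, 500, 5000):
    gamma_p = p ** (-0.4)       # gamma_p -> 0 while p * gamma_p**2 -> infinity
    ang = pairwise_angles(n, p, rng=6)
    print(p, np.mean(np.abs(ang - np.pi / 2) <= gamma_p))  # approaches 1
```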

45 Note that cos ε ≤ 1 − ε²/2 + ε⁴/24, so P(|Θ − π/2| ≥ √(c log p / p)) ≤ K√p (1 − c log p/(2p) + c² log² p/(24p²))^{p−2} ≤ K′ p^{−(c−1)/2} for all sufficiently large p, where K′ is a constant depending only on c. [sent-279, score-0.415]

46 We now turn to the limiting extreme laws of the angles when both n and p → ∞. [sent-282, score-0.783]

47 For the extreme laws, it is necessary to divide into three asymptotic regimes: the sub-exponential case (1/p) log n → 0, the exponential case (1/p) log n → β ∈ (0, ∞), and the super-exponential case (1/p) log n → ∞. [sent-283, score-0.48]

48 The limiting extreme laws are different in these three regimes. [sent-284, score-0.362]

49 As n → ∞, 2p log sin Θmin + 4 log n − log log n converges weakly to the extreme-value distribution with the distribution function F(y) = 1 − exp(−K e^{y/2}), y ∈ ℝ, and K = 1/(4√(2π)). [sent-288, score-1.003]
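
A simulation sketch of our own for this sub-exponential extreme law, comparing the empirical CDF of the normalized statistic at its median against F.

```python
# A simulation sketch (ours) of the sub-exponential extreme law, log(n)/p
# small: 2p*log(sin Theta_min) + 4*log(n) - log(log(n)) compared to
# F(y) = 1 - exp(-K e^{y/2}), K = 1/(4*sqrt(2*pi)). Reuses pairwise_angles.
import numpy as np

p, n, reps = 200, 50, 300
rng = np.random.default_rng(3)
K = 1 / (4 * np.sqrt(2 * np.pi))
stats = np.array([2 * p * np.log(np.sin(pairwise_angles(n, p, rng).min()))
                  + 4 * np.log(n) - np.log(np.log(n)) for _ in range(reps)])
y = np.median(stats)
print((stats <= y).mean(), 1 - np.exp(-K * np.exp(y / 2)))  # roughly equal
```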

50 The above extreme value distribution differs from that in (8) where the dimension p is fixed. [sent-291, score-0.251]

51 Then p cos² Θmin − 4 log n + log log n converges weakly to a distribution with the cumulative distribution function exp{−(1/(4√(2π))) e^{−(y+8α)/2}}, y ∈ ℝ. [sent-294, score-0.627]

52 As n → ∞, 2p log sin Θmin + 4 log n − log log n converges weakly to a distribution with the distribution function F(y) = 1 − exp{−K(β) e^{(y+8β)/2}}, y ∈ ℝ, where K(β) = (β / (8π(1 − e^{−4β})))^{1/2}, and the conclusion still holds if Θmin is replaced by Θmax. [sent-298, score-0.95]

53 As n → ∞, 2p log sin Θmin + (4p/(p−1)) log n − log p converges weakly to the extreme-value distribution with the distribution function F(y) = 1 − exp(−K e^{y/2}), y ∈ ℝ, with K = 1/(2√(2π)). [sent-304, score-0.909]

54 Theorem 3 provides the limiting distribution of Θmax + Θmin − π when the dimension p is fixed. [sent-311, score-0.25]

55 Remark 10 As mentioned in the introduction, Cai and Jiang (2011, 2012) considered the limiting distribution of the coherence of a random matrix, which is closely related to the minimum angle Θmin. [sent-314, score-0.518]

56 This maximum is analyzed by modifying the proofs of the results for the limiting distribution of the coherence Ln,p in Cai and Jiang (2012). [sent-318, score-0.285]

57 This provides the minimum-angle test for sphericity, or the packing test for sphericity. [sent-337, score-0.317]
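
One plausible implementation of such a test, our construction rather than the paper's exact recipe: under uniformity, n^{2/(p−1)} Θmin approximately follows F from (8)-(9), so an unusually small minimum angle yields a small left-tail probability and signals clustering.

```python
# A hedged sketch (our construction, not the paper's exact recipe) of the
# minimum-angle packing test: under uniformity on the sphere,
# n^{2/(p-1)}*Theta_min approximately follows F from (8)-(9), so an unusually
# small Theta_min (small left-tail probability) suggests clustering.
import numpy as np
from scipy.special import gammaln

def packing_test_pvalue(X):
    n, p = X.shape
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    G = np.clip(X @ X.T, -1.0, 1.0)
    theta_min = np.arccos(G[np.triu_indices(n, k=1)]).min()
    K = np.exp(gammaln(p / 2) - gammaln((p + 1) / 2)) / (4 * np.sqrt(np.pi))
    x = n ** (2 / (p - 1)) * theta_min
    return 1 - np.exp(-K * x ** (p - 1))  # small value -> reject sphericity

rng = np.random.default_rng(4)
print(packing_test_pvalue(rng.standard_normal((100, 4))))        # uniform data
print(packing_test_pvalue(rng.standard_normal((100, 4)) + 3.0))  # clustered
```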

58 (Table 1 caption: The power (percent of rejections) of the packing test, based on 2000 simulations; table entries omitted.) The packing test does not examine whether there is a gap in the data on the sphere. [sent-368, score-0.4]

59 (Discussions) We have established the limiting empirical and extreme laws of the angles between random unit vectors, both for the fixed dimension and growing dimension cases. [sent-387, score-1.026]

60 For fixed p, we study the empirical law of angles, the extreme law of angles and the law of the sum of the largest and smallest angles in Theorems 1, 2 and 3. [sent-388, score-1.299]

61 Assuming p is large, we establish the empirical law of random angles in Theorem 4. [sent-389, score-0.592]

62 The study of the random angles Θi j ’s, Θmin and Θmax is also related to several problems in machine learning as well as some deterministic open problems in physics and mathematics. [sent-396, score-0.512]

63 (Connections to Machine Learning) Our studies shed light on random geometric graphs, which are formed by n random points on the p-dimensional unit sphere as vertices, with an edge connecting points Xi and Xj if Θij > δ for a certain δ (Penrose, 2003; Devroye et al. [sent-399, score-0.282]

64 (2013) showed an interesting asymptotic conical structure in the critical sample eigenvectors under a spiked covariance model when the ratio between the dimension and the product of the sample size with the spike size converges to a nonzero constant. [sent-410, score-0.266]

65 The randomness of the eigenvectors within the cones is related to the behavior of the random angles studied in the present paper. [sent-413, score-0.489]

66 (Connections to Some Open Problems in Mathematics and Physics) The results on random angles established in this paper can potentially be used to study a number of open deterministic problems in mathematics and physics. [sent-418, score-0.456]

67 Note that ‖xi − xj‖² = 2(1 − cos θij), where θij is the angle between the vectors Oxi and Oxj. [sent-436, score-0.344]

68 Then, for points x1, ···, xn ∈ S^{p−1}, 1/(2E(R, −∞)²) = max_{1≤i<j≤n} (1 − cos θij) = 1 − cos Θ̃max, where Θ̃max = max{θij; 1 ≤ i < j ≤ n}. [sent-437, score-0.559]

69 Then, it is not difficult to see that 1/(2ε(R, −∞)²) = sup_R 1/(2E(R, −∞)²) = sup_R (1 − cos Θ̃max) = 1 − cos ∆, where ∆ := ess sup(Θmax) is the essential upper bound of the random variable Θmax as defined in (4). [sent-442, score-0.55]

70 Hence ε(R, −∞) = 1/√(2(1 − cos ∆)). (10) The essential upper bound ∆ of the random variable Θmax can be approximated by random sampling of Θmax. [sent-444, score-0.297]
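
A sketch of our own of the stochastic approach just described: approximate ∆ by the maximum of Θmax over many random configurations, then plug the estimate into the expression in (10).

```python
# A sketch (ours) of the stochastic approach: approximate the essential
# supremum Delta of Theta_max by the maximum over many random configurations,
# then plug it into 1/sqrt(2*(1 - cos(Delta))) as in (10).
# Reuses pairwise_angles from above.
import numpy as np

n, p, reps = 30, 3, 2000
rng = np.random.default_rng(5)
delta_hat = max(pairwise_angles(n, p, rng).max() for _ in range(reps))
print(delta_hat, 1 / np.sqrt(2 * (1 - np.cos(delta_hat))))
```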

71 So the approach outlined above provides a direct way of using a stochastic method to study these deterministic problems and establishes connections between the random angles and the open problems mentioned above. [sent-445, score-0.505]

72 (Technical Results) Recall that X1, X2, ··· are random points independently chosen with the uniform distribution on S^{p−1}, the unit sphere in R^p, Θij is the angle between OXi and OXj, and ρij = cos Θij for any i ≠ j. [sent-453, score-0.636]

73 (using Stirling's formula, see p. 204 from Ahlfors (1979)): log Γ(z) = z log z − z − (1/2) log z + (1/2) log 2π + O(1/x) as x = Re(z) → ∞, it is easy to verify that Γ(p/2) / Γ((p−1)/2) ∼ √(p/2) (17) as p → ∞. [sent-499, score-0.376]
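
A quick numerical check of (17), our sketch using log-gamma to avoid overflow:

```python
# Quick numerical check (ours) of (17): Gamma(p/2)/Gamma((p-1)/2) ~ sqrt(p/2).
import numpy as np
from scipy.special import gammaln

for p in (10, 100, 10000):
    ratio = np.exp(gammaln(p / 2) - gammaln((p - 1) / 2))
    print(p, ratio / np.sqrt(p / 2))  # tends to 1 as p grows
```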

74 (Proofs of Main Results in Section 2) Lemma 16 Let X1, X2, ··· be independent random points with the uniform distribution on the unit sphere in R^p. [sent-504, score-0.292]

75 Then, with probability one, µn in (1) converges weakly to µ as n → ∞. [sent-506, score-0.316]

76 If ϕn(Θ12) converges weakly to a probability measure ν as n → ∞, then, with probability one, νn := (n(n−1)/2)^{−1} ∑_{1≤i<j≤n} δ_{ϕn(Θij)} (18) converges weakly to ν as n → ∞. [sent-508, score-0.632]

77 It follows that, with probability one, µn in (1) converges weakly to µ as n → ∞. [sent-523, score-0.316]

78 (ii) Since ϕn(Θ12) converges weakly to ν as n → ∞, we know that, for any bounded continuous function u(x) defined on ℝ, E u(ϕn(Θ12)) → ∫_{−∞}^{∞} u(x) dν(x) as n → ∞. [sent-524, score-0.279]

79 Reviewing the definition of νn in (18), the above asserts that, with probability one, νn converges weakly to ν as n → ∞. [sent-529, score-0.316]

80 Recall X1, ···, Xn are random points independently chosen with the uniform distribution on S^{p−1}, the unit sphere in R^p, and Θij is the angle between OXi and OXj and ρij = cos Θij for all 1 ≤ i, j ≤ n. [sent-532, score-0.636]

81 First, since Mn = cos Θmin by (3), use the identity 1 − cos h = 2 sin²(h/2) for all h ∈ ℝ to get n^{4/(p−1)}(1 − Mn) = 2 n^{4/(p−1)} sin²(Θmin/2). (27) [sent-565, score-0.454]

82 By Proposition 17 and the Slutsky lemma, sin(Θmin/2) → 0 in probability as n → ∞. [sent-566, score-0.291]

83 By Proposition 17 and the Slutsky lemma again, (1/2) n^{4/(p−1)} Θmin² converges in distribution to F1(x) as in Proposition 17. [sent-569, score-0.327]

84 K = 2^{(1−p)/2} K1 = Γ(p/2) / (4√π Γ((p+1)/2)); (29) n^{2/(p−1)}(π − Θmax) converges weakly to F(x) as n → ∞. [sent-571, score-0.279]

85 By the standard subsequence argument, we obtain that Qn converges weakly to the distribution of (X,Y ) as n → ∞. [sent-585, score-0.355]

86 We claim that Yn converges weakly to N(0, 1) (40) as n → ∞. [sent-615, score-0.279]

87 Assuming this is true, taking ϕn(θ) = √p (π/2 − θ) for θ ∈ [0, π] and ν = N(0, 1) in (ii) of Lemma 16, then, with probability one, µn,p converges weakly to N(0, 1) as n → ∞. [sent-616, score-0.316]

88 In fact, noticing Θ12 has density h(θ) in (13), it is easy to see that Yn has density function hn(y) := (1/√p) h(π/2 − y/√p) = Γ(p/2) / (√π Γ((p−1)/2)) · (sin(π/2 − y/√p))^{p−2} · (1/√p) = Γ(p/2) / (√(pπ) Γ((p−1)/2)) · (cos(y/√p))^{p−2} (41) for any y ∈ ℝ as n is sufficiently large, since lim_{n→∞} pn = ∞. [sent-618, score-0.589]

89 Then, as n → ∞, pTn + 4 log n − log log n converges weakly to an extreme-value distribution with the distribution function F(y) = 1 − exp(−K e^{y/2}), y ∈ ℝ, and K = 1/(2√(8π)) = 1/(4√(2π)). [sent-638, score-0.842]

90 From (11) we know Mn = max_{i<j} ρij = cos Θmin and Θmin ∈ [0, π]; (44) Tn = log(1 − Mn²) = 2 log sin Θmin. (45) [sent-639, score-0.569]

91 Now, observe that min_{1≤i<j≤n} {π − Θij} = π − Θmax and sin(π − Θmax) = sin Θmax. [sent-641, score-0.254]

92 Finally, by the same argument between (30) and (31) again, and by (46) we obtain that 2p log sin Θmax + 4 log n − log log n converges weakly to F(y) = 1 − exp(−K e^{y/2}), y ∈ ℝ, with K = 1/(4√(2π)). [sent-645, score-0.798]

93 Replacing Ln and Theorem 1 there by Mn and Theorem 6, we get that pMn² − 4 log n + log log n converges weakly to the distribution function exp{−(1/(4√(2π))) e^{−(y+8α)/2}}; the conclusion follows since Mn = cos Θmin. [sent-650, score-0.864]

94 Then, as n → ∞, pTn + 4 log n − log log n converges weakly to the distribution function F(y) = 1 − exp{−K(β) e^{(y+8β)/2}}, y ∈ ℝ, where K(β) = (1/2)(β / (2π(1 − e^{−4β})))^{1/2} = (β / (8π(1 − e^{−4β})))^{1/2}. [sent-660, score-0.637]

95 From (44) and (45) we obtain Θmin → cos^{−1}(√(1 − e^{−4β})) in probability (47), and 2p log sin Θmin + 4 log n − log log n (48) converges weakly to the distribution function F(y) = 1 − exp{−K(β) e^{(y+8β)/2}}, y ∈ ℝ, where K(β) = (β / (8π(1 − e^{−4β})))^{1/2} (49), as n → ∞. [sent-661, score-0.911]

96 Now, reviewing (46) and the argument between (30) and (31), by (47) and (48), we conclude that Θmax → π − cos^{−1}(√(1 − e^{−4β})) in probability and 2p log sin Θmax + 4 log n − log log n converges weakly to the distribution function F(y) as in (49). [sent-662, score-0.911]

97 ii) As n → ∞, pTn + (4p/(p−1)) log n − log p converges weakly to the distribution function F(y) = 1 − exp(−K e^{y/2}), y ∈ ℝ, with K = 1/(2√(2π)). [sent-671, score-0.543]

98 Combining i), ii), (44) and (45), we see that, as n → ∞, Θmin → 0 in probability, and 2p log sin Θmin + (4p/(p−1)) log n − log p converges weakly to F(y) = 1 − exp(−K e^{y/2}), y ∈ ℝ, with K = 1/(2√(2π)). [sent-672, score-0.704]

99 Finally, combining the above two convergence results with (46) and the argument between (30) and (31), we have Θmax → π in probability, and 2p log sin Θmax + (4p/(p−1)) log n − log p converges weakly to F(y) = 1 − exp(−K e^{y/2}), y ∈ ℝ, with K = 1/(2√(2π)). [sent-673, score-0.704]

100 Phase transition in limiting distributions of coherence of high-dimensional random matrices. [sent-702, score-0.281]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('angles', 0.421), ('cos', 0.227), ('iang', 0.222), ('ngles', 0.206), ('pheres', 0.206), ('packing', 0.2), ('weakly', 0.193), ('istributions', 0.176), ('mn', 0.172), ('jiang', 0.163), ('andom', 0.146), ('sphere', 0.145), ('sin', 0.143), ('fan', 0.134), ('cai', 0.133), ('extreme', 0.129), ('limiting', 0.128), ('angle', 0.117), ('min', 0.111), ('tiefeng', 0.111), ('laws', 0.105), ('max', 0.105), ('law', 0.096), ('log', 0.094), ('spurious', 0.09), ('converges', 0.086), ('coherence', 0.081), ('oxi', 0.079), ('ke', 0.079), ('distribution', 0.076), ('pn', 0.071), ('ty', 0.069), ('dist', 0.069), ('tx', 0.069), ('asymptotic', 0.069), ('limn', 0.063), ('katanforoush', 0.063), ('ess', 0.061), ('correlation', 0.06), ('hn', 0.058), ('physics', 0.056), ('lemma', 0.054), ('tn', 0.053), ('ai', 0.053), ('tony', 0.053), ('eu', 0.05), ('connections', 0.049), ('theorem', 0.049), ('deli', 0.047), ('eun', 0.047), ('folklore', 0.047), ('jianqing', 0.047), ('kuijlaars', 0.047), ('neutron', 0.047), ('shahshahani', 0.047), ('dimension', 0.046), ('density', 0.045), ('realizations', 0.043), ('un', 0.043), ('ox', 0.042), ('pairwise', 0.04), ('growing', 0.04), ('empirical', 0.04), ('cp', 0.04), ('ii', 0.039), ('dx', 0.038), ('extremal', 0.038), ('probability', 0.037), ('distributions', 0.037), ('realization', 0.037), ('ave', 0.037), ('xn', 0.036), ('unit', 0.036), ('random', 0.035), ('kendall', 0.034), ('kx', 0.034), ('eigenvectors', 0.033), ('concentrated', 0.033), ('arxiv', 0.032), ('armentano', 0.032), ('arratia', 0.032), ('assertions', 0.032), ('castro', 0.032), ('conical', 0.032), ('coulomb', 0.032), ('diaconis', 0.032), ('dists', 0.032), ('gautier', 0.032), ('isotropicity', 0.032), ('nimum', 0.032), ('pmn', 0.032), ('saff', 0.032), ('slusky', 0.032), ('stoyan', 0.032), ('tammes', 0.032), ('weidong', 0.032), ('wilfrid', 0.032), ('vy', 0.032), ('proposition', 0.031), ('geometric', 0.031)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.000001 36 jmlr-2013-Distributions of Angles in Random Packing on Spheres

Author: Tony Cai, Jianqing Fan, Tiefeng Jiang

Abstract: This paper studies the asymptotic behaviors of the pairwise angles among n randomly and uniformly distributed unit vectors in R p as the number of points n → ∞, while the dimension p is either fixed or growing with n. For both settings, we derive the limiting empirical distribution of the random angles and the limiting distributions of the extreme angles. The results reveal interesting differences in the two settings and provide a precise characterization of the folklore that “all high-dimensional random vectors are almost always nearly orthogonal to each other”. Applications to statistics and machine learning and connections with some open problems in physics and mathematics are also discussed. Keywords: random angle, uniform distribution on sphere, empirical law, maximum of random variables, minimum of random variables, extreme-value distribution, packing on sphere

2 0.11312951 34 jmlr-2013-Distance Preserving Embeddings for General n-Dimensional Manifolds

Author: Nakul Verma

Abstract: Low dimensional embeddings of manifold data have gained popularity in the last decade. However, a systematic finite sample analysis of manifold embedding algorithms largely eludes researchers. Here we present two algorithms that embed a general n-dimensional manifold into Rd (where d only depends on some key manifold properties such as its intrinsic dimension, volume and curvature) that guarantee to approximately preserve all interpoint geodesic distances. Keywords: manifold learning, isometric embeddings, non-linear dimensionality reduction, Nash’s embedding theorem

3 0.10331292 79 jmlr-2013-On the Mutual Nearest Neighbors Estimate in Regression

Author: Arnaud Guyader, Nick Hengartner

Abstract: Motivated by promising experimental results, this paper investigates the theoretical properties of a recently proposed nonparametric estimator, called the Mutual Nearest Neighbors rule, which estimates the regression function m(x) = E[Y |X = x] as follows: first identify the k nearest neighbors of x in the sample Dn , then keep only those for which x is itself one of the k nearest neighbors, and finally take the average over the corresponding response variables. We prove that this estimator is consistent and that its rate of convergence is optimal. Since the estimate with the optimal rate of convergence depends on the unknown distribution of the observations, we also present adaptation results by data-splitting. Keywords: nonparametric estimation, nearest neighbor methods, mathematical statistics

4 0.072219931 4 jmlr-2013-A Max-Norm Constrained Minimization Approach to 1-Bit Matrix Completion

Author: Tony Cai, Wen-Xin Zhou

Abstract: We consider in this paper the problem of noisy 1-bit matrix completion under a general non-uniform sampling distribution using the max-norm as a convex relaxation for the rank. A max-norm constrained maximum likelihood estimate is introduced and studied. The rate of convergence for the estimate is obtained. Information-theoretical methods are used to establish a minimax lower bound under the general sampling model. The minimax upper and lower bounds together yield the optimal rate of convergence for the Frobenius norm loss. Computational algorithms and numerical performance are also discussed. Keywords: 1-bit matrix completion, low-rank matrix, max-norm, trace-norm, constrained optimization, maximum likelihood estimate, optimal rate of convergence

5 0.071226202 50 jmlr-2013-Greedy Feature Selection for Subspace Clustering

Author: Eva L. Dyer, Aswin C. Sankaranarayanan, Richard G. Baraniuk

Abstract: Unions of subspaces provide a powerful generalization of single subspace models for collections of high-dimensional data; however, learning multiple subspaces from data is challenging due to the fact that segmentation—the identification of points that live in the same subspace—and subspace estimation must be performed simultaneously. Recently, sparse recovery methods were shown to provide a provable and robust strategy for exact feature selection (EFS)—recovering subsets of points from the ensemble that live in the same subspace. In parallel with recent studies of EFS with ℓ1 -minimization, in this paper, we develop sufficient conditions for EFS with a greedy method for sparse signal recovery known as orthogonal matching pursuit (OMP). Following our analysis, we provide an empirical study of feature selection strategies for signals living on unions of subspaces and characterize the gap between sparse recovery methods and nearest neighbor (NN)-based approaches. In particular, we demonstrate that sparse recovery methods provide significant advantages over NN methods and that the gap between the two approaches is particularly pronounced when the sampling of subspaces in the data set is sparse. Our results suggest that OMP may be employed to reliably recover exact feature sets in a number of regimes where NN approaches fail to reveal the subspace membership of points in the ensemble. Keywords: subspace clustering, unions of subspaces, hybrid linear models, sparse approximation, structured sparsity, nearest neighbors, low-rank approximation

6 0.06882266 102 jmlr-2013-Sparse Matrix Inversion with Scaled Lasso

7 0.063187093 20 jmlr-2013-CODA: High Dimensional Copula Discriminant Analysis

8 0.057235669 47 jmlr-2013-Gaussian Kullback-Leibler Approximate Inference

9 0.054746836 77 jmlr-2013-On the Convergence of Maximum Variance Unfolding

10 0.054637399 119 jmlr-2013-Variable Selection in High-Dimension with Random Designs and Orthogonal Matching Pursuit

11 0.054623879 84 jmlr-2013-PC Algorithm for Nonparanormal Graphical Models

12 0.053801466 35 jmlr-2013-Distribution-Dependent Sample Complexity of Large Margin Learning

13 0.052590765 27 jmlr-2013-Consistent Selection of Tuning Parameters via Variable Selection Stability

14 0.050635051 26 jmlr-2013-Conjugate Relation between Loss Functions and Uncertainty Sets in Classification Problems

15 0.050165597 104 jmlr-2013-Sparse Single-Index Model

16 0.049690884 93 jmlr-2013-Random Walk Kernels and Learning Curves for Gaussian Process Regression on Random Graphs

17 0.049570747 97 jmlr-2013-Risk Bounds of Learning Processes for Lévy Processes

18 0.048088886 64 jmlr-2013-Lovasz theta function, SVMs and Finding Dense Subgraphs

19 0.044969562 5 jmlr-2013-A Near-Optimal Algorithm for Differentially-Private Principal Components

20 0.044462051 31 jmlr-2013-Derivative Estimation with Local Polynomial Fitting


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.225), (1, 0.076), (2, 0.094), (3, -0.011), (4, 0.028), (5, -0.078), (6, 0.02), (7, 0.051), (8, 0.122), (9, 0.009), (10, -0.17), (11, 0.007), (12, 0.069), (13, 0.021), (14, -0.125), (15, 0.118), (16, -0.008), (17, -0.109), (18, -0.075), (19, -0.147), (20, 0.198), (21, -0.031), (22, -0.083), (23, -0.079), (24, -0.047), (25, 0.157), (26, -0.004), (27, -0.094), (28, -0.035), (29, 0.057), (30, 0.061), (31, 0.102), (32, -0.039), (33, 0.192), (34, 0.07), (35, -0.033), (36, -0.043), (37, 0.022), (38, 0.127), (39, 0.05), (40, -0.135), (41, 0.104), (42, 0.02), (43, 0.036), (44, -0.021), (45, -0.204), (46, 0.09), (47, 0.082), (48, -0.063), (49, -0.026)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93859524 36 jmlr-2013-Distributions of Angles in Random Packing on Spheres

Author: Tony Cai, Jianqing Fan, Tiefeng Jiang

Abstract: This paper studies the asymptotic behaviors of the pairwise angles among n randomly and uniformly distributed unit vectors in R p as the number of points n → ∞, while the dimension p is either fixed or growing with n. For both settings, we derive the limiting empirical distribution of the random angles and the limiting distributions of the extreme angles. The results reveal interesting differences in the two settings and provide a precise characterization of the folklore that “all high-dimensional random vectors are almost always nearly orthogonal to each other”. Applications to statistics and machine learning and connections with some open problems in physics and mathematics are also discussed. Keywords: random angle, uniform distribution on sphere, empirical law, maximum of random variables, minimum of random variables, extreme-value distribution, packing on sphere

2 0.5359835 79 jmlr-2013-On the Mutual Nearest Neighbors Estimate in Regression

Author: Arnaud Guyader, Nick Hengartner

Abstract: Motivated by promising experimental results, this paper investigates the theoretical properties of a recently proposed nonparametric estimator, called the Mutual Nearest Neighbors rule, which estimates the regression function m(x) = E[Y |X = x] as follows: first identify the k nearest neighbors of x in the sample Dn , then keep only those for which x is itself one of the k nearest neighbors, and finally take the average over the corresponding response variables. We prove that this estimator is consistent and that its rate of convergence is optimal. Since the estimate with the optimal rate of convergence depends on the unknown distribution of the observations, we also present adaptation results by data-splitting. Keywords: nonparametric estimation, nearest neighbor methods, mathematical statistics

3 0.45807573 4 jmlr-2013-A Max-Norm Constrained Minimization Approach to 1-Bit Matrix Completion

Author: Tony Cai, Wen-Xin Zhou

Abstract: We consider in this paper the problem of noisy 1-bit matrix completion under a general non-uniform sampling distribution using the max-norm as a convex relaxation for the rank. A max-norm constrained maximum likelihood estimate is introduced and studied. The rate of convergence for the estimate is obtained. Information-theoretical methods are used to establish a minimax lower bound under the general sampling model. The minimax upper and lower bounds together yield the optimal rate of convergence for the Frobenius norm loss. Computational algorithms and numerical performance are also discussed. Keywords: 1-bit matrix completion, low-rank matrix, max-norm, trace-norm, constrained optimization, maximum likelihood estimate, optimal rate of convergence

4 0.45594126 119 jmlr-2013-Variable Selection in High-Dimension with Random Designs and Orthogonal Matching Pursuit

Author: Antony Joseph

Abstract: The performance of orthogonal matching pursuit (OMP) for variable selection is analyzed for random designs. When contrasted with the deterministic case, since the performance is here measured after averaging over the distribution of the design matrix, one can have far less stringent sparsity constraints on the coefficient vector. We demonstrate that for exact sparse vectors, the performance of the OMP is similar to known results on the Lasso algorithm (Wainwright, 2009). Moreover, variable selection under a more relaxed sparsity assumption on the coefficient vector, whereby one has only control on the ℓ1 norm of the smaller coefficients, is also analyzed. As consequence of these results, we also show that the coefficient estimate satisfies strong oracle type inequalities. Keywords: high dimensional regression, greedy algorithms, Lasso, compressed sensing

5 0.45315102 50 jmlr-2013-Greedy Feature Selection for Subspace Clustering

Author: Eva L. Dyer, Aswin C. Sankaranarayanan, Richard G. Baraniuk

Abstract: Unions of subspaces provide a powerful generalization of single subspace models for collections of high-dimensional data; however, learning multiple subspaces from data is challenging due to the fact that segmentation—the identification of points that live in the same subspace—and subspace estimation must be performed simultaneously. Recently, sparse recovery methods were shown to provide a provable and robust strategy for exact feature selection (EFS)—recovering subsets of points from the ensemble that live in the same subspace. In parallel with recent studies of EFS with ℓ1 -minimization, in this paper, we develop sufficient conditions for EFS with a greedy method for sparse signal recovery known as orthogonal matching pursuit (OMP). Following our analysis, we provide an empirical study of feature selection strategies for signals living on unions of subspaces and characterize the gap between sparse recovery methods and nearest neighbor (NN)-based approaches. In particular, we demonstrate that sparse recovery methods provide significant advantages over NN methods and that the gap between the two approaches is particularly pronounced when the sampling of subspaces in the data set is sparse. Our results suggest that OMP may be employed to reliably recover exact feature sets in a number of regimes where NN approaches fail to reveal the subspace membership of points in the ensemble. Keywords: subspace clustering, unions of subspaces, hybrid linear models, sparse approximation, structured sparsity, nearest neighbors, low-rank approximation

6 0.43142182 20 jmlr-2013-CODA: High Dimensional Copula Discriminant Analysis

7 0.43079513 34 jmlr-2013-Distance Preserving Embeddings for General n-Dimensional Manifolds

8 0.38624704 35 jmlr-2013-Distribution-Dependent Sample Complexity of Large Margin Learning

9 0.35360935 102 jmlr-2013-Sparse Matrix Inversion with Scaled Lasso

10 0.31664506 64 jmlr-2013-Lovasz theta function, SVMs and Finding Dense Subgraphs

11 0.30724019 39 jmlr-2013-Efficient Active Learning of Halfspaces: An Aggressive Approach

12 0.30213737 84 jmlr-2013-PC Algorithm for Nonparanormal Graphical Models

13 0.29901645 47 jmlr-2013-Gaussian Kullback-Leibler Approximate Inference

14 0.2817359 104 jmlr-2013-Sparse Single-Index Model

15 0.28063437 14 jmlr-2013-Asymptotic Results on Adaptive False Discovery Rate Controlling Procedures Based on Kernel Estimators

16 0.27457854 26 jmlr-2013-Conjugate Relation between Loss Functions and Uncertainty Sets in Classification Problems

17 0.27412054 77 jmlr-2013-On the Convergence of Maximum Variance Unfolding

18 0.26672325 2 jmlr-2013-A Binary-Classification-Based Metric between Time-Series Distributions and Its Use in Statistical and Learning Problems

19 0.26238406 6 jmlr-2013-A Plug-in Approach to Neyman-Pearson Classification

20 0.25734478 74 jmlr-2013-Multivariate Convex Regression with Adaptive Partitioning


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.018), (5, 0.152), (6, 0.025), (10, 0.077), (20, 0.013), (23, 0.032), (44, 0.011), (68, 0.02), (70, 0.037), (75, 0.035), (85, 0.036), (87, 0.024), (94, 0.425)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.70567298 36 jmlr-2013-Distributions of Angles in Random Packing on Spheres

Author: Tony Cai, Jianqing Fan, Tiefeng Jiang

Abstract: This paper studies the asymptotic behaviors of the pairwise angles among n randomly and uniformly distributed unit vectors in R p as the number of points n → ∞, while the dimension p is either fixed or growing with n. For both settings, we derive the limiting empirical distribution of the random angles and the limiting distributions of the extreme angles. The results reveal interesting differences in the two settings and provide a precise characterization of the folklore that “all high-dimensional random vectors are almost always nearly orthogonal to each other”. Applications to statistics and machine learning and connections with some open problems in physics and mathematics are also discussed. Keywords: random angle, uniform distribution on sphere, empirical law, maximum of random variables, minimum of random variables, extreme-value distribution, packing on sphere

2 0.40501443 50 jmlr-2013-Greedy Feature Selection for Subspace Clustering

Author: Eva L. Dyer, Aswin C. Sankaranarayanan, Richard G. Baraniuk

Abstract: Unions of subspaces provide a powerful generalization of single subspace models for collections of high-dimensional data; however, learning multiple subspaces from data is challenging due to the fact that segmentation—the identification of points that live in the same subspace—and subspace estimation must be performed simultaneously. Recently, sparse recovery methods were shown to provide a provable and robust strategy for exact feature selection (EFS)—recovering subsets of points from the ensemble that live in the same subspace. In parallel with recent studies of EFS with ℓ1 -minimization, in this paper, we develop sufficient conditions for EFS with a greedy method for sparse signal recovery known as orthogonal matching pursuit (OMP). Following our analysis, we provide an empirical study of feature selection strategies for signals living on unions of subspaces and characterize the gap between sparse recovery methods and nearest neighbor (NN)-based approaches. In particular, we demonstrate that sparse recovery methods provide significant advantages over NN methods and that the gap between the two approaches is particularly pronounced when the sampling of subspaces in the data set is sparse. Our results suggest that OMP may be employed to reliably recover exact feature sets in a number of regimes where NN approaches fail to reveal the subspace membership of points in the ensemble. Keywords: subspace clustering, unions of subspaces, hybrid linear models, sparse approximation, structured sparsity, nearest neighbors, low-rank approximation

3 0.39815998 25 jmlr-2013-Communication-Efficient Algorithms for Statistical Optimization

Author: Yuchen Zhang, John C. Duchi, Martin J. Wainwright

Abstract: We analyze two communication-efficient algorithms for distributed optimization in statistical settings involving large-scale data sets. The first algorithm is a standard averaging method that distributes the N data samples evenly to m machines, performs separate minimization on each subset, and then averages the estimates. We provide a sharp analysis of this average mixture algorithm, showing that under a reasonable set of conditions, the combined parameter achieves √ mean-squared error (MSE) that decays as O (N −1 + (N/m)−2 ). Whenever m ≤ N, this guarantee matches the best possible rate achievable by a centralized algorithm having access to all N samples. The second algorithm is a novel method, based on an appropriate form of bootstrap subsampling. Requiring only a single round of communication, it has mean-squared error that decays as O (N −1 + (N/m)−3 ), and so is more robust to the amount of parallelization. In addition, we show that a stochastic gradient-based method attains mean-squared error decaying as O (N −1 + (N/m)−3/2 ), easing computation at the expense of a potentially slower MSE rate. We also provide an experimental evaluation of our methods, investigating their performance both on simulated data and on a large-scale regression problem from the internet search domain. In particular, we show that our methods can be used to efficiently solve an advertisement prediction problem from the Chinese SoSo Search Engine, which involves logistic regression with N ≈ 2.4 × 108 samples and d ≈ 740,000 covariates. Keywords: distributed learning, stochastic optimization, averaging, subsampling

4 0.39364123 4 jmlr-2013-A Max-Norm Constrained Minimization Approach to 1-Bit Matrix Completion

Author: Tony Cai, Wen-Xin Zhou

Abstract: We consider in this paper the problem of noisy 1-bit matrix completion under a general non-uniform sampling distribution using the max-norm as a convex relaxation for the rank. A max-norm constrained maximum likelihood estimate is introduced and studied. The rate of convergence for the estimate is obtained. Information-theoretical methods are used to establish a minimax lower bound under the general sampling model. The minimax upper and lower bounds together yield the optimal rate of convergence for the Frobenius norm loss. Computational algorithms and numerical performance are also discussed. Keywords: 1-bit matrix completion, low-rank matrix, max-norm, trace-norm, constrained optimization, maximum likelihood estimate, optimal rate of convergence

5 0.3892687 105 jmlr-2013-Sparsity Regret Bounds for Individual Sequences in Online Linear Regression

Author: Sébastien Gerchinovitz

Abstract: We consider the problem of online linear regression on arbitrary deterministic sequences when the ambient dimension d can be much larger than the number of time rounds T . We introduce the notion of sparsity regret bound, which is a deterministic online counterpart of recent risk bounds derived in the stochastic setting under a sparsity scenario. We prove such regret bounds for an online-learning algorithm called SeqSEW and based on exponential weighting and data-driven truncation. In a second part we apply a parameter-free version of this algorithm to the stochastic setting (regression model with random design). This yields risk bounds of the same flavor as in Dalalyan and Tsybakov (2012a) but which solve two questions left open therein. In particular our risk bounds are adaptive (up to a logarithmic factor) to the unknown variance of the noise if the latter is Gaussian. We also address the regression model with fixed design. Keywords: sparsity, online linear regression, individual sequences, adaptive regret bounds

6 0.38911483 26 jmlr-2013-Conjugate Relation between Loss Functions and Uncertainty Sets in Classification Problems

7 0.38829115 2 jmlr-2013-A Binary-Classification-Based Metric between Time-Series Distributions and Its Use in Statistical and Learning Problems

8 0.38783297 73 jmlr-2013-Multicategory Large-Margin Unified Machines

9 0.38561872 28 jmlr-2013-Construction of Approximation Spaces for Reinforcement Learning

10 0.38526002 72 jmlr-2013-Multi-Stage Multi-Task Feature Learning

11 0.38470727 52 jmlr-2013-How to Solve Classification and Regression Problems on High-Dimensional Data with a Supervised Extension of Slow Feature Analysis

12 0.38424316 17 jmlr-2013-Belief Propagation for Continuous State Spaces: Stochastic Message-Passing with Quantitative Guarantees

13 0.38365945 102 jmlr-2013-Sparse Matrix Inversion with Scaled Lasso

14 0.38354763 10 jmlr-2013-Algorithms and Hardness Results for Parallel Large Margin Learning

15 0.38349608 114 jmlr-2013-The Rate of Convergence of AdaBoost

16 0.38280049 39 jmlr-2013-Efficient Active Learning of Halfspaces: An Aggressive Approach

17 0.38273084 48 jmlr-2013-Generalized Spike-and-Slab Priors for Bayesian Group Feature Selection Using Expectation Propagation

18 0.38200405 68 jmlr-2013-Machine Learning with Operational Costs

19 0.38162139 15 jmlr-2013-Bayesian Canonical Correlation Analysis

20 0.38073817 18 jmlr-2013-Beyond Fano's Inequality: Bounds on the Optimal F-Score, BER, and Cost-Sensitive Risk and Their Implications