nips nips2012 nips2012-128 knowledge-graph by maker-knowledge-mining

128 nips-2012-Fast Resampling Weighted v-Statistics


Source: pdf

Author: Chunxiao Zhou, Jiseong Park, Yun Fu

Abstract: In this paper, a novel and computationally fast algorithm for computing weighted v-statistics in resampling both univariate and multivariate data is proposed. To avoid any real resampling, we have linked this problem with finite group action and converted it into a problem of orbit enumeration. For further computational cost reduction, an efficient method is developed to list all orbits by their symmetry orders and calculate all index function orbit sums and data function orbit sums recursively. The computational complexity analysis shows reduction in the computational cost from n! or nn level to low-order polynomial level. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract: In this paper, a novel and computationally fast algorithm for computing weighted v-statistics in resampling both univariate and multivariate data is proposed. [sent-7, score-0.32]

2 To avoid any real resampling, we have linked this problem with finite group action and converted it into a problem of orbit enumeration. [sent-8, score-0.747]

3 For further computational cost reduction, an efficient method is developed to list all orbits by their symmetry orders and calculate all index function orbit sums and data function orbit sums recursively. [sent-9, score-1.857]

4 The key idea of resampling is to generate the empirical distribution of a test statistic by resampling with or without replacement from the original observations. [sent-16, score-0.553]
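
A minimal sketch of this conventional approach, i.e., approximating the resampling (here permutation) distribution of a test statistic by Monte Carlo: the statistic, data, and number of resamples below are illustrative assumptions, and this is exactly the real resampling the proposed method avoids.

```python
# Sketch of conventional resampling: approximate the permutation distribution
# of a test statistic by Monte Carlo. Statistic and data are placeholders.
import random

def mean_difference(x, labels):
    """Difference of group means; any symmetric test statistic would do."""
    a = [xi for xi, g in zip(x, labels) if g == 0]
    b = [xi for xi, g in zip(x, labels) if g == 1]
    return sum(a) / len(a) - sum(b) / len(b)

def permutation_distribution(x, labels, num_resamples=10000, seed=0):
    """Resample labels without replacement and collect the statistic."""
    rng = random.Random(seed)
    stats = []
    for _ in range(num_resamples):
        shuffled = labels[:]
        rng.shuffle(shuffled)            # one resampling without replacement
        stats.append(mean_difference(x, shuffled))
    return stats

x = [0.5, 1.2, 0.3, 2.1, 1.7, 0.9]
labels = [0, 0, 0, 1, 1, 1]
dist = permutation_distribution(x, labels)
print(sum(dist) / len(dist))             # first moment of the resampling distribution
```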

5 One of the most important problems in resampling is calculating resampling statistics, i.e., [sent-20, score-0.524]

6 the expected values of test statistics under the resampling distribution, because resampling statistics are compact representatives of the resampling distribution. [sent-22, score-0.801]

7 In addition, a resampling distribution may be approximated by a parametric model with some resampling statistics, for example, the first several moments of a resampling distribution [5, 16]. [sent-23, score-0.782]

8 In this paper, we focus on computing resampling weighted v-statistics [18] (see Section 2 for the formal definition). [sent-24, score-0.337]

9 Traditionally, estimation of resampling statistics is solved by random sampling, since exhaustive examination of the resampling space is usually ill-advised [5,16]. [sent-31, score-0.548]

10 To date, there is no systematic and efficient solution to the issue of exact calculation of resampling statistics. [sent-33, score-0.321]

11 That is, the whole index set (indices of all possible k observations) is divided into several permutation-equivalent index subsets such that the summation of the data/index function term over all permutations is invariant within each subset and can be calculated without conducting any permutation. [sent-41, score-0.362]

12 However, methods for listing all permutation-equivalent index subsets and calculating the respective cardinalities were not emphasized in the previous publication [21]. [sent-43, score-0.423]

13 Even for calculating only the first four moments of a second-order resampling weighted v-statistic, hundreds of index subsets and thousands of coefficients have to be derived manually. [sent-45, score-0.564]

14 In this paper, we propose a novel and computationally fast algorithm for computing weighted v-statistics in resampling both univariate and multivariate data. [sent-48, score-0.35]

15 In the proposed algorithm, the calculation of weighted v-statistics is considered as a summation of products of data function terms and index function terms over a high-dimensional index set and all possible resamplings with or without replacement. [sent-49, score-0.553]

16 To avoid any resampling, we link this problem with finite group actions and convert it into a problem of orbit enumeration [10]. [sent-50, score-0.692]

17 For further computational cost reduction, an efficient method has been developed to list all orbits by their symmetry order and to calculate all index function orbit sums and data function orbit sums recursively. [sent-51, score-1.857]

18 We have built up a solid theoretical framework that explains the symmetry of resampling statistics using a product of several symmetric groups. [sent-56, score-0.34]

19 In addition, by associating this problem with finite group action, we have developed an algorithm to enumerate all orbits by their symmetry order and generated a recursive relationship for orbit sum calculation systematically. [sent-57, score-0.657]

20 All resampling strategies, such as permutation and bootstrap, are also more or less symmetric. [sent-60, score-0.368]

21 This study is focused on computing resampling weighted v-statistics. [sent-62, score-0.307]

22 Thus the r-th moment of a resampling weighted v-statistic can be considered as a summation of r products of data function terms and index function terms over a high-dimensional index set U_d^r = {1, · · · , n}^{dr} and all possible resamplings in R. [sent-77, score-0.833]

23 Since both the index space and the resampling space are huge, it is computationally expensive to calculate resampling statistics directly. [sent-78, score-0.729]
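
To make this cost concrete, the brute-force computation of the first moment of a second-order (d = 2) weighted v-statistic under the permutation distribution can be sketched as below, enumerating all n! resamplings directly; the particular w and h are illustrative assumptions, not the paper's.

```python
# Brute-force first moment of a second-order (d = 2) weighted v-statistic under
# the permutation distribution, enumerating all n! resamplings directly.
# This is the n!-level cost the orbit method removes; w and h are placeholders.
from itertools import permutations
from math import factorial

def v_stat(x, w, h, sigma):
    n = len(x)
    return sum(w(i, j) * h(x[sigma[i]], x[sigma[j]])
               for i in range(n) for j in range(n))

def exact_first_moment(x, w, h):
    n = len(x)
    total = sum(v_stat(x, w, h, sigma) for sigma in permutations(range(n)))
    return total / factorial(n)

x = [0.2, 1.5, -0.7, 0.9, 2.3]
w = lambda i, j: 1.0 / (1 + abs(i - j))     # symmetric index (weight) function
h = lambda a, b: (a - b) ** 2               # symmetric data function
print(exact_first_moment(x, w, h))          # feasible only for tiny n
```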

24 For terminology convenience, {(i^1_1, · · · , i^1_d), · · · , (i^r_1, · · · , i^r_d)} is called an index paragraph, which includes r index sentences (i^k_1, · · · , i^k_d), k = 1, · · · , r, and each index sentence has d index words i^k_j, j = 1, · · · , d. [sent-79, score-1.273]

25 Note that there are three different types of symmetry in computing resampling weighted v-statistics. [sent-80, score-0.369]

26 The first symmetry is that permutation of the order of index words will not affect the result since the data function is assumed to be symmetric. [sent-81, score-0.384]

27 The second symmetry is the permutation of the order of index sentences since multiplication is commutative. [sent-82, score-0.387]

28 The third symmetry is that each possible resampling is equally likely to be chosen. [sent-83, score-0.314]

29 The index set {1, · · · , n}^{dr} = { {(i^1_1, · · · , i^1_d), · · · , (i^r_1, · · · , i^r_d)} | i^k_m ∈ {1, · · · , n}; m = 1, · · · , d; k = 1, · · · , r } is then divided into disjoint index subsets, in which E(∏_{k=1}^r h(x_{σ·i^k_1}, · · · , x_{σ·i^k_d})) is invariant. [sent-85, score-0.318]

30 Each part of the set partition is called an orbit, i.e., the trajectory traced by an element when moved by all elements of the group. [sent-101, score-0.64]

31 Two elements x, y ∈ X fall into the same orbit if there exists a g ∈ G such that x = g · y. [sent-103, score-0.607]

32 A transversal of orbits is a set of representatives containing exactly one element from each orbit. [sent-105, score-0.537]
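
A minimal sketch of these definitions, with S_3 relabelling the values of index pairs used as an illustrative action (a small special case, not the paper's full group):

```python
# Orbits and a transversal of a finite group action, following the definitions
# above: x and y lie in the same orbit iff x = g·y for some g in G.
from itertools import permutations, product

n = 3
X = list(product(range(n), repeat=2))                 # index set {0,1,2}^2
G = list(permutations(range(n)))                      # S_3 as value relabelings

def act(g, x):
    return tuple(g[v] for v in x)                     # g · (i1, i2)

def orbits(X, G):
    seen, orbs = set(), []
    for x in X:
        if x in seen:
            continue
        orbit = {act(g, x) for g in G}                # trajectory of x under G
        seen |= orbit
        orbs.append(sorted(orbit))
    return orbs

orbs = orbits(X, G)
transversal = [o[0] for o in orbs]                    # one representative per orbit
print(transversal)             # [(0, 0), (0, 1)]: the "diagonal" and "off-diagonal" orbits
print([len(o) for o in orbs])  # cardinalities 3 and 6 when n = 3
```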

33 The action of G := S_n × S_r × S_d^r on the index set U_d^r is defined as (σ, τ, π_1, · · · , π_r) · i^k_m := σ · i^{τ^{-1}·k}_{π_k^{-1}·m}, where m ∈ {1, · · · , d} and k ∈ {1, · · · , r}. [sent-112, score-0.424]

34 Here, π_k denotes the permutation of the order of index words within the k-th index sentence, τ denotes the permutation of the order of the r index sentences, and σ denotes the permutation of the value of an index word from 1 to n. [sent-113, score-1.172]

35 That is, the number of indices within the index orbit [{(i^1_1, · · · , i^1_d), · · · , (i^r_1, · · · , i^r_d)}]. [sent-134, score-0.925]

36 Due to the invariance property of E(∏_{k=1}^r h(x_{σ·i^k_1}, · · · , x_{σ·i^k_d})), the calculation of permutation statistics can be simplified by summing up all index function product terms in each index orbit. [sent-135, score-0.584]

37 Proposition 2 shows that the calculation of resampling weighted v-statistics can be solved by computing data function orbit sums, index function orbit sums, and the cardinalities of all orbits defined in Definition 1. [sent-139, score-2.02]

38 Now we demonstrate how to calculate the orbit cardinalities, h, and w. [sent-141, score-0.629]

39 The following shows a naive algorithm to enumerate all index paragraphs and the cardinality of each orbit in G\\U_d^r, which are needed to calculate h and w. [sent-142, score-0.939]

40 Note that listing the index paragraphs of each orbit is equivalent to finding all connected components in the Cayley Action Graph, which can be performed by using existing depth-first or breadth-first search methods [15]. [sent-158, score-0.899]
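
A hedged sketch of that view: connect each index tuple to its images under a small generating set and find connected components by breadth-first search. The generators used here (a transposition plus an n-cycle, which together generate S_n) are an assumption for illustration.

```python
# Cayley-action-graph view: connect x to g·x for every generator g, then each
# orbit is one connected component found by breadth-first search.
from collections import deque
from itertools import product

n = 4
X = list(product(range(n), repeat=2))                     # index set {0,...,3}^2
cycle = tuple(list(range(1, n)) + [0])                    # n-cycle (0 1 2 3) as a map
swap = tuple([1, 0] + list(range(2, n)))                  # transposition (0 1)
generators = [cycle, swap]

def act(g, x):
    return tuple(g[v] for v in x)

def connected_components(X, generators):
    unseen = set(X)
    components = []
    while unseen:
        start = unseen.pop()
        queue, comp = deque([start]), {start}
        while queue:                                      # BFS over generator edges
            x = queue.popleft()
            for g in generators:
                y = act(g, x)
                if y not in comp:
                    comp.add(y)
                    queue.append(y)
        unseen -= comp
        components.append(comp)
    return components

comps = connected_components(X, generators)
print(sorted(len(c) for c in comps))    # [4, 12]: diagonal and off-diagonal orbits
```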

41 In other words, we cannot list all index orbits before we know the data size n. [sent-163, score-0.425]

42 Figure 1: Cayley action graph for G acting on U_2^1, and the resulting set of orbits {[{(1,1)}], [{(1,2)}]}. [sent-165, score-0.422]

43 For computing h and w , we find that we do not need to know all the index paragraphs within each index orbit. [sent-168, score-0.445]

44 Since each orbit is well structured, it is enough to list only a transversal of G\\U_d^r and the corresponding cardinalities. [sent-169, score-1.144]

45 Actually, the transversal L = { {(1, 1)}, {(1, 2)} } carries all the above information. [sent-173, score-0.304]
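
As an illustration of how such a small transversal determines everything needed, the brute-force check below computes the cardinalities of the two orbits represented by {(1,1)} and {(1,2)} (the case d = 2, r = 1) for several data sizes n; the closed forms n and n(n−1) follow from the value-relabelling action, and the snippet is only illustrative, not the paper's algorithm.

```python
# Orbit cardinalities recovered from the tiny transversal {(1,1)}, {(1,2)}:
# under value relabelling in S_n plus swapping the two index words, the orbit
# of (1,1) has n elements and the orbit of (1,2) has n(n-1).
from itertools import permutations

def orbit_size(rep, n):
    orbit = set()
    for sigma in permutations(range(1, n + 1)):       # value relabelling
        relabel = {v: sigma[v - 1] for v in range(1, n + 1)}
        for word_swap in (False, True):               # S_2 on the two index words
            i, j = rep
            if word_swap:
                i, j = j, i
            orbit.add((relabel[i], relabel[j]))
    return len(orbit)

for n in (3, 5, 7):
    print(n, orbit_size((1, 1), n), orbit_size((1, 2), n))   # n and n*(n-1)
```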

46 We define an index set U_d^{r*} = {1, · · · , dr}^{dr} = { {(i^1_1, · · · , i^1_d), · · · , (i^r_1, · · · , i^r_d)} | i^k_m ∈ {1, · · · , dr}; m = 1, · · · , d; k = 1, · · · , r } and a group G* := S_{dr} × S_r × S_d^r. [sent-176, score-0.389]

47 The transversal of G*\\U_d^{r*} is also a transversal of G\\U_d^r. [sent-181, score-0.608]

48 By Proposition 3, we notice that listing the transversal of G\\U_d^r is equivalent to listing the transversal of G*\\U_d^{r*} (see Figure 2). [sent-182, score-0.715]

49 Furthermore, finding the transversal of G*\\U_d^{r*} can be done without knowing the sample size n. [sent-184, score-0.304]

50 Due to the structure of each orbit of G\\U_d^r, we can calculate the cardinality of each orbit of G\\U_d^r from the transversal of G*\\U_d^{r*}, although G\\U_d^r and G*\\U_d^{r*} have different cardinalities for corresponding orbits. [sent-185, score-1.566]

51 Table 1: Offline double-sided searching algorithm for listing the transversal. Input: d and r.

52 Step 1: start from the orbit representative {(1, · · · , d), · · · , ((r−1)d + 1, · · · , rd)}.

53 Step 2: construct the transversal of S_{dr}\\U_d^{r*} by merging.

54 Step 3: construct the transversal of G*\\U_d^{r*} by graph isomorphism testing.

55 Step 4: end at the orbit representative {(1, · · · , 1), · · · , (1, · · · , 1)}. Output: a transversal L of G\\U_d^r, #(λ), #(λ → ν), and the merging order (symmetry order) of the orbits.

Compared with the naive Cayley action graph algorithm, our improved algorithm lists the transversal of G\\U_d^r and calculates the cardinalities of all orbits more efficiently. [sent-191, score-0.528]

57 In addition, the improved algorithm also assigns a symmetry order to all orbits, which helps further reduce the computational cost of the data function orbit sum h and the index function orbit sum w. [sent-192, score-1.564]

58 On one hand, it is challenging to directly list the transversal of G*\\U_d^{r*}. [sent-194, score-0.323]

59 These two group actions help us find the transversal of G*\\U_d^{r*} efficiently with a double-sided searching method. [sent-196, score-0.431]

60 Each orbit in S_{dr}\\U_d^{r*} is denoted by [{(i^1_1, · · · , i^1_d), · · · , (i^r_1, · · · , i^r_d)}]^s. [sent-199, score-0.733]

61 Note that the group action defined in Definition 3, which maps i^k_m to σ · i^k_m with σ ∈ S_{dr}, only allows permutation of index values; it does not allow shuffling of index words within each index sentence or of index sentences. [sent-200, score-1.223]

62 In addition, it is easy to construct a transversal of S_{dr}\\U_d^{r*} by merging distinct index elements. [sent-203, score-0.557]

63 Given a representative I that includes at least two distinct index values, say i ≠ j, an operation called merging replaces all index values equal to i or j with min(i, j). [sent-205, score-0.472]
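
A sketch of the merging operation just described, with an index paragraph represented as a tuple of index sentences (an implementation assumption):

```python
# Merging: given a representative with two distinct index values i != j,
# replace every occurrence of i or j by min(i, j).
def merge(paragraph, i, j):
    """Merge index values i and j (i != j) into min(i, j)."""
    lo = min(i, j)
    return tuple(tuple(lo if v in (i, j) else v for v in sentence)
                 for sentence in paragraph)

# Start from the all-distinct representative for d = 2, r = 2: {(1,2),(3,4)}.
rep = ((1, 2), (3, 4))
print(merge(rep, 2, 3))                 # ((1, 2), (2, 4)) -- values 2 and 3 merged
print(merge(merge(rep, 1, 2), 3, 4))    # ((1, 1), (3, 3))
```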

64 The action of S_{dr} × S_{dr} on the index set U_d^{r*} is defined as (σ, θ) · i^k_m := σ · i^{θ^{-1}·(k,m)_s}_{θ^{-1}·(k,m)_w}, where θ ∈ S_{dr} denotes a permutation of all dr index words without any restriction. [sent-208, score-0.653]

65 Here θ^{-1} · (k, m)_s denotes the index sentence location after permutation θ, and θ^{-1} · (k, m)_w denotes the index word location after permutation θ. [sent-210, score-0.666]

66 An orbit in S_{dr} × S_{dr}\\U_d^{r*} is denoted by [{(i^1_1, · · · , i^1_d), · · · , (i^r_1, · · · , i^r_d)}]^l. [sent-211, score-0.733]

67 Since the group action defined in Definition 5 allows free shuffling of the order of all dr index words, the order does not matter for S_{dr} × S_{dr}\\U_d^{r*}, and shuffling can occur across different sentences. [sent-212, score-0.388]

68 A transversal of S_{dr}\\U_d^{r*} can be generated by all possible mergings of [{(1, · · · , d), · · · , (d(r−1) + 1, · · · , dr)}]^s. [sent-216, score-0.304]

69 Each orbit in S_{dr} × S_{dr}\\U_d^{r*} is equivalent to an integer partition. We start the transversal graph construction from the initial orbit [{(1, · · · , d), · · · , (d(r−1) + 1, · · · , dr)}]^s, i.e., the orbit in which all dr index values are distinct. [sent-219, score-0.965]

70 Then we generate new orbits of S_{dr}\\U_d^{r*} by merging distinct index values in existing orbits until we meet [{(1, · · · , 1), · · · , (1, · · · , 1)}]^s, i.e., the orbit in which all index values are equal. [sent-221, score-0.681]

71 We also add an edge from an existing orbit to a new orbit generated by merging the existing one. [sent-224, score-1.26]
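
A sketch of the construction described in the preceding items; canonicalising merged representatives by first-occurrence relabelling, in order to identify value-permutation orbits (equality patterns), is an implementation assumption rather than the paper's exact procedure.

```python
# Transversal-graph construction: start from the all-distinct representative,
# repeatedly merge two distinct index values, add an edge from the existing
# orbit to the new one, and stop at the all-equal representative.
from itertools import combinations

def canonical(paragraph):
    """Relabel index values by order of first occurrence (value-permutation orbit)."""
    relabel, nxt, out = {}, 1, []
    for sentence in paragraph:
        row = []
        for v in sentence:
            if v not in relabel:
                relabel[v] = nxt
                nxt += 1
            row.append(relabel[v])
        out.append(tuple(row))
    return tuple(out)

def merge(paragraph, i, j):
    lo = min(i, j)
    return tuple(tuple(lo if v in (i, j) else v for v in s) for s in paragraph)

def transversal_graph(d, r):
    start = canonical(tuple(tuple(range(k * d + 1, (k + 1) * d + 1)) for k in range(r)))
    nodes, edges, frontier = {start}, set(), [start]
    while frontier:
        next_frontier = []
        for rep in frontier:
            values = sorted({v for s in rep for v in s})
            for i, j in combinations(values, 2):
                child = canonical(merge(rep, i, j))
                edges.add((rep, child))
                if child not in nodes:
                    nodes.add(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return nodes, edges

nodes, edges = transversal_graph(d=2, r=2)
print(len(nodes))                   # 15 when d = r = 2: one node per equality pattern on the 4 index words
print(((1, 1), (1, 1)) in nodes)    # True: the all-equal terminal orbit is reached
```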

72 Now we generate the transversal of G*\\U_d^{r*} from that of S_{dr}\\U_d^{r*}. [sent-226, score-0.304]

73 Actually, orbit equivalence checking is equivalent to the classical graph isomorphism problem since we can consider each index word as a vertex and connect two index words if they belong to the same index sentence. [sent-228, score-1.28]
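
For the tiny structures involved (r sentences of d words), a brute-force stand-in for the isomorphism test is to try every sentence permutation and every within-sentence word permutation and compare value-canonical forms; this is only an illustrative substitute for a real graph-isomorphism routine such as the one cited.

```python
# Brute-force orbit-equivalence check under value, sentence, and within-sentence
# word permutations; feasible only because d and r are small.
from itertools import permutations, product

def canonical_values(paragraph):
    relabel, nxt, out = {}, 1, []
    for sentence in paragraph:
        row = []
        for v in sentence:
            if v not in relabel:
                relabel[v] = nxt
                nxt += 1
            row.append(relabel[v])
        out.append(tuple(row))
    return tuple(out)

def g_star_equivalent(p, q):
    """True iff p and q lie in the same orbit under value, sentence, and word permutations."""
    r, d = len(p), len(p[0])
    for tau in permutations(range(r)):                         # permute sentences
        for pis in product(permutations(range(d)), repeat=r):  # permute words per sentence
            rearranged = tuple(
                tuple(p[tau[k]][pis[k][m]] for m in range(d))
                for k in range(r)
            )
            if canonical_values(rearranged) == canonical_values(q):
                return True
    return False

# Two representatives with the same integer partition (2 + 2) of the four index words:
print(g_star_equivalent(((1, 1), (2, 2)), ((2, 2), (1, 1))))   # True
print(g_star_equivalent(((1, 1), (2, 2)), ((1, 2), (1, 2))))   # False
```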

74 Figure 4 shows a transversal of G*\\U_2^{2*} generated from that of S_4\\U_2^{2*} (Figure 3). [sent-230, score-0.304]

75 By Proposition 3, it is also a transversal of G\\U_2^2. [sent-231, score-0.333]

76 Since G*\\U_d^{r*} is a finer partition than S_{dr} × S_{dr}\\U_d^{r*}, orbit equivalence testing is only necessary when two orbits of S_{dr}\\U_d^{r*} correspond to the same integer partition. [sent-232, score-0.855]

77 For any two index orbit representatives λ ∈ L and ν ∈ L, we say that ν has a lower merging or symmetry order than that of λ, i.e., [ν] can be generated from [λ] by a sequence of mergings. [sent-237, score-0.926]

78 Equivalently, there is a path from [λ] to [ν] in the transversal graph. [sent-240, score-0.304]

79 It is easy to get #(λ) when we generate the transversal graph of G\\U_d^r from that of S_{dr}\\U_d^{r*}. [sent-246, score-0.339]

80 #(λ → ν) can also be obtained from the transversal graph of G\\U_d^r by counting the number of different [ν]^s's that can be reached from a given [λ]^s. [sent-248, score-0.339]

81 The difficulty in computing the data function orbit sum and the index function orbit sum comes from two constraints: the equal constraint and the unequal constraint. [sent-253, score-1.486]

82 For example, in the orbit [{(1, 1), (2, 2)}], the equal constraint is that the first and the second index values are equal and the third and fourth index values are also equal. [sent-254, score-0.991]

83 Thus, the calculation of an orbit sum can be separated into two parts: the relaxed orbit sum without the unequal constraint, and lower-order orbit sums. [sent-257, score-1.984]

84 For example, the relaxed index function orbit sum is w*_{[{(1,1),(2,2)}]} = Σ_{i,j} w(i, i)w(j, j) = (Σ_i w(i, i))^2. [sent-258, score-0.852]

85 The index function orbit sum w can be calculated by subtracting all lower-order orbit sums from the corresponding relaxed index function orbit sum w*. [sent-260, score-2.329]
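
A worked check of this decomposition for the orbit [{(1,1),(2,2)}]: the constrained sum over i ≠ j equals the relaxed sum (no unequal constraint) minus the lower-order orbit sum in which the two index sentences collapse to the same value. The weight function below is an arbitrary illustrative choice.

```python
# Relaxed-sum decomposition for the orbit [{(1,1),(2,2)}], verified against
# direct summation over the constrained index set.
n = 6
w = [[1.0 / (1 + i + j) for j in range(n)] for i in range(n)]

relaxed = sum(w[i][i] for i in range(n)) ** 2                  # w* = (sum_i w(i,i))^2
lower_order = sum(w[i][i] ** 2 for i in range(n))              # lower-order orbit [{(1,1),(1,1)}]
recursive = relaxed - lower_order                              # subtract lower-order orbit sum

brute_force = sum(w[i][i] * w[j][j] for i in range(n) for j in range(n) if i != j)
print(abs(recursive - brute_force) < 1e-12)                    # True
```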

86 The calculation of the data function orbit sum h is similar. [sent-265, score-0.881]

87 So the computational cost mainly depends on the calculation of the relaxed orbit sums and the lowest-order orbit sums. [sent-266, score-1.367]

88 The calculation of a relaxed orbit sum can be done by Zhou's greedy graph search algorithm [21]. [sent-268, score-0.725]

89 For a d-th order weighted v-statistic, the computational cost of the orbit sum for the r-th moment is bounded by O(n^m). [sent-271, score-0.76]

90 When d = 1, the computational complexity of the orbit sum is O(n). [sent-272, score-0.649]

91 With this change, H := End_n × S_r × S_d^r acting on U_d^r becomes a monoid action instead of a group action, since an endofunction is not invertible. [sent-275, score-0.289]

92 Fortunately, the computation of bootstrap weighted v-statistics only needs the index function orbit sums and the relaxed data function orbit sums from the corresponding permutation computation. [sent-283, score-1.694]

93 Therefore, the bootstrap weighted v-statistics calculation is just a subproblem of the permutation weighted v-statistics calculation. [sent-284, score-0.282]

94 We can obtain the r-th moment of bootstrap weighted v-statistics by summing up the product of the index function orbit sum w and the relaxed data function orbit sum h* over all index orbits. [sent-286, score-1.8]
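
A sketch of this factorisation in the simplest case d = r = 1, checked against exhaustive enumeration of all n^n bootstrap maps; where the 1/n averaging factor is folded into the orbit sums is a convention assumed here, and w, h, and x are placeholders.

```python
# Bootstrap sketch for d = 1, r = 1: T(sigma) = sum_i w(i) h(x[sigma(i)]) with
# sigma drawn uniformly from all n^n maps (sampling with replacement). Its
# expectation factorises into an index orbit sum times an averaged (relaxed)
# data orbit sum.
from itertools import product

x = [0.4, 1.1, -0.3, 2.0]
w = [0.5, 1.5, 2.0, 1.0]
h = lambda a: a * a
n = len(x)

# Exhaustive enumeration of all n^n bootstrap resamplings (tiny n only).
total = 0.0
for sigma in product(range(n), repeat=n):
    total += sum(w[i] * h(x[sigma[i]]) for i in range(n))
exact_moment = total / n ** n

# Orbit-sum factorisation: index orbit sum times averaged relaxed data orbit sum.
index_orbit_sum = sum(w)
relaxed_data_orbit_sum = sum(h(xi) for xi in x) / n
print(abs(exact_moment - index_orbit_sum * relaxed_data_orbit_sum) < 1e-12)  # True
```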

95 Table 2: Comparison of accuracy and complexity for the calculation of resampling statistics. [sent-289, score-0.308]

96 6 Conclusion. In this paper, we propose a novel and computationally fast algorithm for computing weighted v-statistics in resampling both univariate and multivariate data. [sent-348, score-0.35]

97 Our theoretical framework reveals that the three types of symmetry in resampling weighted v-statistics can be represented by a product of symmetric groups. [sent-349, score-0.382]

98 As an exciting result, we demonstrate that the calculation of resampling weighted v-statistics can be converted into a problem of orbit enumeration. [sent-350, score-0.97]

99 A novel efficient orbit enumeration algorithm has been developed by using a small group acting on a small index set. [sent-351, score-0.89]

100 For further computational cost reduction, we sort all orbits by their symmetry order and calculate all index function orbit sums and data function orbit sums recursively. [sent-352, score-1.838]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('orbit', 0.607), ('ud', 0.393), ('transversal', 0.304), ('sdr', 0.294), ('resampling', 0.252), ('orbits', 0.214), ('index', 0.192), ('ik', 0.163), ('ir', 0.126), ('permutation', 0.116), ('group', 0.071), ('action', 0.069), ('cayley', 0.066), ('bootstrap', 0.063), ('symmetry', 0.062), ('paragraphs', 0.061), ('calculation', 0.056), ('dr', 0.056), ('weighted', 0.055), ('merging', 0.046), ('sums', 0.045), ('jd', 0.045), ('monoid', 0.04), ('ndr', 0.04), ('resamplings', 0.04), ('listing', 0.039), ('cardinalities', 0.037), ('graph', 0.035), ('isomorphism', 0.033), ('endn', 0.03), ('vstatistics', 0.03), ('card', 0.03), ('proposition', 0.029), ('unequal', 0.028), ('moment', 0.028), ('cost', 0.028), ('pn', 0.027), ('sided', 0.027), ('relaxed', 0.027), ('cardinality', 0.026), ('moments', 0.026), ('ner', 0.026), ('sum', 0.026), ('id', 0.026), ('statistic', 0.025), ('bootstrapping', 0.025), ('sn', 0.024), ('replacement', 0.024), ('sr', 0.024), ('generators', 0.023), ('zhou', 0.022), ('calculate', 0.022), ('shuf', 0.022), ('gp', 0.022), ('sentence', 0.022), ('nn', 0.021), ('acting', 0.02), ('endofunction', 0.02), ('luks', 0.02), ('xjd', 0.02), ('calculating', 0.02), ('sd', 0.019), ('representatives', 0.019), ('subsets', 0.019), ('permutations', 0.019), ('partition', 0.019), ('list', 0.019), ('subgroup', 0.018), ('summation', 0.018), ('sentences', 0.017), ('ing', 0.017), ('naive', 0.017), ('semigroup', 0.016), ('advised', 0.016), ('computational', 0.016), ('nition', 0.016), ('observations', 0.016), ('distinct', 0.015), ('ill', 0.015), ('paragraph', 0.015), ('summations', 0.015), ('axioms', 0.015), ('coarser', 0.015), ('equivalence', 0.015), ('ine', 0.015), ('summing', 0.015), ('dept', 0.015), ('double', 0.015), ('reduction', 0.014), ('operation', 0.014), ('words', 0.014), ('enumerate', 0.014), ('qr', 0.014), ('denotes', 0.014), ('actions', 0.014), ('univariate', 0.013), ('representative', 0.013), ('symmetric', 0.013), ('exact', 0.013), ('statistics', 0.013)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999988 128 nips-2012-Fast Resampling Weighted v-Statistics

Author: Chunxiao Zhou, Jiseong Park, Yun Fu

Abstract: In this paper, a novel and computationally fast algorithm for computing weighted v-statistics in resampling both univariate and multivariate data is proposed. To avoid any real resampling, we have linked this problem with finite group action and converted it into a problem of orbit enumeration. For further computational cost reduction, an efficient method is developed to list all orbits by their symmetry orders and calculate all index function orbit sums and data function orbit sums recursively. The computational complexity analysis shows reduction in the computational cost from n! or nn level to low-order polynomial level. 1

2 0.094187543 37 nips-2012-Affine Independent Variational Inference

Author: Edward Challis, David Barber

Abstract: We consider inference in a broad class of non-conjugate probabilistic models based on minimising the Kullback-Leibler divergence between the given target density and an approximating ‘variational’ density. In particular, for generalised linear models we describe approximating densities formed from an affine transformation of independently distributed latent variables, this class including many well known densities as special cases. We show how all relevant quantities can be efficiently computed using the fast Fourier transform. This extends the known class of tractable variational approximations and enables the fitting for example of skew variational densities to the target density. 1

3 0.064159811 234 nips-2012-Multiresolution analysis on the symmetric group

Author: Risi Kondor, Walter Dempsey

Abstract: There is no generally accepted way to define wavelets on permutations. We address this issue by introducing the notion of coset based multiresolution analysis (CMRA) on the symmetric group, find the corresponding wavelet functions, and describe a fast wavelet transform for sparse signals. We discuss potential applications in ranking, sparse approximation, and multi-object tracking. 1

4 0.056347027 118 nips-2012-Entangled Monte Carlo

Author: Seong-hwan Jun, Liangliang Wang, Alexandre Bouchard-côté

Abstract: We propose a novel method for scalable parallelization of SMC algorithms, Entangled Monte Carlo simulation (EMC). EMC avoids the transmission of particles between nodes, and instead reconstructs them from the particle genealogy. In particular, we show that we can reduce the communication to the particle weights for each machine while efficiently maintaining implicit global coherence of the parallel simulation. We explain methods to efficiently maintain a genealogy of particles from which any particle can be reconstructed. We demonstrate using examples from Bayesian phylogenetic that the computational gain from parallelization using EMC significantly outweighs the cost of particle reconstruction. The timing experiments show that reconstruction of particles is indeed much more efficient as compared to transmission of particles. 1

5 0.049029559 246 nips-2012-Nonparametric Max-Margin Matrix Factorization for Collaborative Prediction

Author: Minjie Xu, Jun Zhu, Bo Zhang

Abstract: We present a probabilistic formulation of max-margin matrix factorization and build accordingly a nonparametric Bayesian model which automatically resolves the unknown number of latent factors. Our work demonstrates a successful example that integrates Bayesian nonparametrics and max-margin learning, which are conventionally two separate paradigms and enjoy complementary advantages. We develop an efficient variational algorithm for posterior inference, and our extensive empirical studies on large-scale MovieLens and EachMovie data sets appear to justify the aforementioned dual advantages. 1

6 0.043388002 138 nips-2012-Fully Bayesian inference for neural models with negative-binomial spiking

7 0.040333658 333 nips-2012-Synchronization can Control Regularization in Neural Systems via Correlated Noise Processes

8 0.038211647 202 nips-2012-Locally Uniform Comparison Image Descriptor

9 0.034340542 257 nips-2012-One Permutation Hashing

10 0.03330157 209 nips-2012-Max-Margin Structured Output Regression for Spatio-Temporal Action Localization

11 0.032804776 111 nips-2012-Efficient Sampling for Bipartite Matching Problems

12 0.031695522 290 nips-2012-Recovery of Sparse Probability Measures via Convex Programming

13 0.031108214 344 nips-2012-Timely Object Recognition

14 0.030746251 70 nips-2012-Clustering by Nonnegative Matrix Factorization Using Graph Random Walk

15 0.029302485 76 nips-2012-Communication-Efficient Algorithms for Statistical Optimization

16 0.028948808 133 nips-2012-Finding Exemplars from Pairwise Dissimilarities via Simultaneous Sparse Recovery

17 0.02746629 211 nips-2012-Meta-Gaussian Information Bottleneck

18 0.025314564 60 nips-2012-Bayesian nonparametric models for ranked data

19 0.024567798 153 nips-2012-How Prior Probability Influences Decision Making: A Unifying Probabilistic Model

20 0.023580184 156 nips-2012-Identifiability and Unmixing of Latent Parse Trees


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.071), (1, -0.007), (2, 0.001), (3, -0.017), (4, -0.027), (5, -0.003), (6, 0.002), (7, 0.008), (8, 0.002), (9, 0.011), (10, -0.035), (11, -0.024), (12, -0.004), (13, -0.025), (14, -0.004), (15, -0.028), (16, 0.024), (17, -0.002), (18, -0.01), (19, -0.02), (20, 0.005), (21, 0.026), (22, -0.015), (23, -0.018), (24, -0.026), (25, -0.011), (26, -0.059), (27, 0.02), (28, -0.014), (29, 0.006), (30, 0.031), (31, 0.031), (32, 0.019), (33, -0.01), (34, -0.002), (35, -0.045), (36, -0.045), (37, 0.017), (38, -0.024), (39, 0.075), (40, 0.024), (41, -0.019), (42, 0.127), (43, 0.031), (44, -0.007), (45, 0.017), (46, -0.018), (47, -0.061), (48, 0.021), (49, 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.89221352 128 nips-2012-Fast Resampling Weighted v-Statistics

Author: Chunxiao Zhou, Jiseong Park, Yun Fu

Abstract: In this paper, a novel and computationally fast algorithm for computing weighted v-statistics in resampling both univariate and multivariate data is proposed. To avoid any real resampling, we have linked this problem with finite group action and converted it into a problem of orbit enumeration. For further computational cost reduction, an efficient method is developed to list all orbits by their symmetry orders and calculate all index function orbit sums and data function orbit sums recursively. The computational complexity analysis shows reduction in the computational cost from n! or nn level to low-order polynomial level. 1

2 0.52923387 234 nips-2012-Multiresolution analysis on the symmetric group

Author: Risi Kondor, Walter Dempsey

Abstract: There is no generally accepted way to define wavelets on permutations. We address this issue by introducing the notion of coset based multiresolution analysis (CMRA) on the symmetric group, find the corresponding wavelet functions, and describe a fast wavelet transform for sparse signals. We discuss potential applications in ranking, sparse approximation, and multi-object tracking. 1

3 0.51926619 118 nips-2012-Entangled Monte Carlo

Author: Seong-hwan Jun, Liangliang Wang, Alexandre Bouchard-côté

Abstract: We propose a novel method for scalable parallelization of SMC algorithms, Entangled Monte Carlo simulation (EMC). EMC avoids the transmission of particles between nodes, and instead reconstructs them from the particle genealogy. In particular, we show that we can reduce the communication to the particle weights for each machine while efficiently maintaining implicit global coherence of the parallel simulation. We explain methods to efficiently maintain a genealogy of particles from which any particle can be reconstructed. We demonstrate using examples from Bayesian phylogenetic that the computational gain from parallelization using EMC significantly outweighs the cost of particle reconstruction. The timing experiments show that reconstruction of particles is indeed much more efficient as compared to transmission of particles. 1

4 0.41798189 115 nips-2012-Efficient high dimensional maximum entropy modeling via symmetric partition functions

Author: Paul Vernaza, Drew Bagnell

Abstract: Maximum entropy (MaxEnt) modeling is a popular choice for sequence analysis in applications such as natural language processing, where the sequences are embedded in discrete, tractably-sized spaces. We consider the problem of applying MaxEnt to distributions over paths in continuous spaces of high dimensionality— a problem for which inference is generally intractable. Our main contribution is to show that this intractability can be avoided as long as the constrained features possess a certain kind of low dimensional structure. In this case, we show that the associated partition function is symmetric and that this symmetry can be exploited to compute the partition function efficiently in a compressed form. Empirical results are given showing an application of our method to learning models of high-dimensional human motion capture data. 1

5 0.41228601 37 nips-2012-Affine Independent Variational Inference

Author: Edward Challis, David Barber

Abstract: We consider inference in a broad class of non-conjugate probabilistic models based on minimising the Kullback-Leibler divergence between the given target density and an approximating ‘variational’ density. In particular, for generalised linear models we describe approximating densities formed from an affine transformation of independently distributed latent variables, this class including many well known densities as special cases. We show how all relevant quantities can be efficiently computed using the fast Fourier transform. This extends the known class of tractable variational approximations and enables the fitting for example of skew variational densities to the target density. 1

6 0.4103213 45 nips-2012-Approximating Equilibria in Sequential Auctions with Incomplete Information and Multi-Unit Demand

7 0.41006613 363 nips-2012-Wavelet based multi-scale shape features on arbitrary surfaces for cortical thickness discrimination

8 0.40925896 125 nips-2012-Factoring nonnegative matrices with linear programs

9 0.39328578 336 nips-2012-The Coloured Noise Expansion and Parameter Estimation of Diffusion Processes

10 0.3889094 22 nips-2012-A latent factor model for highly multi-relational data

11 0.3744902 246 nips-2012-Nonparametric Max-Margin Matrix Factorization for Collaborative Prediction

12 0.36729985 232 nips-2012-Multiplicative Forests for Continuous-Time Processes

13 0.36669701 249 nips-2012-Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison

14 0.36637825 17 nips-2012-A Scalable CUR Matrix Decomposition Algorithm: Lower Time Complexity and Tighter Bound

15 0.36556199 44 nips-2012-Approximating Concavely Parameterized Optimization Problems

16 0.36452013 196 nips-2012-Learning with Partially Absorbing Random Walks

17 0.35459575 41 nips-2012-Ancestor Sampling for Particle Gibbs

18 0.35239601 287 nips-2012-Random function priors for exchangeable arrays with applications to graphs and relational data

19 0.3436487 351 nips-2012-Transelliptical Component Analysis

20 0.33391485 184 nips-2012-Learning Probability Measures with respect to Optimal Transport Metrics


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.05), (21, 0.02), (38, 0.076), (39, 0.019), (42, 0.012), (54, 0.062), (55, 0.019), (74, 0.053), (76, 0.091), (79, 0.335), (80, 0.092), (92, 0.035)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.76177073 53 nips-2012-Bayesian Pedigree Analysis using Measure Factorization

Author: Bonnie Kirkpatrick, Alexandre Bouchard-côté

Abstract: Pedigrees, or family trees, are directed graphs used to identify sites of the genome that are correlated with the presence or absence of a disease. With the advent of genotyping and sequencing technologies, there has been an explosion in the amount of data available, both in the number of individuals and in the number of sites. Some pedigrees number in the thousands of individuals. Meanwhile, analysis methods have remained limited to pedigrees of < 100 individuals which limits analyses to many small independent pedigrees. Disease models, such those used for the linkage analysis log-odds (LOD) estimator, have similarly been limited. This is because linkage analysis was originally designed with a different task in mind, that of ordering the sites in the genome, before there were technologies that could reveal the order. LODs are difficult to interpret and nontrivial to extend to consider interactions among sites. These developments and difficulties call for the creation of modern methods of pedigree analysis. Drawing from recent advances in graphical model inference and transducer theory, we introduce a simple yet powerful formalism for expressing genetic disease models. We show that these disease models can be turned into accurate and computationally efficient estimators. The technique we use for constructing the variational approximation has potential applications to inference in other large-scale graphical models. This method allows inference on larger pedigrees than previously analyzed in the literature, which improves disease site prediction. 1

same-paper 2 0.70606089 128 nips-2012-Fast Resampling Weighted v-Statistics

Author: Chunxiao Zhou, Jiseong Park, Yun Fu

Abstract: In this paper, a novel and computationally fast algorithm for computing weighted v-statistics in resampling both univariate and multivariate data is proposed. To avoid any real resampling, we have linked this problem with finite group action and converted it into a problem of orbit enumeration. For further computational cost reduction, an efficient method is developed to list all orbits by their symmetry orders and calculate all index function orbit sums and data function orbit sums recursively. The computational complexity analysis shows reduction in the computational cost from n! or nn level to low-order polynomial level. 1

3 0.50525951 16 nips-2012-A Polynomial-time Form of Robust Regression

Author: Ozlem Aslan, Dale Schuurmans, Yao-liang Yu

Abstract: Despite the variety of robust regression methods that have been developed, current regression formulations are either NP-hard, or allow unbounded response to even a single leverage point. We present a general formulation for robust regression—Variational M-estimation—that unifies a number of robust regression methods while allowing a tractable approximation strategy. We develop an estimator that requires only polynomial-time, while achieving certain robustness and consistency guarantees. An experimental evaluation demonstrates the effectiveness of the new estimation approach compared to standard methods. 1

4 0.45587412 287 nips-2012-Random function priors for exchangeable arrays with applications to graphs and relational data

Author: James Lloyd, Peter Orbanz, Zoubin Ghahramani, Daniel M. Roy

Abstract: A fundamental problem in the analysis of structured relational data like graphs, networks, databases, and matrices is to extract a summary of the common structure underlying relations between individual entities. Relational data are typically encoded in the form of arrays; invariance to the ordering of rows and columns corresponds to exchangeable arrays. Results in probability theory due to Aldous, Hoover and Kallenberg show that exchangeable arrays can be represented in terms of a random measurable function which constitutes the natural model parameter in a Bayesian model. We obtain a flexible yet simple Bayesian nonparametric model by placing a Gaussian process prior on the parameter function. Efficient inference utilises elliptical slice sampling combined with a random sparse approximation to the Gaussian process. We demonstrate applications of the model to network data and clarify its relation to models in the literature, several of which emerge as special cases. 1

5 0.45495936 177 nips-2012-Learning Invariant Representations of Molecules for Atomization Energy Prediction

Author: Grégoire Montavon, Katja Hansen, Siamac Fazli, Matthias Rupp, Franziska Biegler, Andreas Ziehe, Alexandre Tkatchenko, Anatole V. Lilienfeld, Klaus-Robert Müller

Abstract: The accurate prediction of molecular energetics in chemical compound space is a crucial ingredient for rational compound design. The inherently graph-like, non-vectorial nature of molecular data gives rise to a unique and difficult machine learning problem. In this paper, we adopt a learning-from-scratch approach where quantum-mechanical molecular energies are predicted directly from the raw molecular geometry. The study suggests a benefit from setting flexible priors and enforcing invariance stochastically rather than structurally. Our results improve the state-of-the-art by a factor of almost three, bringing statistical methods one step closer to chemical accuracy. 1

6 0.45300373 173 nips-2012-Learned Prioritization for Trading Off Accuracy and Speed

7 0.44800723 344 nips-2012-Timely Object Recognition

8 0.44697687 108 nips-2012-Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search

9 0.44686261 115 nips-2012-Efficient high dimensional maximum entropy modeling via symmetric partition functions

10 0.44659722 197 nips-2012-Learning with Recursive Perceptual Representations

11 0.44590241 83 nips-2012-Controlled Recognition Bounds for Visual Learning and Exploration

12 0.44360676 234 nips-2012-Multiresolution analysis on the symmetric group

13 0.44329679 168 nips-2012-Kernel Latent SVM for Visual Recognition

14 0.44287378 355 nips-2012-Truncation-free Online Variational Inference for Bayesian Nonparametric Models

15 0.4420169 274 nips-2012-Priors for Diversity in Generative Latent Variable Models

16 0.4419651 279 nips-2012-Projection Retrieval for Classification

17 0.44037405 229 nips-2012-Multimodal Learning with Deep Boltzmann Machines

18 0.43992037 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs

19 0.43985304 51 nips-2012-Bayesian Hierarchical Reinforcement Learning

20 0.43845609 162 nips-2012-Inverse Reinforcement Learning through Structured Classification