nips nips2011 nips2011-234 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Flavio Chierichetti, David Liben-Nowell, Jon M. Kleinberg
Abstract: Motivated by the spread of on-line information in general and on-line petitions in particular, recent research has raised the following combinatorial estimation problem. There is a tree T that we cannot observe directly (representing the structure along which the information has spread), and certain nodes randomly decide to make their copy of the information public. In the case of a petition, the list of names on each public copy of the petition also reveals a path leading back to the root of the tree. What can we conclude about the properties of the tree we observe from these revealed paths, and can we use the structure of the observed tree to estimate the size of the full unobserved tree T ? Here we provide the first algorithm for this size estimation task, together with provable guarantees on its performance. We also establish structural properties of the observed tree, providing the first rigorous explanation for some of the unusual structural phenomena present in the spread of real chain-letter petitions on the Internet. 1
Reference: text
sentIndex sentText sentNum sentScore
1 There is a tree T that we cannot observe directly (representing the structure along which the information has spread), and certain nodes randomly decide to make their copy of the information public. [sent-2, score-0.742]
2 In the case of a petition, the list of names on each public copy of the petition also reveals a path leading back to the root of the tree. [sent-3, score-1.113]
3 What can we conclude about the properties of the tree we observe from these revealed paths, and can we use the structure of the observed tree to estimate the size of the full unobserved tree T ? [sent-4, score-1.186]
4 We also establish structural properties of the observed tree, providing the first rigorous explanation for some of the unusual structural phenomena present in the spread of real chain-letter petitions on the Internet. [sent-6, score-0.326]
5 1 Introduction. The on-line domain is a rich environment for observing social contagion — the tendency of new information, ideas, and behaviors to spread from person to person through a social network [1, 4, 6, 10, 12, 14, 17, 19]. [sent-7, score-0.507]
6 When a link, invitation, petition, or other on-line item passes between people in the network, it is natural to model its spread using a tree structure: each person has the ability to pass the item to one or more others who haven’t yet received it, producing a set of “offspring” in this tree. [sent-8, score-0.642]
7 Recent work has considered such tree structures in the context of on-line conversations [13], chain letters [5, 9, 16], on-line product recommendations [11, 15], and other forms of forwarded e-mail [18]. [sent-9, score-0.312]
8 A fundamental type of social contagion is one in which the item, by its very nature, accumulates information about the paths it follows as it travels through the social network. [sent-12, score-0.285]
9 One example of such a self-recording item is an on-line petition that spreads virally by e-mail — in other words, a chain-letter petition [9, 16]. [sent-14, score-1.382]
10 Each recipient who wants to take part in the petition signs his or her name to it and forwards copies of it to friends. [sent-15, score-0.775]
11 In this way, each copy of the petition contains a growing list of names that corresponds to a path all the way back to the initiator of the petition. [sent-16, score-0.988]
12 Liben-Nowell and Kleinberg studied the following framework for reconstructing the spread of chain-letter petitions [16]. [sent-20, score-0.326]
13 The originator of the petition is the root of T . [sent-22, score-0.658]
14 For a given petition, the tree T is the structure we wish to understand, because it captures how the message spreads through the population. [sent-23, score-0.405]
15 But in general we cannot hope to observe T : assuming that the petition spreads through individuals’ e-mail accounts, hosted by multiple providers, there is no single organization that has all the information needed to reconstruct T . [sent-24, score-0.782]
16 For each person v who signs the petition, there is a small probability δ > 0 that v will also publicly post their copy of it. [sent-26, score-0.302]
17 When v is exposed, we see not only that v belongs to T , but we also see v’s path all the way back to the root r of T , due to the list of names on v’s copy of the petition. [sent-28, score-0.438]
18 Thus, if the set of people who post their copy of the petition is {v1 , v2 , . . . , vk }, then the portion of T we observe is the union of the paths from each vi back to the root. [sent-29, score-0.805]
19 We refer to this process as the δ-sampling of a tree T : each node v is exposed independently with probability δ, and then all nodes on any path from the root to an exposed node (including the exposed nodes themselves) are revealed. [sent-36, score-1.899]
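The δ-sampling process just defined can be sketched in a few lines of Python. This is a minimal illustration of our own, not the paper's code; the `parent`-map representation and function names are assumptions made for the example:

```python
import random

def delta_sample(parent, delta, rng=None):
    """Simulate delta-sampling of a rooted tree.

    parent[v] is the parent of node v (parent[root] is None).
    Each node is exposed independently with probability delta; every
    node on the path from the root to an exposed node is revealed.
    Returns (revealed, exposed) as sets of node ids.
    """
    rng = rng or random.Random()
    exposed = {v for v in parent if rng.random() < delta}
    revealed = set()
    for v in exposed:
        # Walk from the exposed node up to the root, revealing as we go.
        while v is not None and v not in revealed:
            revealed.add(v)
            v = parent[v]
    return revealed, exposed

# A path 0-1-2-3-4 rooted at 0: with delta = 1 every node is exposed,
# so the whole path is revealed.
parent = {0: None, 1: 0, 2: 1, 3: 2, 4: 3}
revealed, exposed = delta_sample(parent, delta=1.0)
print(sorted(revealed))  # [0, 1, 2, 3, 4]
```

Note that a single exposure deep in the tree reveals its entire root path, which is what makes the revealed trees so "stringy".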
20 Understanding the relationship between Tδ and T is a fundamental question, since empirically we are often in a setting where we can observe Tδ and want to reason about properties of the larger unobserved tree T . [sent-38, score-0.4]
21 This is the basic issue we address in this paper: to understand the observation of a tree under δ-sampling. [sent-40, score-0.299]
22 In particular, the observed trees that they reconstructed had a very large single-child fraction — the fraction of nodes with only one child was above 94%. [sent-42, score-0.578]
23 Second, existing work on this topic has so far not provided any framework capable of addressing what is perhaps the most basic question about δ-sampling: given a tree Tδ with its set of exposed nodes indicated — i.e., a single outcome of the δ-sampling process — can we infer the number of nodes in the original tree T ? [sent-48, sent-52]
24 Footnote: Some petitions are hosted by a single Web site, rather than relying on social contagion; however, our focus here is on those that spread via person-to-person communication. [sent-50, score-0.458]
26 We do not require any assumption that the unobserved tree T arises from a branching process; the tree may be arbitrary as long as the degrees are bounded. [sent-58, score-0.792]
27 Second, we consider the problem of estimating the size of T , which we define as the number of nodes it contains. [sent-61, score-0.313]
28 In the basic formulation of the problem, we ask: given a single draw of Tδ , with its set of exposed nodes indicated, can we estimate the size of T to within a 1 ± ε factor with high probability for any constant ε > 0? [sent-62, score-0.565]
29 This corresponds to roughly 3.5 million copies of the e-mailed petition when both signers and non-signers are considered. [sent-69, score-0.762]
30 For this model, we show that as long as the offspring distribution of the Galton–Watson process has finite variance and unit expectation we can estimate the size of the unobserved tree T . [sent-73, score-0.54]
31 A further extension relaxes the assumption that when a node v makes its copy of the petition public, the path is revealed all the way back to the root. [sent-77, score-1.163]
32 Indeed, if we don’t bound the maximum number of children at any node, then the star graph — a single node with n − 1 children — has a single-child fraction of 0 with high probability for any non-trivial δ. [sent-79, score-0.454]
33 And if we don’t consider the case of δ → 0, then each node with multiple children has a constant probability of having several of them made public and hence the single-child fraction can’t converge to 1, unless the original tree was composed almost exclusively of single-child nodes to begin with. [sent-80, score-0.981]
34 For example, if Tδ is simply a star with s leaves, this observed tree is consistent with δ-sampling applied to any n-leaf star, for any value of n ≥ s and with δ = s/n. [sent-81, score-0.34]
35 Thus, our estimation methods work even when the data provides much less than a full path back to the root for each node made public. [sent-84, score-0.345]
36 2 Single-Child Fraction. We begin by showing that in any bounded-degree tree, the fraction of single-child nodes converges to 1 as the sampling rate δ goes to 0. [sent-86, score-0.408]
37 First of all, let the unobserved tree T have n nodes, each having at most k children. [sent-88, score-0.359]
38 Let us say that v is a branching node if it has more than one child. [sent-89, score-0.289]
39 (That is, we partition the nodes of the tree into three disjoint categories: leaves, single-child nodes, and branching nodes. [sent-90, score-0.713]
40 ) In any bounded-degree tree, the number of leaves and the number of branching nodes are within a constant factor of each other; in particular, this will be true in the revealed tree Tδ . [sent-91, score-0.94]
41 Now, all leaves in Tδ are nodes that are exposed (i.e., they made their copies public), so in expectation there are at most δn of them. [sent-92, score-0.552]
42 Thus, there will also be O(δn) branching nodes, and all other nodes in Tδ must be single-child nodes. [sent-95, score-0.439]
43 This follows from our main theorem, which asserts that with high probability, Tδ has Ω(δn logk δ−1) nodes in total. [sent-97, score-0.393]
44 Since there are only O(δn) leaves and branching nodes, the remaining nodes must be single-child nodes — and since the size of Tδ exceeds O(δn) by a factor of Ω(logk δ −1 ), the fraction of single-child nodes in Tδ must therefore converge to 1 as δ goes to 0. [sent-98, score-1.206]
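The argument above is easy to check empirically. The sketch below is our own illustration, not the paper's code: it δ-samples a complete binary tree and reports the fraction of revealed nodes with exactly one revealed child, which climbs toward 1 as δ shrinks:

```python
import random

def single_child_fraction(parent, delta, rng):
    """delta-sample the tree given by the parent map and return the
    fraction of revealed nodes that have exactly one revealed child."""
    exposed = {v for v in parent if rng.random() < delta}
    revealed = set()
    for v in exposed:
        while v is not None and v not in revealed:
            revealed.add(v)
            v = parent[v]
    if not revealed:
        return 0.0
    # Count, for each revealed node, how many of its children were revealed.
    child_count = {v: 0 for v in revealed}
    for v in revealed:
        if parent[v] in child_count:
            child_count[parent[v]] += 1
    return sum(1 for v in revealed if child_count[v] == 1) / len(revealed)

# Complete binary tree on 2^16 - 1 nodes: parent of v is (v - 1) // 2.
n = 2**16 - 1
parent = {0: None, **{v: (v - 1) // 2 for v in range(1, n)}}
rng = random.Random(0)
for delta in (0.1, 0.01, 0.001):
    print(delta, round(single_child_fraction(parent, delta, rng), 3))
```

As δ decreases the exposed nodes become sparse, their root paths share less structure, and single-child nodes dominate the revealed tree — the "stringy" pattern.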
45 We then argue that in a constant fraction of these trees Ti , a node of Ti at distance at least Ω(logk δ −1 ) from Ti ’s root will be exposed, which will result in the appearance of Ω(logk δ −1 ) nodes in Ti . [sent-106, score-0.613]
46 Let Tδ be the random subtree of T revealed by the δ-sampling process, and let Xδ be the number of its internal nodes. [sent-111, score-0.3]
47 We now follow the plan outlined at the beginning of this section, using this theorem to conclude that the fraction of single-child nodes converges to 1. [sent-113, score-0.432]
48 We have already argued that Tδ will have at most O(δn) branching nodes with high probability. [sent-116, score-0.439]
49 Given a tree T on n nodes, a sampling rate δ, and a number M ≤ n, let p be the probability that the size of the tree Tδ revealed by the δ-sampling process is at most M . [sent-119, score-0.797]
50 Let m be the number of nodes in Tδ and m1 be the number of single-child nodes in Tδ . [sent-120, score-0.56]
51 Combining these bounds, we obtain the main result about single-child nodes as the following corollary. [sent-124, score-0.28]
52 If, in T , each node has at most one child, then T is a path — in which case, an easy argument shows that almost every node will be revealed, and that necessarily only one of the revealed nodes will not have exactly one child. [sent-129, score-0.795]
53 Let m and m1 be, respectively, the number of nodes, and the number of nodes with exactly one child, in Tδ . [sent-133, score-0.28]
54 Applying the corollary, we obtain that the fraction of single-child nodes in the revealed tree will approach 1 − O(ε) with probability 1 − exp(−Ω(δn)). [sent-136, score-0.825]
55 3 Estimation. As before, given an unknown tree T , let Tδ be the tree revealed by the δ-sampling process. [sent-137, score-0.693]
56 Let V = V (Tδ ) be the set of nodes of Tδ , let L ⊆ V be the set of its leaves, and let E ⊆ V be the ˆ set of its nodes that were exposed. [sent-139, score-0.56]
57 For the unbiased estimator δ̂, we consider the set of all nodes “above” the leaves of Tδ — that is, internal nodes on a path from a leaf of Tδ to the root — and we use the empirical fraction of exposures in this set as our value for δ̂. [sent-141, score-1.128]
58 After establishing that δ̂ is an unbiased estimator, we show that the probability of a large deviation between δ̂ and δ decreases exponentially in |V − L|, the number of internal nodes of Tδ. [sent-142, score-0.434]
59 Thus, to show a high-probability bound for our size estimate, we need to establish a lower bound on the number of internal nodes of Tδ, which will be the final step in the analysis. [sent-143, score-0.496]
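A minimal sketch of this estimator, in our own rendering (variable and function names are ours, not the paper's): δ̂ is the fraction of exposed nodes among the internal nodes of Tδ, and the size estimate is n̂ = |E|/δ̂.

```python
def estimate_tree_size(revealed, exposed, parent):
    """Estimate |T| from one delta-sampled outcome.

    revealed: nodes of T_delta; exposed: the subset that was exposed;
    parent[v]: parent of v in T_delta (None for the root).
    delta_hat = exposed internal nodes / internal nodes; n_hat = |E| / delta_hat.
    """
    # Internal nodes of T_delta: revealed nodes with a revealed child.
    has_child = {parent[v] for v in revealed if parent[v] is not None}
    internal = revealed & has_child
    exposed_internal = exposed & internal
    if not exposed_internal:
        raise ValueError("no exposed internal nodes; cannot estimate delta")
    delta_hat = len(exposed_internal) / len(internal)
    return len(exposed) / delta_hat

# Sanity check on a path 0-1-2-3-4 with every node exposed (delta = 1):
# delta_hat = 1 and the estimate recovers the true size exactly.
parent = {0: None, 1: 0, 2: 1, 3: 2, 4: 3}
nodes = set(parent)
print(estimate_tree_size(nodes, nodes, parent))  # 5.0
```

The key design point is that only internal nodes enter δ̂: leaves of Tδ are revealed *because* they were exposed, so including them would bias the exposure-rate estimate upward.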
60 Following the plan outlined above, we consider the independent exposures of all nodes that lie above the leaves of Tδ , resulting in the set E −L ⊆ V −L. [sent-151, score-0.424]
61 If |V | ≥ 2, then the size n of the unknown tree T satisfies Pr[|n̂ − n| ≤ εn] ≥ 1 − e−Θ(·), where the exponent grows with the number of internal nodes of Tδ. [sent-164, score-0.307]
62 3.1 Trees with sublinear maximum degree. Our bounds thus far show that n̂ is close to n with a probability that decreases exponentially in the number of internal nodes |V − L|. [sent-166, score-0.418]
63 5 To do this, we require a theorem that guarantees that the number of internal nodes is at least an explicit function of n; this function can then be used in place of |V − L| in the probability bounds. [sent-168, score-0.427]
64 The crux of the proof is to show that if a node v has kv children, and δkv ≤ 1, then the probability that v is revealed is at least a constant times the expected number of the exposed children of v: that is, Ω(δkv ); if, instead, δkv > 1, then the probability that v is revealed is Ω(1). [sent-171, score-0.803]
65 The result then follows from a bound on the number of nodes of degree greater than δ −1 , linearity of expectation, and Chernoff bounds. [sent-172, score-0.373]
66 Then, the number Xδ of internal nodes of Tδ satisfies Pr[Xδ ≥ (1 − e−1) · min(k−1, δ) · (n − 1)] ≥ 1 − e−Θ(n·min(k−1, δ)). [sent-177, score-0.357]
67 The smallest δ our estimator can tolerate is roughly n−1/2: if δ ≥ Ω(n−1/2) and no node in the unknown tree has more than δ−1 children (observe that then δ−1 ≤ O(√n)), then with probability 1 − η, the n̂ returned by the estimator is within a multiplicative 1 ± ε factor of the actual n. [sent-188, score-0.541]
68 The main fact we require about such branching processes is that the height of a uniformly chosen node from a branching process tree (with offspring distribution having finite variance, unit expectation, and conditioned on being of size n) is at least Ω(n1/2−ε) with high probability [2]. [sent-194, score-0.909]
69 4 Concentration. As we observed in previous sections, the size of Tδ plays a prominent role in determining both the fraction of single-child nodes and the size of the unknown tree T. [sent-197, score-0.744]
70 In this section we prove some concentration results on the quantity |Tδ | — that is, we will bound the probability that |Tδ | is far from its mean, over random outcomes of the δ-sampling process applied to the underlying tree T . [sent-198, score-0.455]
71 Let T be a rooted tree, and let Tδ be the random subtree of T revealed by the δ-sampling process. [sent-203, score-0.286]
72 The proof requires an intricate balancing of two kinds of nodes — those “high” in T , with many descendants, and those “low” in T , with few descendants. [sent-206, score-0.28]
73 Let T be a rooted tree on n nodes, with height at most H. [sent-210, score-0.337]
74 To do this, let T be a tree whose root is connected directly to n − 1 − δ −1 leaves and also to a path of length δ −1 . [sent-216, score-0.52]
75 Then Tδ will not contain any node in the path with probability (1 − δ)^(1/δ), which tends to e−1 as δ → 0. [sent-217, score-0.298]
76 If Tδ does not contain any node in the path, then it will only contain nodes adjacent to the root. [sent-218, score-0.458]
77 On the other hand, the probability that Tδ will contain exactly one node in the path is δ−1 · δ · (1 − δ)^(1/δ − 1), which also tends to e−1 as δ → 0. [sent-220, score-0.298]
78 Under this conditioning, the single exposed node in the path is uniformly distributed over the path itself, so with probability one half it lies in the lower half of the path — causing the upper half of the path to be revealed. [sent-221, score-0.604]
79 Hence with constant probability at least L = Ω(δ −1 ) nodes will be revealed. [sent-222, score-0.314]
80 If δ = o(n−1/2), we have L/U = Ω(δ−2 n−1) = ω(n)/n = ω(1), from which it follows that the number of nodes of Tδ is not concentrated when δ = o(n−1/2). [sent-224, score-0.307]
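The non-concentration construction above is simple to simulate. The sketch below is our own, with illustrative parameter values: a root with n − 1 − δ−1 leaves plus a path of length δ−1. The size of Tδ swings between trials depending on whether, and where, the path is exposed:

```python
import random

rng = random.Random(1)
n, delta = 20_000, 2e-3            # delta = o(n**-0.5) for this n
path_len = int(1 / delta)          # a path of length 500 under the root
n_leaves = n - 1 - path_len        # the remaining nodes are leaves of the root

sizes = []
for _ in range(100):
    # Each leaf of the root is revealed iff it is exposed.
    leaf_count = sum(rng.random() < delta for _ in range(n_leaves))
    # Exposing the path node at depth d reveals the d path nodes down to it,
    # so the path contributes the deepest exposed depth.
    deepest = max((d for d in range(1, path_len + 1) if rng.random() < delta),
                  default=0)
    sizes.append(1 + leaf_count + deepest)   # +1 for the root
print(min(sizes), max(sizes))      # the size varies by a large factor
```

With constant probability no path node is exposed (size ≈ δn), and with constant probability a deep path node is exposed (size ≈ δ−1), so |Tδ| is not concentrated in this regime.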
81 5 The Iraq-War Petition. Using the framework developed in the previous sections, we now turn to the anti-war petition studied by Liben-Nowell and Kleinberg. [sent-225, score-0.604]
82 The Iraq-War tree observed by Liben-Nowell and Kleinberg — after they did some mild preprocessing to clean the data — was deep and narrow, and contained the characteristic “stringy” pattern analyzed in Section 2, with over 94% of nodes having exactly one child. [sent-227, score-0.586]
83 The observed Iraq-War tree contained |V | = 18,119 nodes and |E| = 620 exposed nodes, of which |L| = 557 were exposed leaves. [sent-228, score-0.966]
84 The estimator gives δ̂ = (|E| − |L|)/(|V | − |L|) = 63/17,562 ≈ 0.00359, and we estimate the size of the unobserved Iraq-War tree to be n̂ = |E|/δ̂ ≈ 172,832. [sent-230, score-0.42]
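These figures are easy to re-derive from |V |, |E|, and |L| alone — a quick sanity check of the arithmetic, not the authors' code:

```python
# Observed Iraq-War tree: |V| nodes, |E| exposed, |L| exposed leaves.
V, E, L = 18_119, 620, 557

internal = V - L             # 17,562 internal nodes of the observed tree
exposed_internal = E - L     # 63 of them were exposed
delta_hat = exposed_internal / internal
n_hat = E / delta_hat        # size estimate |E| / delta_hat

print(round(delta_hat, 5))   # 0.00359
print(round(n_hat))          # 172832
```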
85 For this purpose, we pose the question concretely as follows: if the observed Iraq-War tree arose via δ-sampling from an arbitrary unobserved tree T of size n, what is the probability of the event that the estimate n̂ produced by our algorithm lies in the interval [n/2, 2n]? [sent-233, score-0.76]
86 (Recall that our estimation algorithm is deterministic; the probability here is taken over the random choices of nodes exposed by the δ-sampling process applied to the arbitrary fixed tree T.) [sent-234, score-0.839]
87 For any tree T of size n, assuming the observed Iraq-War tree was produced via δ-sampling of T, the probability that n̂ lies in the interval [n/2, 2n] is at least 95%. [sent-238, score-0.613]
88 A person who signs the petition forwards the petition, on average, to 20.38 others. [sent-249, score-0.738]
89 The signers in the unobserved tree thus sent a total of ≈ 3,520,595 copies. [sent-252, score-0.505]
90 Finally, the δ-sampling process is a very simple abstraction of the process by which a widely circulated message becomes public, and with further inspection of the Iraq-War tree observed by Liben-Nowell and Kleinberg, we can begin to identify potential limitations of the basic δ-sampling model. [sent-254, score-0.486]
91 Principally, we have been assuming that each individual signatory of the petition exposes her petition copy independently with probability δ. [sent-255, score-1.435]
92 One of the most common mechanisms that exposes a petition e-mail is when that e-mail is sent to a mailing list that archives its messages on the Web. [sent-257, score-0.852]
93 We can quantify this independence issue explicitly by noting that many of the exposed internal nodes in the observed Iraq-War tree are close to the leaves of the tree. [sent-259, score-0.96]
94 In particular, 48 of the 63 exposed internal nodes are within 10 hops of a leaf, out of only 5351 total such nodes. [sent-260, score-0.547]
95 Thus the exposure rate for internal nodes within distance 10 of a leaf is 48/5351 ≈ 0.00897, while the exposure rate for internal nodes more than distance 10 from any leaf is 15/12211 ≈ 0.00123. [sent-261, sent-262]
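The two rates, and the roughly sevenfold gap between them, follow directly from the counts just given (a quick check of the arithmetic):

```python
# Exposed internal nodes near (within 10 hops of) a leaf vs. far from any leaf.
near_rate = 48 / 5351
far_rate = 15 / 12211

print(round(near_rate, 5))             # 0.00897
print(round(far_rate, 5))              # 0.00123
print(round(near_rate / far_rate, 1))  # near-leaf exposure is ~7.3x more likely
```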
97 6 Conclusion. When information spreads through a social network, it often does so along a branching structure that can be reasonably modeled as a tree; but when we observe this spreading process, we frequently see only a portion of the full tree. [sent-264, score-0.445]
98 When we apply these techniques to data such as the Iraq-War petition in Section 5, our conclusions must clearly be interpreted in light of the model’s underlying assumptions. [sent-266, score-0.604]
99 Among these assumptions, the requirement of bounded degree may generally be fairly mild, since it essentially requires the tree of interest simply to be large enough compared to the number of children at any one node. [sent-267, score-0.365]
100 Arguably more restrictive is the assumption that each node makes an independent decision about posting its copy of the information, and with the same fixed probability δ. [sent-268, score-0.377]
simIndex simValue paperId paperTitle
same-paper 1 1.0000004 234 nips-2011-Reconstructing Patterns of Information Diffusion from Incomplete Observations
2 0.17448434 96 nips-2011-Fast and Balanced: Efficient Label Tree Learning for Large Scale Object Recognition
Author: Jia Deng, Sanjeev Satheesh, Alexander C. Berg, Fei Li
Abstract: We present a novel approach to efficiently learn a label tree for large scale classification with many classes. The key contribution of the approach is a technique to simultaneously determine the structure of the tree and learn the classifiers for each node in the tree. This approach also allows fine grained control over the efficiency vs accuracy trade-off in designing a label tree, leading to more balanced trees. Experiments are performed on large scale image classification with 10184 classes and 9 million images. We demonstrate significant improvements in test accuracy and efficiency with less training time and more balanced trees compared to the previous state of the art by Bengio et al. 1
3 0.14614394 226 nips-2011-Projection onto A Nonnegative Max-Heap
Author: Jun Liu, Liang Sun, Jieping Ye
Abstract: We consider the problem of computing the Euclidean projection of a vector of length p onto a non-negative max-heap—an ordered tree where the values of the nodes are all nonnegative and the value of any parent node is no less than the value(s) of its child node(s). This Euclidean projection plays a building block role in the optimization problem with a non-negative maxheap constraint. Such a constraint is desirable when the features follow an ordered tree structure, that is, a given feature is selected for the given regression/classification task only if its parent node is selected. In this paper, we show that such Euclidean projection problem admits an analytical solution and we develop a top-down algorithm where the key operation is to find the so-called maximal root-tree of the subtree rooted at each node. A naive approach for finding the maximal root-tree is to enumerate all the possible root-trees, which, however, does not scale well. We reveal several important properties of the maximal root-tree, based on which we design a bottom-up algorithm with merge for efficiently finding the maximal roottree. The proposed algorithm has a (worst-case) linear time complexity for a sequential list, and O(p2 ) for a general tree. We report simulation results showing the effectiveness of the max-heap for regression with an ordered tree structure. Empirical results show that the proposed algorithm has an expected linear time complexity for many special cases including a sequential list, a full binary tree, and a tree with depth 1. 1
4 0.1440551 140 nips-2011-Kernel Embeddings of Latent Tree Graphical Models
Author: Le Song, Eric P. Xing, Ankur P. Parikh
Abstract: Latent tree graphical models are natural tools for expressing long range and hierarchical dependencies among many variables which are common in computer vision, bioinformatics and natural language processing problems. However, existing models are largely restricted to discrete and Gaussian variables due to computational constraints; furthermore, algorithms for estimating the latent tree structure and learning the model parameters are largely restricted to heuristic local search. We present a method based on kernel embeddings of distributions for latent tree graphical models with continuous and non-Gaussian variables. Our method can recover the latent tree structures with provable guarantees and perform local-minimum free parameter learning and efficient inference. Experiments on simulated and real data show the advantage of our proposed approach. 1
5 0.12658225 242 nips-2011-See the Tree Through the Lines: The Shazoo Algorithm
Author: Fabio Vitale, Nicolò Cesa-bianchi, Claudio Gentile, Giovanni Zappella
Abstract: Predicting the nodes of a given graph is a fascinating theoretical problem with applications in several domains. Since graph sparsification via spanning trees retains enough information while making the task much easier, trees are an important special case of this problem. Although it is known how to predict the nodes of an unweighted tree in a nearly optimal way, in the weighted case a fully satisfactory algorithm is not available yet. We fill this hole and introduce an efficient node predictor, S HAZOO, which is nearly optimal on any weighted tree. Moreover, we show that S HAZOO can be viewed as a common nontrivial generalization of both previous approaches for unweighted trees and weighted lines. Experiments on real-world datasets confirm that S HAZOO performs well in that it fully exploits the structure of the input tree, and gets very close to (and sometimes better than) less scalable energy minimization methods. 1
6 0.12528235 115 nips-2011-Hierarchical Topic Modeling for Analysis of Time-Evolving Personal Choices
7 0.11703935 177 nips-2011-Multi-armed bandits on implicit metric spaces
8 0.1124691 170 nips-2011-Message-Passing for Approximate MAP Inference with Latent Variables
9 0.10224251 267 nips-2011-Spectral Methods for Learning Multivariate Latent Tree Structure
10 0.0752469 208 nips-2011-Optimistic Optimization of a Deterministic Function without the Knowledge of its Smoothness
11 0.069931149 233 nips-2011-Rapid Deformable Object Detection using Dual-Tree Branch-and-Bound
12 0.069478773 6 nips-2011-A Global Structural EM Algorithm for a Model of Cancer Progression
13 0.063767314 151 nips-2011-Learning a Tree of Metrics with Disjoint Visual Features
14 0.06257952 157 nips-2011-Learning to Search Efficiently in High Dimensions
15 0.061954394 274 nips-2011-Structure Learning for Optimization
16 0.061772242 43 nips-2011-Bayesian Partitioning of Large-Scale Distance Data
17 0.060118359 141 nips-2011-Large-Scale Category Structure Aware Image Categorization
18 0.058198322 146 nips-2011-Learning Higher-Order Graph Structure with Features by Structure Penalty
19 0.053958237 72 nips-2011-Distributed Delayed Stochastic Optimization
20 0.052722488 150 nips-2011-Learning a Distance Metric from a Network
topicId topicWeight
[(0, 0.156), (1, -0.003), (2, -0.058), (3, 0.003), (4, -0.017), (5, -0.139), (6, -0.079), (7, -0.083), (8, 0.015), (9, -0.225), (10, -0.012), (11, 0.001), (12, -0.215), (13, 0.126), (14, -0.003), (15, 0.063), (16, 0.004), (17, -0.073), (18, 0.004), (19, -0.105), (20, 0.062), (21, -0.164), (22, -0.102), (23, -0.005), (24, -0.106), (25, 0.05), (26, -0.053), (27, 0.025), (28, -0.037), (29, -0.014), (30, -0.067), (31, 0.064), (32, 0.018), (33, -0.016), (34, 0.055), (35, -0.015), (36, 0.035), (37, 0.072), (38, -0.012), (39, 0.011), (40, -0.053), (41, -0.019), (42, 0.024), (43, -0.028), (44, 0.035), (45, -0.037), (46, 0.019), (47, -0.024), (48, 0.003), (49, 0.004)]
simIndex simValue paperId paperTitle
same-paper 1 0.97585833 234 nips-2011-Reconstructing Patterns of Information Diffusion from Incomplete Observations
Author: Flavio Chierichetti, David Liben-nowell, Jon M. Kleinberg
Abstract: Motivated by the spread of on-line information in general and on-line petitions in particular, recent research has raised the following combinatorial estimation problem. There is a tree T that we cannot observe directly (representing the structure along which the information has spread), and certain nodes randomly decide to make their copy of the information public. In the case of a petition, the list of names on each public copy of the petition also reveals a path leading back to the root of the tree. What can we conclude about the properties of the tree we observe from these revealed paths, and can we use the structure of the observed tree to estimate the size of the full unobserved tree T ? Here we provide the first algorithm for this size estimation task, together with provable guarantees on its performance. We also establish structural properties of the observed tree, providing the first rigorous explanation for some of the unusual structural phenomena present in the spread of real chain-letter petitions on the Internet. 1
2 0.82583988 242 nips-2011-See the Tree Through the Lines: The Shazoo Algorithm
Author: Fabio Vitale, Nicolò Cesa-bianchi, Claudio Gentile, Giovanni Zappella
Abstract: Predicting the nodes of a given graph is a fascinating theoretical problem with applications in several domains. Since graph sparsification via spanning trees retains enough information while making the task much easier, trees are an important special case of this problem. Although it is known how to predict the nodes of an unweighted tree in a nearly optimal way, in the weighted case a fully satisfactory algorithm is not available yet. We fill this hole and introduce an efficient node predictor, S HAZOO, which is nearly optimal on any weighted tree. Moreover, we show that S HAZOO can be viewed as a common nontrivial generalization of both previous approaches for unweighted trees and weighted lines. Experiments on real-world datasets confirm that S HAZOO performs well in that it fully exploits the structure of the input tree, and gets very close to (and sometimes better than) less scalable energy minimization methods. 1
3 0.82319105 226 nips-2011-Projection onto A Nonnegative Max-Heap
Author: Jun Liu, Liang Sun, Jieping Ye
Abstract: We consider the problem of computing the Euclidean projection of a vector of length p onto a non-negative max-heap—an ordered tree where the values of the nodes are all nonnegative and the value of any parent node is no less than the value(s) of its child node(s). This Euclidean projection plays a building-block role in the optimization problem with a non-negative max-heap constraint. Such a constraint is desirable when the features follow an ordered tree structure, that is, a given feature is selected for the given regression/classification task only if its parent node is selected. In this paper, we show that such a Euclidean projection problem admits an analytical solution, and we develop a top-down algorithm where the key operation is to find the so-called maximal root-tree of the subtree rooted at each node. A naive approach for finding the maximal root-tree is to enumerate all the possible root-trees, which, however, does not scale well. We reveal several important properties of the maximal root-tree, based on which we design a bottom-up algorithm with merge for efficiently finding the maximal root-tree. The proposed algorithm has a (worst-case) linear time complexity for a sequential list, and O(p^2) for a general tree. We report simulation results showing the effectiveness of the max-heap for regression with an ordered tree structure. Empirical results show that the proposed algorithm has an expected linear time complexity for many special cases including a sequential list, a full binary tree, and a tree with depth 1.
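The constraint set described in the abstract — every value nonnegative, every parent at least as large as its children — is easy to state as a feasibility check. This is a minimal sketch of that check only, not the paper's projection algorithm; the tree layout and names are invented for illustration.

```python
def is_nonneg_max_heap(values, children):
    """Check the constraint: all values nonnegative, each parent >= each child."""
    nonneg = all(v >= 0 for v in values.values())
    ordered = all(values[p] >= values[c]
                  for p, cs in children.items() for c in cs)
    return nonneg and ordered

# Hypothetical tree: node 0 is the root with children 1 and 2; node 1 has child 3.
children = {0: [1, 2], 1: [3], 2: [], 3: []}

feasible = is_nonneg_max_heap({0: 5.0, 1: 3.0, 2: 4.0, 3: 1.0}, children)
infeasible = is_nonneg_max_heap({0: 2.0, 1: 3.0, 2: 1.0, 3: 0.5}, children)
# The second assignment violates the heap order: node 1 (3.0) exceeds its parent (2.0).
```

The projection problem is then: given an arbitrary vector, find the closest (in Euclidean distance) assignment for which this check passes.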
4 0.74561781 267 nips-2011-Spectral Methods for Learning Multivariate Latent Tree Structure
Author: Animashree Anandkumar, Kamalika Chaudhuri, Daniel J. Hsu, Sham M. Kakade, Le Song, Tong Zhang
Abstract: This work considers the problem of learning the structure of multivariate linear tree models, which include a variety of directed tree graphical models with continuous, discrete, and mixed latent variables, such as linear-Gaussian models, hidden Markov models, Gaussian mixture models, and Markov evolutionary trees. The setting is one where we only have samples from certain observed variables in the tree, and our goal is to estimate the tree structure (i.e., the graph of how the underlying hidden variables are connected to each other and to the observed variables). We propose the Spectral Recursive Grouping algorithm, an efficient and simple bottom-up procedure for recovering the tree structure from independent samples of the observed variables. Our finite sample size bounds for exact recovery of the tree structure reveal certain natural dependencies on statistical and structural properties of the underlying joint distribution. Furthermore, our sample complexity guarantees have no explicit dependence on the dimensionality of the observed variables, making the algorithm applicable to many high-dimensional settings. At the heart of our algorithm is a spectral quartet test for determining the relative topology of a quartet of variables from second-order statistics.
5 0.72903687 177 nips-2011-Multi-armed bandits on implicit metric spaces
Author: Aleksandrs Slivkins
Abstract: The multi-armed bandit (MAB) setting is a useful abstraction of many online learning tasks which focuses on the trade-off between exploration and exploitation. In this setting, an online algorithm has a fixed set of alternatives (“arms”), and in each round it selects one arm and then observes the corresponding reward. While the case of a small number of arms is by now well understood, a lot of recent work has focused on multi-armed bandits with (infinitely) many arms, where one needs to assume extra structure in order to make the problem tractable. In particular, in the Lipschitz MAB problem there is an underlying similarity metric space, known to the algorithm, such that any two arms that are close in this metric space have similar payoffs. In this paper we consider the more realistic scenario in which the metric space is implicit – it is defined by the available structure but not revealed to the algorithm directly. Specifically, we assume that an algorithm is given a tree-based classification of arms. For any given problem instance such a classification implicitly defines a similarity metric space, but the numerical similarity information is not available to the algorithm. We provide an algorithm for this setting, whose performance guarantees (almost) match the best known guarantees for the corresponding instance of the Lipschitz MAB problem.
6 0.72404951 208 nips-2011-Optimistic Optimization of a Deterministic Function without the Knowledge of its Smoothness
7 0.63542747 115 nips-2011-Hierarchical Topic Modeling for Analysis of Time-Evolving Personal Choices
8 0.6239332 140 nips-2011-Kernel Embeddings of Latent Tree Graphical Models
9 0.61347359 96 nips-2011-Fast and Balanced: Efficient Label Tree Learning for Large Scale Object Recognition
10 0.60374683 6 nips-2011-A Global Structural EM Algorithm for a Model of Cancer Progression
11 0.59503323 274 nips-2011-Structure Learning for Optimization
12 0.52344543 170 nips-2011-Message-Passing for Approximate MAP Inference with Latent Variables
13 0.48208767 43 nips-2011-Bayesian Partitioning of Large-Scale Distance Data
14 0.46772593 233 nips-2011-Rapid Deformable Object Detection using Dual-Tree Branch-and-Bound
15 0.46553504 300 nips-2011-Variance Reduction in Monte-Carlo Tree Search
16 0.43737447 74 nips-2011-Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection
17 0.40656066 31 nips-2011-An Application of Tree-Structured Expectation Propagation for Channel Decoding
18 0.35267341 157 nips-2011-Learning to Search Efficiently in High Dimensions
19 0.34571004 256 nips-2011-Solving Decision Problems with Limited Information
20 0.33153826 151 nips-2011-Learning a Tree of Metrics with Disjoint Visual Features
simIndex simValue paperId paperTitle
1 0.92355251 224 nips-2011-Probabilistic Modeling of Dependencies Among Visual Short-Term Memory Representations
Author: Emin Orhan, Robert A. Jacobs
Abstract: Extensive evidence suggests that items are not encoded independently in visual short-term memory (VSTM). However, previous research has not quantitatively considered how the encoding of an item influences the encoding of other items. Here, we model the dependencies among VSTM representations using a multivariate Gaussian distribution with a stimulus-dependent mean and covariance matrix. We report the results of an experiment designed to determine the specific form of the stimulus-dependence of the mean and the covariance matrix. We find that the magnitude of the covariance between the representations of two items is a monotonically decreasing function of the difference between the items’ feature values, similar to a Gaussian process with a distance-dependent, stationary kernel function. We further show that this type of covariance function can be explained as a natural consequence of encoding multiple stimuli in a population of neurons with correlated responses.
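A stationary, distance-dependent kernel of the kind the abstract describes can be illustrated with the squared-exponential form: covariance is largest for identical feature values and decays monotonically with their difference. The specific kernel, variance, and length-scale here are assumptions for illustration, not the paper's fitted model.

```python
import math

def stationary_cov(x_i, x_j, sigma2=1.0, length=1.0):
    """Squared-exponential kernel: covariance decays with |x_i - x_j|."""
    return sigma2 * math.exp(-((x_i - x_j) ** 2) / (2 * length ** 2))

# Covariance between two items' VSTM representations falls off
# monotonically as their feature values grow apart.
near = stationary_cov(0.0, 0.1)   # similar items: strong dependence
far = stationary_cov(0.0, 2.0)    # dissimilar items: weak dependence
```

Filling a covariance matrix with `stationary_cov(x[i], x[j])` over all item pairs gives a stimulus-dependent covariance of the shape the abstract posits.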
same-paper 2 0.92294198 234 nips-2011-Reconstructing Patterns of Information Diffusion from Incomplete Observations
Author: Flavio Chierichetti, David Liben-nowell, Jon M. Kleinberg
Abstract: Motivated by the spread of on-line information in general and on-line petitions in particular, recent research has raised the following combinatorial estimation problem. There is a tree T that we cannot observe directly (representing the structure along which the information has spread), and certain nodes randomly decide to make their copy of the information public. In the case of a petition, the list of names on each public copy of the petition also reveals a path leading back to the root of the tree. What can we conclude about the properties of the tree we observe from these revealed paths, and can we use the structure of the observed tree to estimate the size of the full unobserved tree T? Here we provide the first algorithm for this size estimation task, together with provable guarantees on its performance. We also establish structural properties of the observed tree, providing the first rigorous explanation for some of the unusual structural phenomena present in the spread of real chain-letter petitions on the Internet.
3 0.87822384 170 nips-2011-Message-Passing for Approximate MAP Inference with Latent Variables
Author: Jiarong Jiang, Piyush Rai, Hal Daume
Abstract: We consider a general inference setting for discrete probabilistic graphical models where we seek maximum a posteriori (MAP) estimates for a subset of the random variables (max nodes), marginalizing over the rest (sum nodes). We present a hybrid message-passing algorithm to accomplish this. The hybrid algorithm passes a mix of sum and max messages depending on the type of source node (sum or max). We derive our algorithm by showing that it falls out as the solution of a particular relaxation of a variational framework. We further show that the Expectation Maximization algorithm can be seen as an approximation to our algorithm. Experimental results on synthetic and real-world datasets, against several baselines, demonstrate the efficacy of our proposed algorithm.
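The max-node/sum-node distinction in the abstract can be seen on a toy joint: marginal MAP first sums out the sum node, then maximizes over the max node, and its answer can disagree with the joint MAP. The 2x2 table below is a made-up example, not from the paper.

```python
# Hypothetical 2x2 joint p(x, y); x is the max node, y is the sum node.
p = {(0, 0): 0.1, (0, 1): 0.35, (1, 0): 0.3, (1, 1): 0.25}

# Marginal MAP: argmax_x sum_y p(x, y) — marginalize y, then maximize over x.
marg = {x: sum(p[(x, y)] for y in (0, 1)) for x in (0, 1)}
x_star = max(marg, key=marg.get)       # x = 1 wins: 0.55 vs 0.45

# Joint MAP: argmax_{x,y} p(x, y) — the single most probable assignment.
xy_star = max(p, key=p.get)            # (0, 1), whose x disagrees with x_star
```

The disagreement between `x_star` and `xy_star[0]` is why marginal MAP needs its own machinery (such as the hybrid sum/max messages above) rather than plain max-product.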
4 0.84510988 100 nips-2011-Gaussian Process Training with Input Noise
Author: Andrew Mchutchon, Carl E. Rasmussen
Abstract: In standard Gaussian process regression, input locations are assumed to be noise free. We present a simple yet effective GP model for training on input points corrupted by i.i.d. Gaussian noise. To make computations tractable we use a local linear expansion about each input point. This allows the input noise to be recast as output noise proportional to the squared gradient of the GP posterior mean. The input noise variances are inferred from the data as extra hyperparameters. They are trained alongside other hyperparameters by the usual method of maximisation of the marginal likelihood. Training uses an iterative scheme, which alternates between optimising the hyperparameters and calculating the posterior gradient. Analytic predictive moments can then be found for Gaussian distributed test points. We compare our model to others over a range of different regression problems and show that it improves over current methods. 
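The local-linear recast in the abstract says that input noise with variance s² induces output noise of roughly s² times the squared gradient. A quick Monte-Carlo check of that approximation on a toy function (the function, noise level, and sample count are arbitrary choices, not the paper's setup):

```python
import math
import random

random.seed(0)
f = math.sin                      # toy latent function
fprime = math.cos                 # its derivative
x0, sx = 1.0, 0.01                # input location and input-noise std

# Monte-Carlo estimate of the output variance induced by input noise.
samples = [f(x0 + random.gauss(0.0, sx)) for _ in range(200_000)]
mean = sum(samples) / len(samples)
mc_var = sum((s - mean) ** 2 for s in samples) / len(samples)

# Local-linear prediction: output variance ≈ sx^2 * f'(x0)^2.
lin_var = (sx * fprime(x0)) ** 2
```

For small input noise the two variances agree closely, which is what makes treating input noise as gradient-scaled output noise tractable.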
5 0.8367663 111 nips-2011-Hashing Algorithms for Large-Scale Learning
Author: Ping Li, Anshumali Shrivastava, Joshua L. Moore, Arnd C. König
Abstract: Minwise hashing is a standard technique in the context of search for efficiently computing set similarities. The recent development of b-bit minwise hashing provides a substantial improvement by storing only the lowest b bits of each hashed value. In this paper, we demonstrate that b-bit minwise hashing can be naturally integrated with linear learning algorithms such as linear SVM and logistic regression, to solve large-scale and high-dimensional statistical learning tasks, especially when the data do not fit in memory. We compare b-bit minwise hashing with the Count-Min (CM) and Vowpal Wabbit (VW) algorithms, which have essentially the same variances as random projections. Our theoretical and empirical comparisons illustrate that b-bit minwise hashing is significantly more accurate (at the same storage cost) than VW (and random projections) for binary data.
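The storage trick in the abstract — keeping only the lowest b bits of each minwise hash — can be sketched directly. Under the simplifying assumption that the retained bits are uniform when the full minhashes differ, the b-bit collision rate is roughly J + (1 - J)/2^b, which can be inverted to estimate the Jaccard similarity J. The hash family, parameters, and sets below are illustrative, not the paper's implementation.

```python
import random

def minhash_bbit(s, seeds, b=2, mod=2**31 - 1):
    """Minwise signatures keeping only the lowest b bits of each hash value."""
    mask = (1 << b) - 1
    sigs = []
    for a, c in seeds:
        m = min((a * x + c) % mod for x in s)   # minwise hash under (a, c)
        sigs.append(m & mask)                   # store just b bits of it
    return sigs

random.seed(1)
seeds = [(random.randrange(1, 2**31 - 1), random.randrange(2**31 - 1))
         for _ in range(500)]

A = set(range(0, 80))
B = set(range(20, 100))          # true Jaccard similarity = 60/100 = 0.6

matches = sum(x == y for x, y in zip(minhash_bbit(A, seeds),
                                     minhash_bbit(B, seeds)))
rate = matches / len(seeds)
# Invert rate ≈ J + (1 - J)/2^b (here b = 2, so 1/2^b = 0.25) to recover J.
j_hat = (rate - 0.25) / (1 - 0.25)
```

With 500 signatures the estimate lands near the true similarity of 0.6 while storing only 2 bits per hash; this compact signature is what gets fed to the linear learners discussed in the abstract.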
6 0.67868692 171 nips-2011-Metric Learning with Multiple Kernels
7 0.59435564 242 nips-2011-See the Tree Through the Lines: The Shazoo Algorithm
8 0.5811336 157 nips-2011-Learning to Search Efficiently in High Dimensions
9 0.55175138 226 nips-2011-Projection onto A Nonnegative Max-Heap
10 0.54752356 140 nips-2011-Kernel Embeddings of Latent Tree Graphical Models
11 0.54126436 89 nips-2011-Estimating time-varying input signals and ion channel states from a single voltage trace of a neuron
12 0.51934212 177 nips-2011-Multi-armed bandits on implicit metric spaces
13 0.50323832 31 nips-2011-An Application of Tree-Structured Expectation Propagation for Channel Decoding
14 0.50206596 135 nips-2011-Information Rates and Optimal Decoding in Large Neural Populations
15 0.50101811 219 nips-2011-Predicting response time and error rates in visual search
16 0.49822232 233 nips-2011-Rapid Deformable Object Detection using Dual-Tree Branch-and-Bound
17 0.48960793 82 nips-2011-Efficient coding of natural images with a population of noisy Linear-Nonlinear neurons
18 0.48890147 292 nips-2011-Two is better than one: distinct roles for familiarity and recollection in retrieving palimpsest memories
19 0.4883967 30 nips-2011-Algorithms for Hyper-Parameter Optimization
20 0.47038415 150 nips-2011-Learning a Distance Metric from a Network