nips nips2010 nips2010-131 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Eric Wang, Dehong Liu, Jorge Silva, Lawrence Carin, David B. Dunson
Abstract: We consider problems for which one has incomplete binary matrices that evolve with time (e.g., the votes of legislators on particular legislation, with each year characterized by a different such matrix). An objective of such analysis is to infer structure and inter-relationships underlying the matrices, here defined by latent features associated with each axis of the matrix. In addition, it is assumed that documents are available for the entities associated with at least one of the matrix axes. By jointly analyzing the matrices and documents, one may be used to inform the other within the analysis, and the model offers the opportunity to predict matrix values (e.g., votes) based only on an associated document (e.g., legislation). The research presented here merges two areas of machine-learning that have previously been investigated separately: incomplete-matrix analysis and topic modeling. The analysis is performed from a Bayesian perspective, with efficient inference constituted via Gibbs sampling. The framework is demonstrated by considering all voting data and available documents (legislation) during the 220-year lifetime of the United States Senate and House of Representatives. 1
Reference: text
sentIndex sentText sentNum sentScore
1 We consider problems for which one has incomplete binary matrices that evolve with time (e.g., [sent-8, score-0.298]
2 the votes of legislators on particular legislation, with each year characterized by a different such matrix). [sent-10, score-0.471]
3 An objective of such analysis is to infer structure and inter-relationships underlying the matrices, here defined by latent features associated with each axis of the matrix. [sent-11, score-0.263]
4 In addition, it is assumed that documents are available for the entities associated with at least one of the matrix axes. [sent-12, score-0.493]
5 By jointly analyzing the matrices and documents, one may be used to inform the other within the analysis, and the model offers the opportunity to predict matrix values (e.g., votes) based only on an associated document (e.g., legislation). [sent-13, score-0.434]
6 The research presented here merges two areas of machine-learning that have previously been investigated separately: incomplete-matrix analysis and topic modeling. [sent-18, score-0.199]
7 The analysis is performed from a Bayesian perspective, with efficient inference constituted via Gibbs sampling. [sent-19, score-0.174]
8 The framework is demonstrated by considering all voting data and available documents (legislation) during the 220-year lifetime of the United States Senate and House of Representatives. [sent-20, score-0.42]
9 1 Introduction. There has been significant recent research on the analysis of incomplete matrices [10, 15, 1, 12, 13, 18]. [sent-21, score-0.251]
10 Most analyses have been performed under the assumption that the matrix is real. [sent-22, score-0.123]
11 There are interesting problems for which the matrices may be binary; for example, reflecting the presence/absence of links between nodes of a graph, or for analysis of data associated with a series of binary questions. [sent-23, score-0.422]
12 One may connect an underlying real matrix to binary (or, more generally, integer) observations via a probit or logistic link function; for example, such analysis has been performed in the context of analyzing legislative roll-call data [6]. [sent-24, score-0.648]
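A minimal sketch of the probit-link idea in sentence 12, assuming NumPy/SciPy; the latent-factor dimensions and names here are illustrative, not the paper's exact construction:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical latent features: y_i for rows (legislators), x_j for columns (legislation).
K, n_rows, n_cols = 5, 20, 30
Y = rng.normal(size=(n_rows, K))
X = rng.normal(size=(n_cols, K))

Z = Y @ X.T             # underlying real matrix
P = norm.cdf(Z)         # probit link: P(B_ij = 1) = Phi(Z_ij)
B = rng.binomial(1, P)  # binary observations (e.g., votes)
```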
13 A problem that has received less attention concerns the analysis of time-evolving matrices. [sent-25, score-0.106]
14 The specific motivation of this paper involves binary questions in a legislative setting; we are interested in analyzing such data over many legislative sessions, and since the legislators change over time, it is undesirable to treat the entire set of votes as a single matrix. [sent-26, score-0.983]
15 Each piece of legislation (question) is unique, but it is desirable to infer inter-relationships and commonalities over time. [sent-27, score-0.709]
16 Similar latent groupings and relationships exist for the legislators. [sent-28, score-0.119]
17 This general setting is also of interest for analysis of more-general social networks [8]. [sent-29, score-0.034]
18 A distinct line of research has focused on analysis of documents, with topic modeling constituting a popular framework [4, 2, 17, 3, 11]. [sent-30, score-0.244]
19 Although the analysis of matrices and documents has heretofore been performed independently, there are many problems for which documents and matrices may be coupled. [sent-31, score-1.033]
20 For example, in addition to a matrix of links between websites or email sender/recipient data, one also has access to the associated documents (website and email content). [sent-32, score-0.676]
21 By analyzing the matrices and documents simultaneously, one may infer inter-relationships about each. [sent-33, score-0.61]
22 For example, in a factor-based model of matrices [8], the associated documents may be used to relate matrix factors to topics/words, providing insight from the documents about the matrix, and vice versa. [sent-34, score-0.932]
23 To the authors’ knowledge, this paper represents the first joint analysis of time-evolving matrices and associated documents. [sent-35, score-0.294]
24 The analysis is performed using nonparametric Bayesian tools; for example, the truncated Dirichlet process [7] is used to jointly cluster latent topics and matrix features. [sent-36, score-0.431]
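Sentence 24's truncated Dirichlet process can be sketched via a truncated stick-breaking (Sethuraman-type) construction; the truncation level and concentration below are placeholder values, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
N_trunc, alpha = 20, 1.0                        # truncation level, DP concentration (assumed)

# v_h ~ Beta(1, alpha); pi_h = v_h * prod_{l<h} (1 - v_l); v_N := 1 so weights sum to one.
v = rng.beta(1.0, alpha, size=N_trunc)
v[-1] = 1.0
pi = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))

cluster = rng.choice(N_trunc, p=pi)             # cluster assignment for one latent feature/topic
```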
25 The framework is demonstrated through analysis of large-scale data sets. [sent-37, score-0.075]
26 Specifically, we consider binary vote matrices from the United States Senate and House of Representatives, from the first congress in 1789 to the present. [sent-38, score-0.454]
27 Documents of the legislation are available for the most recent 20 years, and those are also analyzed jointly with the matrix data. [sent-39, score-0.61]
28 The quantitative predictive performance of this framework is demonstrated, as is the power of this setting for making qualitative assessments of large-scale and complex joint matrix-document data. [sent-40, score-0.039]
29 1 Time-evolving binary matrices. Assume we are given a set of binary matrices, {B^(t)}_{t=1,...,τ}, with B^(t) ∈ {0, 1}^{N_y^(t) × N_x^(t)}. [sent-42, score-0.336]
30 For example, for the legislative roll-call data considered below, time index t corresponds to year, and the number of pieces of legislation and legislators changes with time (e.g., [sent-44, score-1.076]
31 for the historical data considered for the United States congress, the number of states and hence legislators changes as the country has grown). [sent-46, score-0.377]
32 The random effects are drawn from N(0, λ_α^{-1}) and N(0, λ_β^{-1}), with λ_α ∼ µ_α δ_∞ + (1 − µ_α)Gamma(a, b) and λ_β ∼ µ_β δ_∞ + (1 − µ_β)Gamma(a, b); δ_∞ is a point measure at infinity, corresponding to there not being an associated random effect. [sent-48, score-0.154]
33 The probability of there being a random effect is controlled by µ_β and µ_α, each of which is drawn from a beta distribution. [sent-49, score-0.156]
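As a rough illustration of sentences 32–33 (the function name and hyperparameter values are assumptions): drawing λ from µδ_∞ + (1 − µ)Gamma(a, b), where λ = ∞ collapses the N(0, λ^{-1}) random effect to exactly zero:

```python
import numpy as np

rng = np.random.default_rng(2)

def draw_random_effect(mu, a=1.0, b=1.0):
    """Sample one effect under lambda ~ mu*delta_inf + (1 - mu)*Gamma(a, b)."""
    if rng.random() < mu:
        return 0.0                        # lambda = infinity => no random effect
    lam = rng.gamma(a, 1.0 / b)           # Gamma(shape=a, rate=b), i.e., scale = 1/b
    return rng.normal(0.0, 1.0 / np.sqrt(lam))

mu_alpha = rng.beta(1.0, 1.0)             # mu_alpha itself drawn from a beta distribution
alpha_j = draw_random_effect(mu_alpha)
```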
34 In previous political science Bayesian analysis [6], researchers have simply set µ_β = 1 and µ_α = 0, but here we consider the model in a more-general setting and infer these relationships. [sent-51, score-0.197]
35 Additionally, in previous Bayesian analysis [6] the dimensionality of y_i^(t) and x_j^(t) has been set (usually to one or two). [sent-52, score-0.239]
36 In related probabilistic matrix factorization (PMF) applied to real matrices [15, 12], priors/regularizers are employed to constrain the dimensionality of the latent features. [sent-53, score-0.323]
37 Here we employ the sparse binary vector b ∈ {0, 1}^K, with b_k ∼ Bernoulli(π_k) and π_k ∼ Beta(c/K, d(K − 1)/K), for K set to a large integer. [sent-54, score-0.114]
38 Specifically, by integrating out the {π_k}_{k=1,...,K}, one may readily show that the number of non-zero components in b is a random variable drawn from Binomial(K, c/(c + d(K − 1))), and the expected number of ones in b is cK/[c + d(K − 1)]. [sent-56, score-0.068]
39 This is related to a draw from a truncated beta-Bernoulli process [16]. [sent-57, score-0.123]
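A small sketch of the sparse selector vector b from sentences 37–38, with a Monte Carlo check of the closed-form expected number of ones cK/[c + d(K − 1)]; the values of K, c, d are placeholders:

```python
import numpy as np

rng = np.random.default_rng(3)
K, c, d = 100, 2.0, 1.0                        # assumed hyperparameters

pi = rng.beta(c / K, d * (K - 1) / K, size=K)  # pi_k ~ Beta(c/K, d(K-1)/K)
b = rng.binomial(1, pi)                        # b_k ~ Bernoulli(pi_k)

expected_ones = c * K / (c + d * (K - 1))      # closed form from sentence 38
print(b.sum(), "ones drawn; expected about", expected_ones)
```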
40 Specifically, we assume that each row corresponds to a person/entity that may be present for matrix t + 1 and matrix t. [sent-59, score-0.144]
41 It is assumed here that each column corresponds to a question (in the examples, a piece of legislation), and each question is unique. [sent-60, score-0.173]
42 Since the columns are each unique, we assume x_j^(t) = b ◦ x̂_j^(t), with x̂_j^(t) ∼ N(0, γ_x^{-1} I_K) and γ_x ∼ Gamma(e, f), where ◦ denotes the pointwise/Hadamard vector product. [sent-61, score-0.216]
43 If the person/entity associated with the ith row at time t is introduced for the first time, its associated feature vector is similarly drawn: y_i^(t) = b ◦ ŷ_i^(t), with ŷ_i^(t) ∼ N(0, γ_y^{-1} I_K) and γ_y ∼ Gamma(e, f). [sent-62, score-0.639]
44 However, assuming y_i^(t) is already drawn (person/entity i is active prior to time t + 1), a simple auto-regressive model is used to draw y_i^(t+1): y_i^(t+1) = b ◦ ŷ_i^(t+1), with ŷ_i^(t+1) ∼ N(ŷ_i^(t), ξ^{-1} I_K) and ξ ∼ Gamma(g, h). [sent-63, score-0.8]
45 The prior on ξ is set to favor small/smooth changes in the features of an individual across consecutive years. [sent-64, score-0.09]
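Putting sentences 42–45 together, a hedged sketch of drawing a fresh feature vector and evolving it with the auto-regressive step; all hyperparameter values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
K = 10
b = rng.binomial(1, 0.2, size=K)            # shared sparse selector, as above

gamma_y = rng.gamma(1.0, 1.0)               # gamma_y ~ Gamma(e, f), here e = f = 1 (assumed)
xi = rng.gamma(10.0, 1.0)                   # large xi favors smooth year-to-year changes

# New person/entity at time t: y_i^(t) = b o yhat^(t), yhat^(t) ~ N(0, gamma_y^{-1} I_K)
yhat_t = rng.normal(0.0, 1.0 / np.sqrt(gamma_y), size=K)
y_t = b * yhat_t

# Returning person/entity at t+1: yhat^(t+1) ~ N(yhat^(t), xi^{-1} I_K)
yhat_t1 = rng.normal(yhat_t, 1.0 / np.sqrt(xi))
y_t1 = b * yhat_t1                          # Hadamard product with the same selector b
```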
46 This model constitutes a relatively direct extension of existing techniques for real matrices [15, 12]. [sent-65, score-0.174]
47 Specifically, we have introduced a probit link function and a simple auto-regression construction to impose statistical correlation in the traits of a person/entity at consecutive times. [sent-66, score-0.261]
48 The introduction of the random effects α_j and β_i has also not been considered within much of the machine-learning matrix-analysis literature, but the use of α_j is standard in political science Bayesian models [6]. [sent-67, score-0.097]
49 The principal modeling contribution of this paper concerns how one may integrate such a time-evolving binary-matrix model with associated documents. [sent-68, score-0.199]
50 2 Topic model. The manner in which the topic modeling is performed is a generalization of latent Dirichlet allocation (LDA) [4]. [sent-70, score-0.296]
51 Assume that the documents of interest have words drawn from a vocabulary V = {w_1, . . . , w_V}. [sent-71, score-0.368]
52 The kth topic is characterized by a distribution p_k on words (“bag-of-words” assumption), where p_k ∼ Dir(α_V/V, . . . , α_V/V). [sent-75, score-0.382]
53 The generative model draws {p_k}_{k=1,...,T} once for each of the T possible topics. [sent-79, score-0.111]
54 Each document is characterized by a probability distribution on topics, where c_l ∼ Dir(α_T/T, [sent-80, score-0.276]
55 . . . , α_T/T) corresponds to the distribution across T topics for document l. [sent-83, score-0.219]
56 The generative process for drawing words for document l is to first (and once) draw c_l for document l. [sent-84, score-0.456]
57 For word i in document l, we draw a topic z_il ∼ Mult(c_l), and then the specific word is drawn from a multinomial with probability vector p_{z_il}. [sent-85, score-0.47]
58 The above procedure is like standard LDA [4], with the difference manifested in how we handle the Dirichlet distributions Dir(α_V/V, . . . , α_V/V). [sent-86, score-0.048]
59 Specifically, the following hierarchical construction is used for draws from Dir(α_V/V, . . . , α_V/V). [sent-94, score-0.112]
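A compact sketch of the generative steps in sentences 51–57; the vocabulary size, topic count, and concentrations are made-up values, and the paper's hierarchical Dirichlet construction (sentence 59) is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(5)
V, T, n_words = 1000, 20, 50                          # assumed sizes

alpha_V, alpha_T = 1.0, 1.0
p = rng.dirichlet(np.full(V, alpha_V / V), size=T)    # p_k ~ Dir(alpha_V/V, ..., alpha_V/V)
c_l = rng.dirichlet(np.full(T, alpha_T / T))          # topic distribution for document l

z = rng.choice(T, size=n_words, p=c_l)                # z_il ~ Mult(c_l)
words = np.array([rng.choice(V, p=p[k]) for k in z])  # word i ~ Mult(p_{z_il})
```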
wordName wordTfidf (topN-words)
[('legislation', 0.496), ('documents', 0.3), ('legislative', 0.221), ('legislators', 0.221), ('dir', 0.21), ('matrices', 0.174), ('gamma', 0.136), ('votes', 0.133), ('yi', 0.133), ('topic', 0.127), ('vote', 0.124), ('document', 0.12), ('bt', 0.113), ('cl', 0.105), ('piece', 0.105), ('pk', 0.102), ('topics', 0.099), ('senate', 0.097), ('political', 0.097), ('uh', 0.097), ('constituted', 0.089), ('beta', 0.088), ('associated', 0.086), ('dirichlet', 0.085), ('nx', 0.083), ('dunson', 0.083), ('probit', 0.083), ('lda', 0.083), ('ik', 0.081), ('binary', 0.081), ('united', 0.08), ('latent', 0.077), ('ah', 0.075), ('congress', 0.075), ('duke', 0.075), ('house', 0.072), ('concerns', 0.072), ('matrix', 0.072), ('xj', 0.072), ('analyzing', 0.07), ('drawn', 0.068), ('draws', 0.067), ('draw', 0.067), ('infer', 0.066), ('email', 0.066), ('year', 0.066), ('truncated', 0.056), ('consecutive', 0.053), ('characterized', 0.051), ('performed', 0.051), ('manifested', 0.048), ('jorge', 0.048), ('mult', 0.048), ('voted', 0.048), ('links', 0.047), ('construction', 0.045), ('grown', 0.044), ('traits', 0.044), ('pmf', 0.044), ('sethuraman', 0.044), ('country', 0.044), ('carin', 0.044), ('lifetime', 0.044), ('generative', 0.044), ('word', 0.044), ('bayesian', 0.043), ('incomplete', 0.043), ('jointly', 0.042), ('groupings', 0.042), ('commonalities', 0.042), ('inform', 0.042), ('constituting', 0.042), ('sessions', 0.042), ('permitting', 0.042), ('demonstrated', 0.041), ('modeling', 0.041), ('gibbs', 0.04), ('assessments', 0.039), ('wv', 0.039), ('historical', 0.039), ('conjugacy', 0.039), ('websites', 0.039), ('cally', 0.039), ('merges', 0.038), ('changes', 0.037), ('undesirable', 0.036), ('link', 0.036), ('states', 0.036), ('voting', 0.035), ('entities', 0.035), ('pieces', 0.035), ('silva', 0.035), ('analysis', 0.034), ('opportunity', 0.034), ('website', 0.034), ('binomial', 0.034), ('question', 0.034), ('employ', 0.033), ('ny', 0.032), ('xt', 0.032)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999982 131 nips-2010-Joint Analysis of Time-Evolving Binary Matrices and Associated Documents
Author: Eric Wang, Dehong Liu, Jorge Silva, Lawrence Carin, David B. Dunson
Abstract: We consider problems for which one has incomplete binary matrices that evolve with time (e.g., the votes of legislators on particular legislation, with each year characterized by a different such matrix). An objective of such analysis is to infer structure and inter-relationships underlying the matrices, here defined by latent features associated with each axis of the matrix. In addition, it is assumed that documents are available for the entities associated with at least one of the matrix axes. By jointly analyzing the matrices and documents, one may be used to inform the other within the analysis, and the model offers the opportunity to predict matrix values (e.g., votes) based only on an associated document (e.g., legislation). The research presented here merges two areas of machine-learning that have previously been investigated separately: incomplete-matrix analysis and topic modeling. The analysis is performed from a Bayesian perspective, with efficient inference constituted via Gibbs sampling. The framework is demonstrated by considering all voting data and available documents (legislation) during the 220-year lifetime of the United States Senate and House of Representatives. 1
2 0.1932335 286 nips-2010-Word Features for Latent Dirichlet Allocation
Author: James Petterson, Wray Buntine, Shravan M. Narayanamurthy, Tibério S. Caetano, Alex J. Smola
Abstract: We extend Latent Dirichlet Allocation (LDA) by explicitly allowing for the encoding of side information in the distribution over words. This results in a variety of new capabilities, such as improved estimates for infrequently occurring words, as well as the ability to leverage thesauri and dictionaries in order to boost topic cohesion within and across languages. We present experiments on multi-language topic synchronisation where dictionary information is used to bias corresponding words towards similar topics. Results indicate that our model substantially improves topic cohesion when compared to the standard LDA model. 1
3 0.166026 276 nips-2010-Tree-Structured Stick Breaking for Hierarchical Data
Author: Zoubin Ghahramani, Michael I. Jordan, Ryan P. Adams
Abstract: Many data are naturally modeled by an unobserved hierarchical structure. In this paper we propose a flexible nonparametric prior over unknown data hierarchies. The approach uses nested stick-breaking processes to allow for trees of unbounded width and depth, where data can live at any node and are infinitely exchangeable. One can view our model as providing infinite mixtures where the components have a dependency structure corresponding to an evolutionary diffusion down a tree. By using a stick-breaking approach, we can apply Markov chain Monte Carlo methods based on slice sampling to perform Bayesian inference and simulate from the posterior distribution on trees. We apply our method to hierarchical clustering of images and topic modeling of text data. 1
4 0.1603854 194 nips-2010-Online Learning for Latent Dirichlet Allocation
Author: Matthew Hoffman, Francis R. Bach, David M. Blei
Abstract: We develop an online variational Bayes (VB) algorithm for Latent Dirichlet Allocation (LDA). Online LDA is based on online stochastic optimization with a natural gradient step, which we show converges to a local optimum of the VB objective function. It can handily analyze massive document collections, including those arriving in a stream. We study the performance of online LDA in several ways, including by fitting a 100-topic topic model to 3.3M articles from Wikipedia in a single pass. We demonstrate that online LDA finds topic models as good or better than those found with batch VB, and in a fraction of the time. 1
5 0.15995789 150 nips-2010-Learning concept graphs from text with stick-breaking priors
Author: America Chambers, Padhraic Smyth, Mark Steyvers
Abstract: We present a generative probabilistic model for learning general graph structures, which we term concept graphs, from text. Concept graphs provide a visual summary of the thematic content of a collection of documents—a task that is difficult to accomplish using only keyword search. The proposed model can learn different types of concept graph structures and is capable of utilizing partial prior knowledge about graph structure as well as labeled documents. We describe a generative model that is based on a stick-breaking process for graphs, and a Markov Chain Monte Carlo inference procedure. Experiments on simulated data show that the model can recover known graph structure when learning in both unsupervised and semi-supervised modes. We also show that the proposed model is competitive in terms of empirical log likelihood with existing structure-based topic models (hPAM and hLDA) on real-world text data sets. Finally, we illustrate the application of the model to the problem of updating Wikipedia category graphs. 1
6 0.15646471 277 nips-2010-Two-Layer Generalization Analysis for Ranking Using Rademacher Average
7 0.14774552 60 nips-2010-Deterministic Single-Pass Algorithm for LDA
8 0.10410295 51 nips-2010-Construction of Dependent Dirichlet Processes based on Poisson Processes
9 0.069884002 237 nips-2010-Shadow Dirichlet for Restricted Probability Modeling
10 0.067977823 195 nips-2010-Online Learning in The Manifold of Low-Rank Matrices
11 0.062844418 169 nips-2010-More data means less inference: A pseudo-max approach to structured learning
12 0.061270248 49 nips-2010-Computing Marginal Distributions over Continuous Markov Networks for Statistical Relational Learning
13 0.061249316 162 nips-2010-Link Discovery using Graph Feature Tracking
14 0.060987443 108 nips-2010-Graph-Valued Regression
15 0.059891704 257 nips-2010-Structured Determinantal Point Processes
16 0.058663804 55 nips-2010-Cross Species Expression Analysis using a Dirichlet Process Mixture Model with Latent Matchings
17 0.057343334 198 nips-2010-Optimal Web-Scale Tiering as a Flow Problem
18 0.056598064 177 nips-2010-Multitask Learning without Label Correspondences
19 0.05475428 213 nips-2010-Predictive Subspace Learning for Multi-view Data: a Large Margin Approach
20 0.053247862 235 nips-2010-Self-Paced Learning for Latent Variable Models
topicId topicWeight
[(0, 0.156), (1, 0.054), (2, 0.026), (3, 0.012), (4, -0.284), (5, 0.09), (6, 0.158), (7, -0.002), (8, -0.092), (9, -0.048), (10, 0.14), (11, 0.05), (12, 0.024), (13, 0.086), (14, -0.003), (15, 0.027), (16, -0.012), (17, 0.013), (18, -0.037), (19, 0.069), (20, 0.082), (21, -0.004), (22, -0.007), (23, 0.044), (24, -0.034), (25, 0.055), (26, -0.038), (27, -0.046), (28, 0.095), (29, 0.006), (30, -0.039), (31, 0.037), (32, -0.031), (33, 0.051), (34, 0.053), (35, 0.025), (36, 0.066), (37, -0.01), (38, 0.01), (39, 0.024), (40, -0.054), (41, 0.09), (42, -0.085), (43, 0.064), (44, -0.091), (45, -0.015), (46, 0.068), (47, -0.02), (48, -0.03), (49, 0.01)]
simIndex simValue paperId paperTitle
same-paper 1 0.94870263 131 nips-2010-Joint Analysis of Time-Evolving Binary Matrices and Associated Documents
Author: Eric Wang, Dehong Liu, Jorge Silva, Lawrence Carin, David B. Dunson
Abstract: We consider problems for which one has incomplete binary matrices that evolve with time (e.g., the votes of legislators on particular legislation, with each year characterized by a different such matrix). An objective of such analysis is to infer structure and inter-relationships underlying the matrices, here defined by latent features associated with each axis of the matrix. In addition, it is assumed that documents are available for the entities associated with at least one of the matrix axes. By jointly analyzing the matrices and documents, one may be used to inform the other within the analysis, and the model offers the opportunity to predict matrix values (e.g., votes) based only on an associated document (e.g., legislation). The research presented here merges two areas of machine-learning that have previously been investigated separately: incomplete-matrix analysis and topic modeling. The analysis is performed from a Bayesian perspective, with efficient inference constituted via Gibbs sampling. The framework is demonstrated by considering all voting data and available documents (legislation) during the 220-year lifetime of the United States Senate and House of Representatives. 1
2 0.77155024 286 nips-2010-Word Features for Latent Dirichlet Allocation
Author: James Petterson, Wray Buntine, Shravan M. Narayanamurthy, Tibério S. Caetano, Alex J. Smola
Abstract: We extend Latent Dirichlet Allocation (LDA) by explicitly allowing for the encoding of side information in the distribution over words. This results in a variety of new capabilities, such as improved estimates for infrequently occurring words, as well as the ability to leverage thesauri and dictionaries in order to boost topic cohesion within and across languages. We present experiments on multi-language topic synchronisation where dictionary information is used to bias corresponding words towards similar topics. Results indicate that our model substantially improves topic cohesion when compared to the standard LDA model. 1
3 0.73624146 60 nips-2010-Deterministic Single-Pass Algorithm for LDA
Author: Issei Sato, Kenichi Kurihara, Hiroshi Nakagawa
Abstract: We develop a deterministic single-pass algorithm for latent Dirichlet allocation (LDA) in order to process received documents one at a time and then discard them in an excess text stream. Our algorithm does not need to store old statistics for all data. The proposed algorithm is much faster than a batch algorithm and is comparable to the batch algorithm in terms of perplexity in experiments.
4 0.64489371 150 nips-2010-Learning concept graphs from text with stick-breaking priors
Author: America Chambers, Padhraic Smyth, Mark Steyvers
Abstract: We present a generative probabilistic model for learning general graph structures, which we term concept graphs, from text. Concept graphs provide a visual summary of the thematic content of a collection of documents—a task that is difficult to accomplish using only keyword search. The proposed model can learn different types of concept graph structures and is capable of utilizing partial prior knowledge about graph structure as well as labeled documents. We describe a generative model that is based on a stick-breaking process for graphs, and a Markov Chain Monte Carlo inference procedure. Experiments on simulated data show that the model can recover known graph structure when learning in both unsupervised and semi-supervised modes. We also show that the proposed model is competitive in terms of empirical log likelihood with existing structure-based topic models (hPAM and hLDA) on real-world text data sets. Finally, we illustrate the application of the model to the problem of updating Wikipedia category graphs. 1
5 0.62580556 194 nips-2010-Online Learning for Latent Dirichlet Allocation
Author: Matthew Hoffman, Francis R. Bach, David M. Blei
Abstract: We develop an online variational Bayes (VB) algorithm for Latent Dirichlet Allocation (LDA). Online LDA is based on online stochastic optimization with a natural gradient step, which we show converges to a local optimum of the VB objective function. It can handily analyze massive document collections, including those arriving in a stream. We study the performance of online LDA in several ways, including by fitting a 100-topic topic model to 3.3M articles from Wikipedia in a single pass. We demonstrate that online LDA finds topic models as good or better than those found with batch VB, and in a fraction of the time. 1
6 0.62034112 277 nips-2010-Two-Layer Generalization Analysis for Ranking Using Rademacher Average
7 0.57730663 276 nips-2010-Tree-Structured Stick Breaking for Hierarchical Data
8 0.51291901 237 nips-2010-Shadow Dirichlet for Restricted Probability Modeling
9 0.44811976 51 nips-2010-Construction of Dependent Dirichlet Processes based on Poisson Processes
10 0.44460842 198 nips-2010-Optimal Web-Scale Tiering as a Flow Problem
11 0.40910536 55 nips-2010-Cross Species Expression Analysis using a Dirichlet Process Mixture Model with Latent Matchings
12 0.38918775 213 nips-2010-Predictive Subspace Learning for Multi-view Data: a Large Margin Approach
13 0.37882611 215 nips-2010-Probabilistic Deterministic Infinite Automata
14 0.37821364 129 nips-2010-Inter-time segment information sharing for non-homogeneous dynamic Bayesian networks
15 0.36001521 287 nips-2010-Worst-Case Linear Discriminant Analysis
16 0.35113305 120 nips-2010-Improvements to the Sequence Memoizer
17 0.35073048 106 nips-2010-Global Analytic Solution for Variational Bayesian Matrix Factorization
18 0.33577362 2 nips-2010-A Bayesian Approach to Concept Drift
19 0.326233 195 nips-2010-Online Learning in The Manifold of Low-Rank Matrices
20 0.32423466 74 nips-2010-Empirical Bernstein Inequalities for U-Statistics
topicId topicWeight
[(9, 0.272), (13, 0.044), (27, 0.175), (30, 0.044), (35, 0.027), (45, 0.175), (50, 0.04), (52, 0.062), (60, 0.027), (77, 0.011), (90, 0.034)]
simIndex simValue paperId paperTitle
same-paper 1 0.75333208 131 nips-2010-Joint Analysis of Time-Evolving Binary Matrices and Associated Documents
Author: Eric Wang, Dehong Liu, Jorge Silva, Lawrence Carin, David B. Dunson
Abstract: We consider problems for which one has incomplete binary matrices that evolve with time (e.g., the votes of legislators on particular legislation, with each year characterized by a different such matrix). An objective of such analysis is to infer structure and inter-relationships underlying the matrices, here defined by latent features associated with each axis of the matrix. In addition, it is assumed that documents are available for the entities associated with at least one of the matrix axes. By jointly analyzing the matrices and documents, one may be used to inform the other within the analysis, and the model offers the opportunity to predict matrix values (e.g., votes) based only on an associated document (e.g., legislation). The research presented here merges two areas of machine-learning that have previously been investigated separately: incomplete-matrix analysis and topic modeling. The analysis is performed from a Bayesian perspective, with efficient inference constituted via Gibbs sampling. The framework is demonstrated by considering all voting data and available documents (legislation) during the 220-year lifetime of the United States Senate and House of Representatives. 1
2 0.68285996 121 nips-2010-Improving Human Judgments by Decontaminating Sequential Dependencies
Author: Harold Pashler, Matthew Wilder, Robert Lindsey, Matt Jones, Michael C. Mozer, Michael P. Holmes
Abstract: For over half a century, psychologists have been struck by how poor people are at expressing their internal sensations, impressions, and evaluations via rating scales. When individuals make judgments, they are incapable of using an absolute rating scale, and instead rely on reference points from recent experience. This relativity of judgment limits the usefulness of responses provided by individuals to surveys, questionnaires, and evaluation forms. Fortunately, the cognitive processes that transform internal states to responses are not simply noisy, but rather are influenced by recent experience in a lawful manner. We explore techniques to remove sequential dependencies, and thereby decontaminate a series of ratings to obtain more meaningful human judgments. In our formulation, decontamination is fundamentally a problem of inferring latent states (internal sensations) which, because of the relativity of judgment, have temporal dependencies. We propose a decontamination solution using a conditional random field with constraints motivated by psychological theories of relative judgment. Our exploration of decontamination models is supported by two experiments we conducted to obtain ground-truth rating data on a simple length estimation task. Our decontamination techniques yield an over 20% reduction in the error of human judgments. 1
3 0.68272763 266 nips-2010-The Maximal Causes of Natural Scenes are Edge Filters
Author: Jose Puertas, Joerg Bornschein, Joerg Luecke
Abstract: We study the application of a strongly non-linear generative model to image patches. As in standard approaches such as Sparse Coding or Independent Component Analysis, the model assumes a sparse prior with independent hidden variables. However, in the place where standard approaches use the sum to combine basis functions we use the maximum. To derive tractable approximations for parameter estimation we apply a novel approach based on variational Expectation Maximization. The derived learning algorithm can be applied to large-scale problems with hundreds of observed and hidden variables. Furthermore, we can infer all model parameters including observation noise and the degree of sparseness. In applications to image patches we find that Gabor-like basis functions are obtained. Gabor-like functions are thus not a feature exclusive to approaches assuming linear superposition. Quantitatively, the inferred basis functions show a large diversity of shapes with many strongly elongated and many circular symmetric functions. The distribution of basis function shapes reflects properties of simple cell receptive fields that are not reproduced by standard linear approaches. In the study of natural image statistics, the implications of using different superposition assumptions have so far not been investigated systematically because models with strong non-linearities have been found analytically and computationally challenging. The presented algorithm represents the first large-scale application of such an approach. 1
4 0.68196452 75 nips-2010-Empirical Risk Minimization with Approximations of Probabilistic Grammars
Author: Noah A. Smith, Shay B. Cohen
Abstract: Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of the parameters of a fixed probabilistic grammar using the log-loss. We derive sample complexity bounds in this framework that apply both to the supervised setting and the unsupervised setting. 1
5 0.68171203 39 nips-2010-Bayesian Action-Graph Games
Author: Albert X. Jiang, Kevin Leyton-brown
Abstract: Games of incomplete information, or Bayesian games, are an important gametheoretic model and have many applications in economics. We propose Bayesian action-graph games (BAGGs), a novel graphical representation for Bayesian games. BAGGs can represent arbitrary Bayesian games, and furthermore can compactly express Bayesian games exhibiting commonly encountered types of structure including symmetry, action- and type-specific utility independence, and probabilistic independence of type distributions. We provide an algorithm for computing expected utility in BAGGs, and discuss conditions under which the algorithm runs in polynomial time. Bayes-Nash equilibria of BAGGs can be computed by adapting existing algorithms for complete-information normal form games and leveraging our expected utility algorithm. We show both theoretically and empirically that our approaches improve significantly on the state of the art. 1
6 0.68051058 161 nips-2010-Linear readout from a neural population with partial correlation data
7 0.6759721 81 nips-2010-Evaluating neuronal codes for inference using Fisher information
8 0.6707055 21 nips-2010-Accounting for network effects in neuronal responses using L1 regularized point process models
9 0.66865581 60 nips-2010-Deterministic Single-Pass Algorithm for LDA
10 0.66845411 128 nips-2010-Infinite Relational Modeling of Functional Connectivity in Resting State fMRI
11 0.66445547 6 nips-2010-A Discriminative Latent Model of Image Region and Object Tag Correspondence
12 0.66183436 98 nips-2010-Functional form of motion priors in human motion perception
13 0.65787601 194 nips-2010-Online Learning for Latent Dirichlet Allocation
14 0.65641272 44 nips-2010-Brain covariance selection: better individual functional connectivity models using population prior
15 0.65499014 268 nips-2010-The Neural Costs of Optimal Control
16 0.64834923 119 nips-2010-Implicit encoding of prior probabilities in optimal neural populations
17 0.64597523 17 nips-2010-A biologically plausible network for the computation of orientation dominance
18 0.64208919 123 nips-2010-Individualized ROI Optimization via Maximization of Group-wise Consistency of Structural and Functional Profiles
19 0.64067489 97 nips-2010-Functional Geometry Alignment and Localization of Brain Areas
20 0.63917744 56 nips-2010-Deciphering subsampled data: adaptive compressive sampling as a principle of brain communication