nips nips2012 nips2012-52 knowledge-graph by maker-knowledge-mining

52 nips-2012-Bayesian Nonparametric Modeling of Suicide Attempts


Source: pdf

Author: Francisco Ruiz, Isabel Valera, Carlos Blanco, Fernando Pérez-Cruz

Abstract: The National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) database contains a large amount of information, regarding the way of life, medical conditions, etc., of a representative sample of the U.S. population. In this paper, we are interested in seeking the hidden causes behind the suicide attempts, for which we propose to model the subjects using a nonparametric latent model based on the Indian Buffet Process (IBP). Due to the nature of the data, we need to adapt the observation model for discrete random variables. We propose a generative model in which the observations are drawn from a multinomial-logit distribution given the IBP matrix. The implementation of an efficient Gibbs sampler is accomplished using the Laplace approximation, which allows integrating out the weighting factors of the multinomial-logit likelihood model. Finally, the experiments over the NESARC database show that our model properly captures some of the hidden causes that model suicide attempts. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 The National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) database contains a large amount of information regarding the way of life, medical conditions, etc. [sent-11, score-0.083]

2 In this paper, we are interested in seeking the hidden causes behind the suicide attempts, for which we propose to model the subjects using a nonparametric latent model based on the Indian Buffet Process (IBP). [sent-15, score-0.786]

3 Due to the nature of the data, we need to adapt the observation model for discrete random variables. [sent-16, score-0.075]

4 The implementation of an efficient Gibbs sampler is accomplished using the Laplace approximation, which allows integrating out the weighting factors of the multinomial-logit likelihood model. [sent-18, score-0.074]

5 Finally, the experiments over the NESARC database show that our model properly captures some of the hidden causes that model suicide attempts. [sent-19, score-0.711]

6 … where suicide prevention is one of the top public health priorities [1]. [sent-22, score-0.755]

7 The current strategies for suicide prevention have focused mainly on both the detection and treatment of mental disorders [13], and on the treatment of the suicidal behaviors themselves [4]. [sent-23, score-0.888]

8 However, despite prevention efforts including improvements in the treatment of depression, the lifetime prevalence of suicide attempts in the U.S. … [sent-24, score-0.855]

9 This suggests that there is a need to improve understanding of the risk factors for suicide attempts beyond psychiatric disorders, particularly in non-clinical populations. [sent-27, score-0.715]

10 According to the National Strategy for Suicide Prevention, an important first step in a public health approach to suicide prevention is to identify those at increased risk for suicide attempts [1]. [sent-28, score-1.47]

11 Suicide attempts are, by far, the best predictor of completed suicide [12] and are also associated with major morbidity themselves [11]. [sent-29, score-0.656]

12 The estimation of suicide attempt risk is a challenging and complex task, with multiple risk factors linked to increased risk. [sent-30, score-0.78]

13 In the absence of reliable tools for identifying those at risk for suicide attempts, be they clinical or laboratory tests, risk detection still relies mainly on clinical variables. [sent-31, score-0.717]

14 The adequacy of the current predictive models and screening methods has been questioned [12], and it has been suggested that the methods currently used for research on suicide risk factors and prediction models need revamping [9]. [sent-32, score-0.658]

15 Databases that model the behavior of human populations typically present many related questions, and analyzing each of them individually, or a small group of them, does not lead to conclusive results. [sent-33, score-0.14]

16 … population, with nearly 3,000 questions regarding, among others, their way of life, their medical conditions, depression, and other mental disorders. [sent-36, score-0.288]

17 It contains yes-or-no questions, as well as some multiple-choice questions and questions with ordinal answers. [sent-37, score-0.14]

18 In this paper, we propose to model the subjects in this database using a nonparametric latent model that allows us to seek hidden causes and to compress the immensely redundant information into a few features. [sent-38, score-0.283]

19 Our starting point is the Indian Buffet Process (IBP) [5], because it allows us to infer which latent features influence the observations and how many features there are. [sent-39, score-0.229]

20 We need to adapt the observation model for discrete random variables, as the discrete nature of the database does not allow us to use the standard Gaussian observation model. [sent-40, score-0.193]

21 Furthermore, the multinomial-logit model, besides its versatility, allows the implementation of an efficient Gibbs sampler where the Laplace approximation [10] is used to integrate out the weighting factors, which can be efficiently computed using the Matrix Inversion Lemma. [sent-42, score-0.101]

22 The IBP model combined with discrete observations has already been tackled in several related works. [sent-43, score-0.074]

23 They apply the ICD to focused topic modeling, where the instances are documents and the observations are words from a finite vocabulary, and focus on decoupling the prevalence of a topic in a document and its prevalence in all documents. [sent-45, score-0.176]

24 Despite the discrete nature of the observations under this model, these assumptions are not appropriate for categorical observations such as the set of possible responses to the questions in the NESARC database. [sent-46, score-0.254]

25 In this model, each (discrete) component in the observation vector of an instance depends only on one of the active latent features of that object, randomly drawn from a multinomial distribution. [sent-48, score-0.208]

26 Our model is more flexible in the sense that it allows different probability distributions for every component in the observation vector, which is accomplished by weighting differently the latent variables. [sent-50, score-0.149]

27 2 The Indian Buffet Process: In latent feature modeling, each object can be represented by a vector of latent features, and the observations are generated from a distribution determined by those latent feature values. [sent-51, score-0.379]

28 Typically, we have access to the set of observations and the main goal of these models is to find out the latent variables that represent the data. [sent-52, score-0.123]

29 The most common nonparametric tool for latent feature modeling is the Indian Buffet Process (IBP). [sent-53, score-0.128]

30 The nth row of Z, denoted by zn·, represents the vector of latent features of the nth data point, and every entry (n, k) is denoted by znk. [sent-61, score-0.449]

31 Note that each element znk ∈ {0, 1} indicates whether the kth feature contributes to the nth data point. [sent-62, score-0.275]

32 Given a binary latent feature matrix Z, we assume that the N × D observation matrix X, where the nth row contains a D-dimensional observation vector xn· , is distributed according to a probability distribution p(X|Z). [sent-63, score-0.312]

33 Additionally, x·d stands for the dth column of X, and each element of the matrix is denoted by xnd. [sent-64, score-0.206]

34 MCMC (Markov Chain Monte Carlo) methods have been broadly applied to infer the latent structure Z from a given observation matrix X (see, e. [sent-66, score-0.155]

35 In particular, we focus on the use of Gibbs sampling for posterior inference over the latent variables. [sent-69, score-0.116]

36 The algorithm iteratively samples the value of each element znk given the remaining variables, i.e., [sent-70, score-0.19]

37 it samples from p(znk = 1 | X, Z¬nk) ∝ p(X | Z) p(znk = 1 | Z¬nk), (1) where Z¬nk denotes all the entries of Z other than znk. [sent-72, score-0.19]

38 The distribution p(znk = 1 | Z¬nk) can be readily derived from the exchangeable IBP and can be written as p(znk = 1 | Z¬nk) = m−n,k / N, where m−n,k is the number of data points with feature k, not including n, i.e., m−n,k = Σn′≠n zn′k. [sent-73, score-0.08]
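
As a concrete illustration of this sampling step, below is a minimal numpy sketch of the collapsed Gibbs update in Eq. (1). The likelihood callback log_lik is an assumed placeholder (in the paper this term is evaluated through the Laplace approximation described later), and all names are illustrative rather than the authors' code.

```python
import numpy as np

def sample_znk(Z, n, k, log_lik, rng=np.random.default_rng()):
    """One collapsed Gibbs step for z_{nk} (Eq. 1); a minimal sketch.

    log_lik(Z) is an assumed placeholder returning log p(X | Z); it is
    not part of the paper's code.
    """
    N = Z.shape[0]
    m = Z[:, k].sum() - Z[n, k]        # m_{-n,k}: other points with feature k
    if m == 0:
        Z[n, k] = 0                    # singleton features are handled by a
        return Z                       # separate new-feature step (not shown)
    log_post = np.empty(2)
    for v in (0, 1):
        Z[n, k] = v
        prior = m / N if v == 1 else 1.0 - m / N   # exchangeable IBP prior
        log_post[v] = np.log(prior) + log_lik(Z)
    p1 = 1.0 / (1.0 + np.exp(log_post[0] - log_post[1]))  # normalize in log space
    Z[n, k] = int(rng.random() < p1)
    return Z
```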

39 xnd ∈ {1, . . . , Rd}, where this finite set contains the indexes of all the possible values of xnd. [sent-81, score-0.146]

40 We introduce matrices Bd of size K × R to model the probability distribution over X, such that Bd links the hidden latent variables with the dth column of the observation matrix X. [sent-83, score-0.226]

41 We assume that the probability of xnd taking value r (r = 1, . . . , Rd) is given by the multinomial logit [sent-84, score-0.146]

42 π^r_nd = p(xnd = r | zn·, Bd) = exp(zn· bd·r) / Σ_{r′=1}^{Rd} exp(zn· bd·r′), (2) where bd·r denotes the rth column of Bd. [sent-89, score-1.497]
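
A minimal numpy sketch of Eq. (2) follows; zn and Bd are illustrative names, and the max-subtraction is a standard numerical-stability trick that the text does not mention.

```python
import numpy as np

def category_probs(zn, Bd):
    """Multinomial-logit probabilities of Eq. (2); a minimal sketch.

    zn : (K,) binary latent feature vector z_{n.} of one subject.
    Bd : (K, Rd) weights linking the features to question d's Rd answers.
    Returns the vector (pi^1_nd, ..., pi^Rd_nd).
    """
    logits = zn @ Bd                  # z_{n.} bd.r for every answer r
    logits = logits - logits.max()    # stabilize the softmax numerically
    w = np.exp(logits)
    return w / w.sum()
```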

43 Note that the matrices Bd are used to weight differently the contribution of every latent feature for every component d, similarly to the standard Gaussian observation model in [5]. [sent-90, score-0.169]

44 We assume that the mixing vectors bd·r are Gaussian distributed with zero mean and covariance matrix Σb = σ²B I. [sent-91, score-0.53]

45 We consider that elements xnd are independent given the latent feature matrix Z and the D matrices Bd . [sent-97, score-0.305]

46 p(X | Z, B1, . . . , BD) = ∏_{n=1}^{N} ∏_{d=1}^{D} p(xnd | zn·, Bd). (3) Laplace approximation for inference: In Section 2, the (heuristic) Gibbs sampling algorithm for posterior inference over the latent variables of the IBP has been reviewed; it is detailed in [5]. [sent-103, score-0.116]

47 Recall that our model assumes independence among the observations given the hidden latent variables. [sent-108, score-0.165]

48 By defining (ρd)kr = Σ_{n=1}^{N} znk π^r_nd, the gradient of ψ(Bd) can be derived as ∇ψ = Md − ρd − (1/σ²B) Bd. [sent-124, score-0.19]

49 The Hessian matrix can now be readily computed by taking the derivatives of the gradient, yielding ∇∇ψ = −(1/σ²B) I_RK + ∇∇ log p(x·d | βd, Z) = −(1/σ²B) I_RK − Σ_{n=1}^{N} (diag(πnd) − (πnd)ᵀ πnd) ⊗ (zn·ᵀ zn·), (7) where πnd = (π^1_nd, π^2_nd, . . . [sent-128, score-0.166]

50 . . . , π^R_nd), and diag(πnd) is a diagonal matrix with the values of the vector πnd as its diagonal elements.
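
The gradient and Hessian above translate directly into numpy. The following is a minimal sketch under the notation of the text, not the authors' implementation; sigma2_b stands for σ²B, and the answers are assumed encoded as integers 0, ..., R−1.

```python
import numpy as np

def grad_and_hessian(Z, xd, Bd, sigma2_b):
    """Gradient and Hessian of psi(Bd) for the Laplace step; a sketch.

    Z : (N, K) binary features, xd : (N,) answers in {0, ..., R-1},
    Bd : (K, R) weights, sigma2_b : prior variance of the columns of Bd.
    """
    N, K = Z.shape
    R = Bd.shape[1]
    logits = Z @ Bd
    logits -= logits.max(axis=1, keepdims=True)
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)          # pi^r_nd for every n and r
    M = Z.T @ np.eye(R)[xd]                    # (Md)kr = sum_n znk 1[xnd = r]
    rho = Z.T @ P                              # (rho_d)kr = sum_n znk pi^r_nd
    grad = M - rho - Bd / sigma2_b             # gradient of psi(Bd)
    H = -np.eye(R * K) / sigma2_b              # -(1/sigma2_b) I_RK
    for n in range(N):
        S = np.diag(P[n]) - np.outer(P[n], P[n])
        H -= np.kron(S, np.outer(Z[n], Z[n]))  # (diag(pi) - pi'pi) x (z'z), Eq. (7)
    return grad, H
```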

51 … from Eq. (9) by replacing K by K+, Z by the submatrix containing only the non-zero columns of Z, and BdMAP by the submatrix containing the K+ corresponding rows. [sent-141, score-0.095]

52 Since vn vnᵀ is a rank-one matrix, we can apply the Woodbury identity [18] N times to invert the matrix −∇∇ψ, similarly to the RLS (Recursive Least Squares) updates [7]. [sent-148, score-0.225]

53 For n = 1, . . . , N, we compute (D(n))−1 = (D(n−1) − vn vnᵀ)−1 = (D(n−1))−1 + (D(n−1))−1 vn vnᵀ (D(n−1))−1 / (1 − vnᵀ (D(n−1))−1 vn). (12) [sent-152, score-0.388]

54 For the first iteration, we define D(0) as the block-diagonal matrix D, whose inverse involves computing the R matrix inversions of size K+ × K+ of the matrices in (11), which can be efficiently solved by applying the Matrix Inversion Lemma. [sent-153, score-0.287]
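
As a sketch, the recursion in Eq. (12) is a sequence of Sherman-Morrison (rank-one Woodbury) downdates. D0_inv and V below are illustrative names; forming (D(0))−1 from the block-diagonal matrix D is assumed to be done separately, as described above.

```python
import numpy as np

def rank_one_downdates(D0_inv, V):
    """Invert D - sum_n vn vn^T by N Sherman-Morrison steps (Eq. 12)."""
    Dinv = D0_inv.copy()
    for v in V:                       # V is (N, RK); each row is one vn
        Dv = Dinv @ v
        Dinv += np.outer(Dv, Dv) / (1.0 - v @ Dv)   # rank-one downdate
    return Dinv
```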

55 We also multiply each pixel independently by equiprobable binary noise; hence each white pixel in the composite image can be turned black 50% of the time, while black pixels always remain black. [sent-161, score-0.162]
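
This multiplicative noise can be reproduced in a couple of lines; img below is an illustrative binary composite image (1 = white, 0 = black), not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 2, size=(6, 6))              # placeholder composite image
noisy = img * rng.integers(0, 2, size=img.shape)   # each white pixel turns black w.p. 1/2
```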

56 The Gibbs sampler has been initialized with K+ = 2, setting each znk = 1 with probability 1/2, and the hyperparameters have been set to α = 0.… [sent-164, score-0.239]

57 After 200 iterations, the Gibbs sampler returns four latent features. [sent-166, score-0.132]

58 As expected, the black pixels are known to be black (almost zero probability of being white) and the white pixels have about a 50/50 chance of being black or white, due to the multiplicative noise. [sent-169, score-0.128]

59 The Gibbs sampler has used as many as nine hidden features, but after iteration 60, the first four features represent the base images and the others just lock on a noise pattern, which eventually fades away. [sent-170, score-0.144]

60 National Epidemiologic Survey on Alcohol and Related Conditions (NESARC): The NESARC was designed to determine the magnitude of alcohol use disorders and their associated disabilities. [sent-172, score-0.154]

61 Two waves of interviews have been fielded for this survey (first wave in 2001-2002 and second wave in 2004-2005). [sent-173, score-0.151]

62 (b) Probability of each pixel being white, when a single feature is active (ordered to match the images on the left), computed using Bd . [sent-180, score-0.106]

63 (d) Probabilities of each pixel being white after 200 iterations of the Gibbs sampler inferred for the four data points on (c). [sent-183, score-0.129]

64 (e) The number of latent features K+ and (f) the approximate log of p(X|Z) over the 200 iterations of the Gibbs sampler. [sent-185, score-0.136]

65 The survey includes a question about having attempted suicide as well as other related questions such as ‘felt like wanted to die’ and ‘thought a lot about own death’. [sent-187, score-0.827]

66 In the present paper, we use the IBP with discrete observations for a preliminary study in seeking the latent causes which lead to committing suicide. [sent-188, score-0.211]

67 Most of the questions in the survey (over 2,500) are yes-or-no questions, which have four possible outcomes: ‘blank’ (B), ‘unknown’ (U), ‘yes’ (Y) and ‘no’ (N). [sent-189, score-0.175]

68 If a question is left blank, the question was not asked¹. [sent-190, score-0.13]

69 If a question is marked unknown, either it was not answered or the answer was unknown to the respondent. [sent-191, score-0.087]

70 In our ongoing study, we want to find a latent model that describes this database and can be used to infer patterns of behavior and, specifically, be able to predict suicide. [sent-192, score-0.126]

71 In this paper, we build an unsupervised model with the 20 variables that present the highest mutual information with the suicide attempt question, which are shown in Table 1 together with their code in the questionnaire. [sent-193, score-0.662]
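
A minimal sketch of this selection step with scikit-learn's mutual_info_score follows; the variable names and the integer encoding of the four answer categories (B/U/Y/N) are assumptions, not the authors' pipeline.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def top_k_by_mutual_information(answers, target, k=20):
    """Indexes of the k variables with highest MI with the target question.

    answers : (N, D) integer-coded categorical answers (e.g., B/U/Y/N -> 0..3).
    target  : (N,) integer-coded answers to the suicide attempt question.
    """
    mi = np.array([mutual_info_score(target, answers[:, d])
                   for d in range(answers.shape[1])])
    return np.argsort(mi)[::-1][:k]            # top-k variable indexes
```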

72 We run the Gibbs sampler over 500 randomly chosen subjects out of the 13,670 that have answered affirmatively to having had a period of low mood. [sent-194, score-0.142]

73 We have initialized the sampler with one active feature, i.e., [sent-196, score-0.08]

74 K+ = 1, and have set znk = 1 randomly with probability 0.… [sent-198, score-0.19]

75 In Figure 2, we have plotted the posterior probability for each question when a single feature is active. [sent-201, score-0.107]

76 In these plots, white means 0 and black 1, and each row sums up to one. [sent-202, score-0.076]

77 Feature 1 is active for modeling the ‘blank’ and ‘no’ answers and, fundamentally, those who were not asked Questions 8 and 10. [sent-203, score-0.08]

78 Feature 3 models blank answers for most of the questions and negative responses to 1, 2, 5, 8 and 10, which are questions related to suicide. [sent-205, score-0.401]

79 Feature 5 models the ‘yes’ answer to Questions 3, 4, 6, 7, 8, … (Footnote 1: In a questionnaire of this size, some questions are not asked when a previous question was answered in a predetermined way, to reduce the burden of taking the survey.) [sent-207, score-0.227]

80 For example, if a person has never had a period of low mood, the suicide attempt question is not asked. [sent-208, score-0.691]

81 Feature 7 models answering affirmatively to Questions 15, 16, 19 and 20, which are related to alcohol abuse. [sent-211, score-0.133]

82 We show the percentage of respondents that answered positively to the suicide attempt questions in Table 2, independently for the 500 samples that were used to learn the IBP and the 9,500 hold-out samples, together with the total number of respondents. [sent-212, score-0.86]

83 A dash indicates that the feature can be active or inactive. [sent-213, score-0.076]

84 Throughout the database, the prevalence of suicide attempt is 7.… [sent-216, score-0.706]

85 As expected, Features 2, 4, 5 and 7 favor suicide attempt risk, although Feature 5 only mildly, and Features 1, 3 and 6 decrease the probability of attempting suicide. [sent-218, score-0.7]

86 From the above description of each feature, it is clear that having Features 4 or 7 active should increase the risk of attempting suicide, while having Features 3 and 1 active should cause the opposite effect. [sent-219, score-0.159]

87 The other combinations favor an increased rate of suicide attempts that goes from doubling (‘11’) to quadrupling (‘00’), to a ten-fold increase (‘01’), and the percentages of population with these features are, respectively, 21%, 6% and 3%. [sent-221, score-0.742]

88 In the final part of Table 2, we show combinations of features that significantly increase the suicide attempt rate for a reduced percentage of the population, as well as combinations of features that significantly decrease the suicide attempt rate for a large chunk of the population. [sent-222, score-1.43]

89 These results are interesting as they can be used to discard significant portions of the population in suicide attempt studies and focus on the groups that present much higher risk. [sent-223, score-0.695]

90 Hence, our IBP with discrete observations is able to obtain features that describe the hidden structure of the NESARC database and makes it possible to pinpoint the people who have a higher risk of attempting suicide. [sent-224, score-0.335]

91 We have applied our model to the NESARC database to find out the hidden features that characterize the suicide attempt risk. [sent-229, score-0.8]

92 [Table 2 residue: each row lists a combination of the seven hidden features (1 = active, 0 = inactive, ‘-’ = either) together with the suicide attempt probability for the train and hold-out ensembles; the probability values did not survive extraction.]

93 The ‘train ensemble’ columns contain the results for the 500 data points used to obtain the model, whereas the ‘hold-out ensemble’ columns contain the results for the remaining subjects. [sent-264, score-0.078]

94 These probabilities have been obtained with the posterior mean weights BdMAP, when only one of the seven latent features (sorted from left to right to match the order in Table 2) is active. [sent-266, score-0.195]

95 We have analyzed how each of the seven inferred features contributes to the suicide attempt probability. [sent-267, score-0.741]

96 References: [1] Summary of National Strategy for Suicide Prevention: Goals and Objectives for Action, 2007. [sent-273, score-0.599]

97 Cognitive therapy for the prevention of suicide attempts: a randomized controlled trial. [sent-302, score-0.726]

98 Trends in suicide ideation, plans, gestures, and attempts in the United States, 1990-1992 to 2001-2003. [sent-327, score-0.656]

99 The struggle to prevent and evaluate: application of population attributable risk and preventive fraction to suicide prevention research. [sent-332, score-0.818]

100 A suicide prevention program in a region with a very high suicide rate. [sent-384, score-1.325]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('suicide', 0.599), ('bd', 0.499), ('ibp', 0.222), ('znk', 0.19), ('nesarc', 0.166), ('xnd', 0.146), ('questions', 0.14), ('prevention', 0.127), ('alcohol', 0.108), ('vn', 0.097), ('zn', 0.089), ('latent', 0.083), ('blank', 0.072), ('mood', 0.066), ('attempt', 0.063), ('gibbs', 0.061), ('risk', 0.059), ('answered', 0.058), ('wave', 0.058), ('attempts', 0.057), ('laplace', 0.053), ('features', 0.053), ('buffet', 0.05), ('white', 0.05), ('bmap', 0.05), ('epidemiologic', 0.05), ('irk', 0.05), ('answers', 0.049), ('sampler', 0.049), ('depression', 0.048), ('disorders', 0.046), ('madrid', 0.046), ('indian', 0.046), ('feature', 0.045), ('prevalence', 0.044), ('emergency', 0.044), ('felt', 0.044), ('database', 0.043), ('yes', 0.043), ('nk', 0.043), ('hidden', 0.042), ('observation', 0.041), ('nth', 0.04), ('observations', 0.04), ('medical', 0.04), ('columns', 0.039), ('attempting', 0.038), ('carlos', 0.037), ('feel', 0.036), ('months', 0.036), ('map', 0.035), ('survey', 0.035), ('subjects', 0.035), ('readily', 0.035), ('went', 0.035), ('fernando', 0.035), ('md', 0.034), ('discrete', 0.034), ('drank', 0.033), ('dysthymia', 0.033), ('icd', 0.033), ('overnight', 0.033), ('rihmer', 0.033), ('rmatively', 0.033), ('ruiz', 0.033), ('suicidal', 0.033), ('valera', 0.033), ('population', 0.033), ('posterior', 0.033), ('matrix', 0.031), ('determinant', 0.031), ('active', 0.031), ('pixel', 0.03), ('dth', 0.029), ('isabel', 0.029), ('stayed', 0.029), ('question', 0.029), ('public', 0.029), ('kr', 0.029), ('treatment', 0.028), ('submatrix', 0.028), ('integrate', 0.027), ('psychiatry', 0.027), ('committing', 0.027), ('die', 0.027), ('mental', 0.027), ('causes', 0.027), ('dirichlet', 0.027), ('inversion', 0.026), ('black', 0.026), ('seven', 0.026), ('people', 0.026), ('weighting', 0.025), ('answering', 0.025), ('nd', 0.025), ('began', 0.024), ('mann', 0.024), ('wanted', 0.024), ('topic', 0.024), ('hessian', 0.024)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000007 52 nips-2012-Bayesian Nonparametric Modeling of Suicide Attempts

Author: Francisco Ruiz, Isabel Valera, Carlos Blanco, Fernando Pérez-Cruz

Abstract: The National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) database contains a large amount of information, regarding the way of life, medical conditions, etc., of a representative sample of the U.S. population. In this paper, we are interested in seeking the hidden causes behind the suicide attempts, for which we propose to model the subjects using a nonparametric latent model based on the Indian Buffet Process (IBP). Due to the nature of the data, we need to adapt the observation model for discrete random variables. We propose a generative model in which the observations are drawn from a multinomial-logit distribution given the IBP matrix. The implementation of an efficient Gibbs sampler is accomplished using the Laplace approximation, which allows integrating out the weighting factors of the multinomial-logit likelihood model. Finally, the experiments over the NESARC database show that our model properly captures some of the hidden causes that model suicide attempts. 1

2 0.17799577 136 nips-2012-Forward-Backward Activation Algorithm for Hierarchical Hidden Markov Models

Author: Kei Wakabayashi, Takao Miura

Abstract: Hierarchical Hidden Markov Models (HHMMs) are sophisticated stochastic models that enable us to capture a hierarchical context characterization of sequence data. However, existing HHMM parameter estimation methods require large computations of time complexity O(T N 2D ) at least for model inference, where D is the depth of the hierarchy, N is the number of states in each level, and T is the sequence length. In this paper, we propose a new inference method of HHMMs for which the time complexity is O(T N D+1 ). A key idea of our algorithm is application of the forward-backward algorithm to state activation probabilities. The notion of a state activation, which offers a simple formalization of the hierarchical transition behavior of HHMMs, enables us to conduct model inference efficiently. We present some experiments to demonstrate that our proposed method works more efficiently to estimate HHMM parameters than do some existing methods such as the flattening method and Gibbs sampling method. 1

3 0.10091404 7 nips-2012-A Divide-and-Conquer Method for Sparse Inverse Covariance Estimation

Author: Cho-jui Hsieh, Arindam Banerjee, Inderjit S. Dhillon, Pradeep K. Ravikumar

Abstract: We consider the composite log-determinant optimization problem, arising from the 1 regularized Gaussian maximum likelihood estimator of a sparse inverse covariance matrix, in a high-dimensional setting with a very large number of variables. Recent work has shown this estimator to have strong statistical guarantees in recovering the true structure of the sparse inverse covariance matrix, or alternatively the underlying graph structure of the corresponding Gaussian Markov Random Field, even in very high-dimensional regimes with a limited number of samples. In this paper, we are concerned with the computational cost in solving the above optimization problem. Our proposed algorithm partitions the problem into smaller sub-problems, and uses the solutions of the sub-problems to build a good approximation for the original problem. Our key idea for the divide step to obtain a sub-problem partition is as follows: we first derive a tractable bound on the quality of the approximate solution obtained from solving the corresponding sub-divided problems. Based on this bound, we propose a clustering algorithm that attempts to minimize this bound, in order to find effective partitions of the variables. For the conquer step, we use the approximate solution, i.e., solution resulting from solving the sub-problems, as an initial point to solve the original problem, and thereby achieve a much faster computational procedure. 1

4 0.075874202 59 nips-2012-Bayesian nonparametric models for bipartite graphs

Author: Francois Caron

Abstract: We develop a novel Bayesian nonparametric model for random bipartite graphs. The model is based on the theory of completely random measures and is able to handle a potentially infinite number of nodes. We show that the model has appealing properties and in particular it may exhibit a power-law behavior. We derive a posterior characterization, a generative process for network growth, and a simple Gibbs sampler for posterior simulation. Our model is shown to be well fitted to several real-world social networks. 1

5 0.069541343 166 nips-2012-Joint Modeling of a Matrix with Associated Text via Latent Binary Features

Author: Xianxing Zhang, Lawrence Carin

Abstract: A new methodology is developed for joint analysis of a matrix and accompanying documents, with the documents associated with the matrix rows/columns. The documents are modeled with a focused topic model, inferring interpretable latent binary features for each document. A new matrix decomposition is developed, with latent binary features associated with the rows/columns, and with imposition of a low-rank constraint. The matrix decomposition and topic model are coupled by sharing the latent binary feature vectors associated with each. The model is applied to roll-call data, with the associated documents defined by the legislation. Advantages of the proposed model are demonstrated for prediction of votes on a new piece of legislation, based only on the observed text of legislation. The coupling of the text and legislation is also shown to yield insight into the properties of the matrix decomposition for roll-call data. 1

6 0.069174871 154 nips-2012-How They Vote: Issue-Adjusted Models of Legislative Behavior

7 0.060485817 246 nips-2012-Nonparametric Max-Margin Matrix Factorization for Collaborative Prediction

8 0.056445476 349 nips-2012-Training sparse natural image models with a fast Gibbs sampler of an extended state space

9 0.053266805 316 nips-2012-Small-Variance Asymptotics for Exponential Family Dirichlet Process Mixture Models

10 0.050164003 127 nips-2012-Fast Bayesian Inference for Non-Conjugate Gaussian Process Regression

11 0.048933804 142 nips-2012-Generalization Bounds for Domain Adaptation

12 0.046705708 274 nips-2012-Priors for Diversity in Generative Latent Variable Models

13 0.046271 124 nips-2012-Factorial LDA: Sparse Multi-Dimensional Text Models

14 0.046069339 19 nips-2012-A Spectral Algorithm for Latent Dirichlet Allocation

15 0.045115188 355 nips-2012-Truncation-free Online Variational Inference for Bayesian Nonparametric Models

16 0.044137534 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs

17 0.043613367 266 nips-2012-Patient Risk Stratification for Hospital-Associated C. diff as a Time-Series Classification Task

18 0.042828582 82 nips-2012-Continuous Relaxations for Discrete Hamiltonian Monte Carlo

19 0.042771667 138 nips-2012-Fully Bayesian inference for neural models with negative-binomial spiking

20 0.04270206 126 nips-2012-FastEx: Hash Clustering with Exponential Families


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.124), (1, 0.03), (2, -0.021), (3, -0.002), (4, -0.082), (5, -0.024), (6, 0.006), (7, 0.019), (8, 0.046), (9, -0.026), (10, 0.02), (11, 0.037), (12, -0.028), (13, -0.002), (14, 0.016), (15, -0.021), (16, 0.001), (17, -0.02), (18, 0.009), (19, -0.025), (20, 0.023), (21, 0.025), (22, -0.051), (23, -0.007), (24, 0.025), (25, -0.016), (26, 0.047), (27, -0.027), (28, -0.026), (29, -0.024), (30, -0.035), (31, 0.089), (32, -0.052), (33, 0.004), (34, 0.008), (35, -0.014), (36, -0.128), (37, -0.018), (38, -0.016), (39, -0.034), (40, 0.014), (41, -0.002), (42, -0.007), (43, -0.098), (44, 0.057), (45, 0.087), (46, 0.051), (47, -0.002), (48, 0.055), (49, 0.05)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.88407099 52 nips-2012-Bayesian Nonparametric Modeling of Suicide Attempts

Author: Francisco Ruiz, Isabel Valera, Carlos Blanco, Fernando Pérez-Cruz

Abstract: The National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) database contains a large amount of information, regarding the way of life, medical conditions, etc., of a representative sample of the U.S. population. In this paper, we are interested in seeking the hidden causes behind the suicide attempts, for which we propose to model the subjects using a nonparametric latent model based on the Indian Buffet Process (IBP). Due to the nature of the data, we need to adapt the observation model for discrete random variables. We propose a generative model in which the observations are drawn from a multinomial-logit distribution given the IBP matrix. The implementation of an efficient Gibbs sampler is accomplished using the Laplace approximation, which allows integrating out the weighting factors of the multinomial-logit likelihood model. Finally, the experiments over the NESARC database show that our model properly captures some of the hidden causes that model suicide attempts. 1

2 0.66975796 154 nips-2012-How They Vote: Issue-Adjusted Models of Legislative Behavior

Author: Sean Gerrish, David M. Blei

Abstract: We develop a probabilistic model of legislative data that uses the text of the bills to uncover lawmakers’ positions on specific political issues. Our model can be used to explore how a lawmaker’s voting patterns deviate from what is expected and how that deviation depends on what is being voted on. We derive approximate posterior inference algorithms based on variational methods. Across 12 years of legislative data, we demonstrate both improvement in heldout predictive performance and the model’s utility in interpreting an inherently multi-dimensional space. 1

3 0.63222837 136 nips-2012-Forward-Backward Activation Algorithm for Hierarchical Hidden Markov Models

Author: Kei Wakabayashi, Takao Miura

Abstract: Hierarchical Hidden Markov Models (HHMMs) are sophisticated stochastic models that enable us to capture a hierarchical context characterization of sequence data. However, existing HHMM parameter estimation methods require large computations of time complexity O(T N 2D ) at least for model inference, where D is the depth of the hierarchy, N is the number of states in each level, and T is the sequence length. In this paper, we propose a new inference method of HHMMs for which the time complexity is O(T N D+1 ). A key idea of our algorithm is application of the forward-backward algorithm to state activation probabilities. The notion of a state activation, which offers a simple formalization of the hierarchical transition behavior of HHMMs, enables us to conduct model inference efficiently. We present some experiments to demonstrate that our proposed method works more efficiently to estimate HHMM parameters than do some existing methods such as the flattening method and Gibbs sampling method. 1

4 0.62954319 166 nips-2012-Joint Modeling of a Matrix with Associated Text via Latent Binary Features

Author: Xianxing Zhang, Lawrence Carin

Abstract: A new methodology is developed for joint analysis of a matrix and accompanying documents, with the documents associated with the matrix rows/columns. The documents are modeled with a focused topic model, inferring interpretable latent binary features for each document. A new matrix decomposition is developed, with latent binary features associated with the rows/columns, and with imposition of a low-rank constraint. The matrix decomposition and topic model are coupled by sharing the latent binary feature vectors associated with each. The model is applied to roll-call data, with the associated documents defined by the legislation. Advantages of the proposed model are demonstrated for prediction of votes on a new piece of legislation, based only on the observed text of legislation. The coupling of the text and legislation is also shown to yield insight into the properties of the matrix decomposition for roll-call data. 1

5 0.59852004 349 nips-2012-Training sparse natural image models with a fast Gibbs sampler of an extended state space

Author: Lucas Theis, Jascha Sohl-dickstein, Matthias Bethge

Abstract: We present a new learning strategy based on an efficient blocked Gibbs sampler for sparse overcomplete linear models. Particular emphasis is placed on statistical image modeling, where overcomplete models have played an important role in discovering sparse representations. Our Gibbs sampler is faster than general purpose sampling schemes while also requiring no tuning as it is free of parameters. Using the Gibbs sampler and a persistent variant of expectation maximization, we are able to extract highly sparse distributions over latent sources from data. When applied to natural images, our algorithm learns source distributions which resemble spike-and-slab distributions. We evaluate the likelihood and quantitatively compare the performance of the overcomplete linear model to its complete counterpart as well as a product of experts model, which represents another overcomplete generalization of the complete linear model. In contrast to previous claims, we find that overcomplete representations lead to significant improvements, but that the overcomplete linear model still underperforms other models. 1

6 0.56736261 220 nips-2012-Monte Carlo Methods for Maximum Margin Supervised Topic Models

7 0.52461421 246 nips-2012-Nonparametric Max-Margin Matrix Factorization for Collaborative Prediction

8 0.52179521 111 nips-2012-Efficient Sampling for Bipartite Matching Problems

9 0.50274897 82 nips-2012-Continuous Relaxations for Discrete Hamiltonian Monte Carlo

10 0.49642161 107 nips-2012-Effective Split-Merge Monte Carlo Methods for Nonparametric Models of Sequential Data

11 0.49406227 54 nips-2012-Bayesian Probabilistic Co-Subspace Addition

12 0.493141 105 nips-2012-Dynamic Pruning of Factor Graphs for Maximum Marginal Prediction

13 0.49083757 26 nips-2012-A nonparametric variable clustering model

14 0.49012539 345 nips-2012-Topic-Partitioned Multinetwork Embeddings

15 0.48514569 274 nips-2012-Priors for Diversity in Generative Latent Variable Models

16 0.47848818 103 nips-2012-Distributed Probabilistic Learning for Camera Networks with Missing Data

17 0.46420029 365 nips-2012-Why MCA? Nonlinear sparse coding with spike-and-slab prior for neurally plausible image encoding

18 0.46140563 137 nips-2012-From Deformations to Parts: Motion-based Segmentation of 3D Objects

19 0.45690721 192 nips-2012-Learning the Dependency Structure of Latent Factors

20 0.4513917 124 nips-2012-Factorial LDA: Sparse Multi-Dimensional Text Models


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.044), (21, 0.026), (38, 0.073), (39, 0.022), (42, 0.016), (54, 0.02), (55, 0.45), (74, 0.039), (76, 0.098), (80, 0.07), (92, 0.04)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.79339141 52 nips-2012-Bayesian Nonparametric Modeling of Suicide Attempts

Author: Francisco Ruiz, Isabel Valera, Carlos Blanco, Fernando Pérez-Cruz

Abstract: The National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) database contains a large amount of information, regarding the way of life, medical conditions, etc., of a representative sample of the U.S. population. In this paper, we are interested in seeking the hidden causes behind the suicide attempts, for which we propose to model the subjects using a nonparametric latent model based on the Indian Buffet Process (IBP). Due to the nature of the data, we need to adapt the observation model for discrete random variables. We propose a generative model in which the observations are drawn from a multinomial-logit distribution given the IBP matrix. The implementation of an efficient Gibbs sampler is accomplished using the Laplace approximation, which allows integrating out the weighting factors of the multinomial-logit likelihood model. Finally, the experiments over the NESARC database show that our model properly captures some of the hidden causes that model suicide attempts. 1

2 0.79324579 158 nips-2012-ImageNet Classification with Deep Convolutional Neural Networks

Author: Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton

Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called “dropout” that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry. 1

3 0.78862435 340 nips-2012-The representer theorem for Hilbert spaces: a necessary and sufficient condition

Author: Francesco Dinuzzo, Bernhard Schölkopf

Abstract: The representer theorem is a property that lies at the foundation of regularization theory and kernel methods. A class of regularization functionals is said to admit a linear representer theorem if every member of the class admits minimizers that lie in the finite dimensional subspace spanned by the representers of the data. A recent characterization states that certain classes of regularization functionals with differentiable regularization term admit a linear representer theorem for any choice of the data if and only if the regularization term is a radial nondecreasing function. In this paper, we extend such result by weakening the assumptions on the regularization term. In particular, the main result of this paper implies that, for a sufficiently large family of regularization functionals, radial nondecreasing functions are the only lower semicontinuous regularization terms that guarantee existence of a representer theorem for any choice of the data. 1

4 0.77508295 155 nips-2012-Human memory search as a random walk in a semantic network

Author: Joseph L. Austerweil, Joshua T. Abbott, Thomas L. Griffiths

Abstract: The human mind has a remarkable ability to store a vast amount of information in memory, and an even more remarkable ability to retrieve these experiences when needed. Understanding the representations and algorithms that underlie human memory search could potentially be useful in other information retrieval settings, including internet search. Psychological studies have revealed clear regularities in how people search their memory, with clusters of semantically related items tending to be retrieved together. These findings have recently been taken as evidence that human memory search is similar to animals foraging for food in patchy environments, with people making a rational decision to switch away from a cluster of related information as it becomes depleted. We demonstrate that the results that were taken as evidence for this account also emerge from a random walk on a semantic network, much like the random web surfer model used in internet search engines. This offers a simpler and more unified account of how people search their memory, postulating a single process rather than one process for exploring a cluster and one process for switching between clusters. 1

5 0.70787293 211 nips-2012-Meta-Gaussian Information Bottleneck

Author: Melanie Rey, Volker Roth

Abstract: We present a reformulation of the information bottleneck (IB) problem in terms of copula, using the equivalence between mutual information and negative copula entropy. Focusing on the Gaussian copula we extend the analytical IB solution available for the multivariate Gaussian case to distributions with a Gaussian dependence structure but arbitrary marginal densities, also called meta-Gaussian distributions. This opens new possibles applications of IB to continuous data and provides a solution more robust to outliers. 1

6 0.62588233 215 nips-2012-Minimizing Uncertainty in Pipelines

7 0.60909492 95 nips-2012-Density-Difference Estimation

8 0.4994165 306 nips-2012-Semantic Kernel Forests from Multiple Taxonomies

9 0.48662382 91 nips-2012-Deep Neural Networks Segment Neuronal Membranes in Electron Microscopy Images

10 0.47557655 193 nips-2012-Learning to Align from Scratch

11 0.47048619 238 nips-2012-Neurally Plausible Reinforcement Learning of Working Memory Tasks

12 0.47017914 231 nips-2012-Multiple Operator-valued Kernel Learning

13 0.4684459 90 nips-2012-Deep Learning of Invariant Features via Simulated Fixations in Video

14 0.45335791 298 nips-2012-Scalable Inference of Overlapping Communities

15 0.44798055 210 nips-2012-Memorability of Image Regions

16 0.44547945 188 nips-2012-Learning from Distributions via Support Measure Machines

17 0.4393653 93 nips-2012-Deep Spatio-Temporal Architectures and Learning for Protein Structure Prediction

18 0.43662348 345 nips-2012-Topic-Partitioned Multinetwork Embeddings

19 0.43569523 4 nips-2012-A Better Way to Pretrain Deep Boltzmann Machines

20 0.42781508 266 nips-2012-Patient Risk Stratification for Hospital-Associated C. diff as a Time-Series Classification Task