
1228 andrew gelman stats-2012-03-25-Continuous variables in Bayesian networks


meta info for this blog

Source: html

Introduction: Antti Rasinen writes: I’m a former undergrad machine learning student and a current software engineer with a Bayesian hobby. Today my two worlds collided. I ask for some enlightenment. On your blog you’ve repeatedly advocated continuous distributions with Bayesian models. Today I read this article by Ricky Ho, who writes: The strength of Bayesian network is it is highly scalable and can learn incrementally because all we do is to count the observed variables and update the probability distribution table. Similar to Neural Network, Bayesian network expects all data to be binary, categorical variable will need to be transformed into multiple binary variable as described above. Numeric variable is generally not a good fit for Bayesian network. The last sentence seems to be at odds with what you’ve said. Sadly, I don’t have enough expertise to say which view of the world is correct. During my undergrad years our team wrote an implementation of the Junction Tree algorithm. We r
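For readers unfamiliar with the table-based learning Ho describes, here is a minimal sketch, assuming fully observed binary variables and a hypothetical two-node network (rain -> wet). The variable names, the data, and the add-one smoothing are all illustrative assumptions, not from the post:

```python
from collections import Counter

# Minimal sketch of the count-and-update scheme Ho describes:
# a two-node network rain -> wet, both variables binary and
# fully observed. Data and names are hypothetical.
data = [(1, 1), (1, 1), (0, 0), (0, 1), (1, 0), (0, 0)]  # (rain, wet)

joint_counts = Counter(data)
rain_counts = Counter(r for r, _ in data)

# Estimate P(rain) and P(wet | rain) by counting, with add-one
# (Laplace/Dirichlet) smoothing so empty table cells stay defined.
p_rain = (rain_counts[1] + 1) / (len(data) + 2)
p_wet_given_rain = {
    r: (joint_counts[(r, 1)] + 1) / (rain_counts[r] + 2)
    for r in (0, 1)
}

print(p_rain)            # learning is incremental: just add to the counts
print(p_wet_given_rain)  # {0: P(wet | no rain), 1: P(wet | rain)}
```

The scalability Ho mentions comes from exactly this: each new observation only increments counts, so nothing is refit. Whether that table representation is forced on you is what the reply below addresses.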


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Antti Rasinen writes: I’m a former undergrad machine learning student and a current software engineer with a Bayesian hobby. [sent-1, score-0.514]

2 On your blog you’ve repeatedly advocated continuous distributions with Bayesian models. [sent-4, score-0.826]

3 Today I read this article by Ricky Ho, who writes: The strength of Bayesian network is it is highly scalable and can learn incrementally because all we do is to count the observed variables and update the probability distribution table. [sent-5, score-0.952]

4 Similar to Neural Network, Bayesian network expects all data to be binary, categorical variable will need to be transformed into multiple binary variable as described above. [sent-6, score-1.002]

5 Numeric variable is generally not a good fit for Bayesian network. [sent-7, score-0.195]

6 The last sentence seems to be at odds with what you’ve said. [sent-8, score-0.134]

7 Sadly, I don’t have enough expertise to say which view of the world is correct. [sent-9, score-0.064]

8 During my undergrad years our team wrote an implementation of the Junction Tree algorithm. [sent-10, score-0.266]

9 We really did not consider continuous variables at all. [sent-11, score-0.683]

10 I know continuous distributions are fine with small hierarchical models, but… How well do continuous distributions work with large graphs? [sent-12, score-1.542]

11 Do you have perhaps a good reference to a known large example of a large Bayesian network with several “numeric variables”? [sent-13, score-0.767]

12 My reply: The term “Bayesian network” is general and includes the possibility of continuous variables. [sent-14, score-0.637]

13 Ho is not wrong, exactly, but he’s only talking about a subset of possible Bayesian models. [sent-15, score-0.071]

14 I disagree with his recommendation to avoid continuous variables but perhaps this is good advice for the particular software he is working with. [sent-16, score-1.023]

15 In answer to the question, I don’t have any experience with large problems. [sent-17, score-0.136]

16 Here’s a small but difficult Bayesian analysis that involved many continuous parameters. [sent-18, score-0.585]

17 Here’s an example with some discrete parameters (which are often thought of as latent data) and some continuous parameters. [sent-19, score-0.804]

18 Even models that seem completely discrete can have continuous parameters representing the probabilities. [sent-20, score-0.866]
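To make that last sentence concrete, here is a minimal sketch (mine, not from the post) of a model whose data are entirely discrete — binary outcomes — but whose parameter is continuous: conjugate Beta-Bernoulli updating of the success probability theta. The data are hypothetical.

```python
import numpy as np
from scipy import stats

# A "completely discrete" model: y_i ~ Bernoulli(theta), y_i in {0, 1}.
# The unknown theta lives on the continuous interval [0, 1], so even an
# all-discrete-data network has a continuous parameter once we infer it.
y = np.array([1, 0, 1, 1, 0, 1, 1, 1])  # hypothetical observed outcomes

# Conjugate update: a Beta(1, 1) prior plus the counts of 1s and 0s
# gives the posterior Beta(1 + sum(y), 1 + sum(1 - y)).
a, b = 1 + y.sum(), 1 + (1 - y).sum()
posterior = stats.beta(a, b)

print(posterior.mean())          # posterior mean of theta
print(posterior.interval(0.95))  # central 95% posterior interval
```

In a larger network, each cell of a conditional probability table can be given such a continuous parameter (e.g., a Dirichlet prior over each row), which is one way the "table-counting" and continuous views coincide.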


similar blogs computed by the tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('continuous', 0.517), ('network', 0.325), ('bayesian', 0.266), ('numeric', 0.228), ('ho', 0.215), ('undergrad', 0.198), ('variables', 0.166), ('distributions', 0.152), ('variable', 0.139), ('large', 0.136), ('binary', 0.128), ('discrete', 0.126), ('incrementally', 0.114), ('software', 0.11), ('expects', 0.099), ('scalable', 0.094), ('sadly', 0.092), ('parameters', 0.092), ('neural', 0.09), ('today', 0.09), ('advocated', 0.088), ('categorical', 0.088), ('engineer', 0.085), ('transformed', 0.084), ('worlds', 0.079), ('strength', 0.075), ('representing', 0.074), ('odds', 0.073), ('tree', 0.072), ('subset', 0.071), ('repeatedly', 0.069), ('latent', 0.069), ('implementation', 0.068), ('small', 0.068), ('possibility', 0.064), ('expertise', 0.064), ('count', 0.063), ('recommendation', 0.063), ('update', 0.062), ('machine', 0.061), ('sentence', 0.061), ('former', 0.06), ('reference', 0.057), ('perhaps', 0.057), ('probabilities', 0.057), ('models', 0.057), ('includes', 0.056), ('good', 0.056), ('disagree', 0.054), ('observed', 0.053)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 1228 andrew gelman stats-2012-03-25-Continuous variables in Bayesian networks


2 0.23294294 2003 andrew gelman stats-2013-08-30-Stan Project: Continuous Relaxations for Discrete MRFs

Introduction: Hamiltonian Monte Carlo (HMC), as used by Stan, is only defined for continuous parameters. We’d love to be able to do discrete sampling. So I was excited when I saw this: Yichuan Zhang, Charles Sutton, Amos J Storkey, and Zoubin Ghahramani. 2012. Continuous Relaxations for Discrete Hamiltonian Monte Carlo. NIPS 25. Abstract: Continuous relaxations play an important role in discrete optimization, but have not seen much use in approximate probabilistic inference. Here we show that a general form of the Gaussian Integral Trick makes it possible to transform a wide class of discrete variable undirected models into fully continuous systems. The continuous representation allows the use of gradient-based Hamiltonian Monte Carlo for inference, results in new ways of estimating normalization constants (partition functions), and in general opens up a number of new avenues for inference in difficult discrete systems. We demonstrate some of these continuous relaxation inference a

3 0.1993551 653 andrew gelman stats-2011-04-08-Multilevel regression with shrinkage for “fixed” effects

Introduction: Dean Eckles writes: I remember reading on your blog that you were working on some tools to fit multilevel models that also include “fixed” effects — such as continuous predictors — that are also estimated with shrinkage (for example, an L1 or L2 penalty). Any new developments on this front? I often find myself wanting to fit a multilevel model to some data, but also needing to include a number of “fixed” effects, mainly continuous variables. This makes me wary of overfitting to these predictors, so then I’d want to use some kind of shrinkage. As far as I can tell, the main options for doing this now is by going fully Bayesian and using a Gibbs sampler. With MCMCglmm or BUGS/JAGS I could just specify a prior on the fixed effects that corresponds to a desired penalty. However, this is pretty slow, especially with a large data set and because I’d like to select the penalty parameter by cross-validation (which is where this isn’t very Bayesian I guess?). My reply: We allow info

4 0.18456388 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

Introduction: Konrad Scheffler writes: I was interested by your paper “Induction and deduction in Bayesian data analysis” and was wondering if you would entertain a few questions: – Under the banner of objective Bayesianism, I would posit something like this as a description of Bayesian inference: “Objective Bayesian probability is not a degree of belief (which would necessarily be subjective) but a measure of the plausibility of a hypothesis, conditional on a formally specified information state. One way of specifying a formal information state is to specify a model, which involves specifying both a prior distribution (typically for a set of unobserved variables) and a likelihood function (typically for a set of observed variables, conditioned on the values of the unobserved variables). Bayesian inference involves calculating the objective degree of plausibility of a hypothesis (typically the truth value of the hypothesis is a function of the variables mentioned above) given such a

5 0.17580295 1529 andrew gelman stats-2012-10-11-Bayesian brains?

Introduction: Psychology researcher Alison Gopnik discusses the idea that some of the systematic problems with human reasoning can be explained by systematic flaws in the statistical models we implicitly use. I really like this idea and I’ll return to it in a bit. But first I need to discuss a minor (but, I think, ultimately crucial) disagreement I have with how Gopnik describes Bayesian inference. She writes: The Bayesian idea is simple, but it turns out to be very powerful. It’s so powerful, in fact, that computer scientists are using it to design intelligent learning machines, and more and more psychologists think that it might explain human intelligence. Bayesian inference is a way to use statistical data to evaluate hypotheses and make predictions. These might be scientific hypotheses and predictions or everyday ones. So far, so good. Next comes the problem (as I see it). Gopnik writes: Here’s a simple bit of Bayesian election thinking. In early September, the polls suddenly im

6 0.16781141 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

7 0.16657914 1431 andrew gelman stats-2012-07-27-Overfitting

8 0.16629958 151 andrew gelman stats-2010-07-16-Wanted: Probability distributions for rank orderings

9 0.14870778 1554 andrew gelman stats-2012-10-31-It not necessary that Bayesian methods conform to the likelihood principle

10 0.14779903 811 andrew gelman stats-2011-07-20-Kind of Bayesian

11 0.14350338 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis

12 0.13613802 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics

13 0.13457155 183 andrew gelman stats-2010-08-04-Bayesian models for simultaneous equation systems?

14 0.13316624 1438 andrew gelman stats-2012-07-31-What is a Bayesian?

15 0.13100176 1900 andrew gelman stats-2013-06-15-Exploratory multilevel analysis when group-level variables are of importance

16 0.12970936 904 andrew gelman stats-2011-09-13-My wikipedia edit

17 0.12889259 1615 andrew gelman stats-2012-12-10-A defense of Tom Wolfe based on the impossibility of the law of small numbers in network structure

18 0.12793446 1779 andrew gelman stats-2013-03-27-“Two Dogmas of Strong Objective Bayesianism”

19 0.12783709 1412 andrew gelman stats-2012-07-10-More questions on the contagion of obesity, height, etc.

20 0.12058605 2368 andrew gelman stats-2014-06-11-Bayes in the research conversation


similar blogs computed by the lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.181), (1, 0.166), (2, -0.045), (3, 0.054), (4, -0.028), (5, 0.031), (6, -0.024), (7, 0.017), (8, 0.041), (9, 0.004), (10, 0.013), (11, -0.028), (12, 0.002), (13, 0.011), (14, 0.047), (15, 0.055), (16, 0.048), (17, -0.002), (18, 0.003), (19, 0.045), (20, -0.02), (21, 0.094), (22, 0.009), (23, -0.015), (24, 0.01), (25, -0.027), (26, 0.024), (27, -0.015), (28, 0.004), (29, -0.042), (30, 0.035), (31, 0.041), (32, 0.007), (33, 0.029), (34, 0.034), (35, -0.014), (36, -0.002), (37, 0.056), (38, -0.043), (39, -0.019), (40, -0.002), (41, -0.033), (42, -0.004), (43, 0.064), (44, -0.02), (45, 0.002), (46, 0.039), (47, 0.082), (48, 0.018), (49, 0.059)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97557193 1228 andrew gelman stats-2012-03-25-Continuous variables in Bayesian networks


2 0.80039382 1438 andrew gelman stats-2012-07-31-What is a Bayesian?

Introduction: Deborah Mayo recommended that I consider coming up with a new name for the statistical methods that I used, given that the term “Bayesian” has all sorts of associations that I dislike (as discussed, for example, in section 1 of this article). I replied that I agree on Bayesian, I never liked the term and always wanted something better, but I couldn’t think of any convenient alternative. Also, I was finding that Bayesians (even the Bayesians I disagreed with) were reading my research articles, while non-Bayesians were simply ignoring them. So I thought it was best to identify with, and communicate with, those people who were willing to engage with me. More formally, I’m happy defining “Bayesian” as “using inference from the posterior distribution, p(theta|y)”. This says nothing about where the probability distributions come from (thus, no requirement to be “subjective” or “objective”) and it says nothing about the models (thus, no requirement to use the discrete models that hav

3 0.78710717 449 andrew gelman stats-2010-12-04-Generalized Method of Moments, whatever that is

Introduction: Xuequn Hu writes: I am an econ doctoral student, trying to do some empirical work using Bayesian methods. Recently I read a paper (and its discussion) that pitches Bayesian methods against GMM (Generalized Method of Moments), which is quite popular in econometrics for frequentists. I am wondering if you can, here or on your blog, give some insights about these two methods, from the perspective of a Bayesian statistician. I know GMM does not conform to the likelihood principle, but Bayesians are often charged with strong distribution assumptions. I can’t actually help on this, since I don’t know what GMM is. My guess is that, like other methods that don’t explicitly use prior estimation, this method will work well if sufficient information is included as data. Which would imply a hierarchical structure.

4 0.77808541 117 andrew gelman stats-2010-06-29-Ya don’t know Bayes, Jack

Introduction: I came across this article on the philosophy of statistics by University of Michigan economist John DiNardo. I don’t have much to say about the substance of the article because most of it is an argument against something called “Bayesian methods” that doesn’t have much in common with the Bayesian data analysis that I do. If a quantitative, empirically-minded economist at a top university doesn’t know about modern Bayesian methods, then it’s a pretty good guess that confusion holds in many other quarters as well, so I thought I’d try to clear a couple of things up. (See also here.) In the short term, I know I have some readers at the University of Michigan, so maybe a couple of you could go over to Prof. DiNardo’s office and discuss this with him? For the rest of you, please spread the word. My point here is not to claim that DiNardo should be using Bayesian methods or to claim that he’s doing anything wrong in his applied work. It’s just that he’s fighting against a bu

5 0.77465481 2293 andrew gelman stats-2014-04-16-Looking for Bayesian expertise in India, for the purpose of analysis of sarcoma trials

Introduction: Prakash Nayak writes: I work as a musculoskeletal oncologist (surgeon) in Mumbai, India and am keen on sarcoma research. Sarcomas are rare disorders, and conventional frequentist analysis falls short of providing meaningful results for clinical application. I am thus keen on applying Bayesian analysis to a lot of trials performed with small numbers in this field. I need advice from you for a good starting point for someone uninitiated in Bayesian analysis. What to read, what courses to take and is there a way I could collaborate with any local/international statisticians dealing with these methods. I have attached a recent publication [Optimal timing of pulmonary metastasectomy – is a delayed operation beneficial or counterproductive?, by M. Kruger, J. D. Schmitto, B. Wiegmannn, T. K. Rajab, and A. Haverich] which is one amongst others I understand would benefit from some Bayesian analyses. I have no idea who in India works in this area so I’m just putting this one out

6 0.76213509 2368 andrew gelman stats-2014-06-11-Bayes in the research conversation

7 0.76162946 2254 andrew gelman stats-2014-03-18-Those wacky anti-Bayesians used to be intimidating, but now they’re just pathetic

8 0.7615307 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics

9 0.75925171 183 andrew gelman stats-2010-08-04-Bayesian models for simultaneous equation systems?

10 0.75512969 1712 andrew gelman stats-2013-02-07-Philosophy and the practice of Bayesian statistics (with all the discussions!)

11 0.75292414 1719 andrew gelman stats-2013-02-11-Why waste time philosophizing?

12 0.75054842 1497 andrew gelman stats-2012-09-15-Our blog makes connections!

13 0.749879 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo

14 0.73992044 1779 andrew gelman stats-2013-03-27-“Two Dogmas of Strong Objective Bayesianism”

15 0.73644871 1529 andrew gelman stats-2012-10-11-Bayesian brains?

16 0.73622227 231 andrew gelman stats-2010-08-24-Yet another Bayesian job opportunity

17 0.73462874 2182 andrew gelman stats-2014-01-22-Spell-checking example demonstrates key aspects of Bayesian data analysis

18 0.73390698 1469 andrew gelman stats-2012-08-25-Ways of knowing

19 0.73275262 2273 andrew gelman stats-2014-03-29-References (with code) for Bayesian hierarchical (multilevel) modeling and structural equation modeling

20 0.73245114 1182 andrew gelman stats-2012-02-24-Untangling the Jeffreys-Lindley paradox


similar blogs computed by the lda model

lda for this blog:

topicId topicWeight

[(9, 0.02), (15, 0.024), (16, 0.029), (17, 0.085), (21, 0.051), (24, 0.174), (35, 0.01), (48, 0.012), (57, 0.017), (62, 0.013), (65, 0.012), (71, 0.011), (86, 0.044), (99, 0.381)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.9892996 309 andrew gelman stats-2010-10-01-Why Development Economics Needs Theory?

Introduction: Robert Neumann writes: in the JEP 24(3), page18, Daron Acemoglu states: Why Development Economics Needs Theory There is no general agreement on how much we should rely on economic theory in motivating empirical work and whether we should try to formulate and estimate “structural parameters.” I (Acemoglu) argue that the answer is largely “yes” because otherwise econometric estimates would lack external validity, in which case they can neither inform us about whether a particular model or theory is a useful approximation to reality, nor would they be useful in providing us guidance on what the effects of similar shocks and policies would be in different circumstances or if implemented in different scales. I therefore define “structural parameters” as those that provide external validity and would thus be useful in testing theories or in policy analysis beyond the specific environment and sample from which they are derived. External validity becomes a particularly challenging t

2 0.9882009 1136 andrew gelman stats-2012-01-23-Fight! (also a bit of reminiscence at the end)

Introduction: Martin Lindquist and Michael Sobel published a fun little article in Neuroimage on models and assumptions for causal inference with intermediate outcomes. As their subtitle indicates (“A response to the comments on our comment”), this is a topic of some controversy. Lindquist and Sobel write: Our original comment (Lindquist and Sobel, 2011) made explicit the types of assumptions neuroimaging researchers are making when directed graphical models (DGMs), which include certain types of structural equation models (SEMs), are used to estimate causal effects. When these assumptions, which many researchers are not aware of, are not met, parameters of these models should not be interpreted as effects. . . . [Judea] Pearl does not disagree with anything we stated. However, he takes exception to our use of potential outcomes notation, which is the standard notation used in the statistical literature on causal inference, and his comment is devoted to promoting his alternative conventions. [C

same-blog 3 0.98216987 1228 andrew gelman stats-2012-03-25-Continuous variables in Bayesian networks


4 0.98181617 2314 andrew gelman stats-2014-05-01-Heller, Heller, and Gorfine on univariate and multivariate information measures

Introduction: Malka Gorfine writes: We noticed that the important topic of association measures and tests came up again in your blog, and we have few comments in this regard. It is useful to distinguish between the univariate and multivariate methods. A consistent multivariate method can recognise dependence between two vectors of random variables, while a univariate method can only loop over pairs of components and check for dependency between them. There are very few consistent multivariate methods. To the best of our knowledge there are three practical methods: 1) HSIC by Gretton et al. (http://www.gatsby.ucl.ac.uk/~gretton/papers/GreBouSmoSch05.pdf) 2) dcov by Szekely et al. (http://projecteuclid.org/euclid.aoas/1267453933) 3) A method we introduced in Heller et al (Biometrika, 2013, 503—510, http://biomet.oxfordjournals.org/content/early/2012/12/04/biomet.ass070.full.pdf+html, and an R package, HHG, is available as well http://cran.r-project.org/web/packages/HHG/index.html). A

5 0.98125267 2359 andrew gelman stats-2014-06-04-All the Assumptions That Are My Life

Introduction: Statisticians take tours in other people’s data. All methods of statistical inference rest on statistical models. Experiments typically have problems with compliance, measurement error, generalizability to the real world, and representativeness of the sample. Surveys typically have problems of undercoverage, nonresponse, and measurement error. Real surveys are done to learn about the general population. But real surveys are not random samples. For another example, consider educational tests: what are they exactly measuring? Nobody knows. Medical research: even if it’s a randomized experiment, the participants in the study won’t be a random sample from the population for whom you’d recommend treatment. You don’t need random sampling to generalize the results of a medical experiment to the general population but you need some substantive theory to make the assumption that effects in your nonrepresentative sample of people will be similar to effects in the population of interest. Ve

6 0.97995949 1230 andrew gelman stats-2012-03-26-Further thoughts on nonparametric correlation measures

7 0.97890472 397 andrew gelman stats-2010-11-06-Multilevel quantile regression

8 0.97668153 1076 andrew gelman stats-2011-12-21-Derman, Rodrik and the nature of statistical models

9 0.97564685 2136 andrew gelman stats-2013-12-16-Whither the “bet on sparsity principle” in a nonsparse world?

10 0.9751687 1616 andrew gelman stats-2012-12-10-John McAfee is a Heinlein hero

11 0.97283638 2295 andrew gelman stats-2014-04-18-One-tailed or two-tailed?

12 0.9726119 1746 andrew gelman stats-2013-03-02-Fishing for cherries

13 0.97183979 2142 andrew gelman stats-2013-12-21-Chasing the noise

14 0.97179449 259 andrew gelman stats-2010-09-06-Inbox zero. Really.

15 0.97150779 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging

16 0.97086722 1950 andrew gelman stats-2013-07-22-My talks that were scheduled for Tues at the Data Skeptics meetup and Wed at the Open Statistical Programming meetup

17 0.97082049 1467 andrew gelman stats-2012-08-23-The pinch-hitter syndrome again

18 0.97072649 1605 andrew gelman stats-2012-12-04-Write This Book

19 0.97065818 1502 andrew gelman stats-2012-09-19-Scalability in education

20 0.97039127 662 andrew gelman stats-2011-04-15-Bayesian statistical pragmatism