andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1401 knowledge-graph by maker-knowledge-mining

1401 andrew gelman stats-2012-06-30-David Hogg on statistics


meta info for this blog

Source: html

Introduction: Data analysis recipes: Fitting a model to data: We go through the many considerations involved in fitting a model to data, using as an example the fit of a straight line to a set of points in a two-dimensional plane. Standard weighted least-squares fitting is only appropriate when there is a dimension along which the data points have negligible uncertainties, and another along which all the uncertainties can be described by Gaussians of known variance; these conditions are rarely met in practice. We consider cases of general, heterogeneous, and arbitrarily covariant two-dimensional uncertainties, and situations in which there are bad data (large outliers), unknown uncertainties, and unknown but expected intrinsic scatter in the linear relationship being fit. Above all we emphasize the importance of having a “generative model” for the data, even an approximate one. Once there is a generative model, the subsequent fitting is non-arbitrary because the model permits direct computation of the likelihood of the parameters or the posterior probability distribution.
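The abstract's central point, that a generative model makes fitting non-arbitrary by permitting direct computation of the likelihood, can be sketched concretely. The toy Python below (illustrative only, not Hogg's code) assumes the simplest case described above: x known exactly and independent Gaussian uncertainties of known variance on y.

```python
import numpy as np

def line_loglike(m, b, x, y, sigma_y):
    """Log-likelihood of the straight line y = m*x + b under independent
    Gaussian uncertainties sigma_y on y (x assumed exact)."""
    resid = y - (m * x + b)
    return -0.5 * np.sum((resid / sigma_y) ** 2
                         + np.log(2 * np.pi * sigma_y ** 2))

# Toy data lying exactly on y = 2x + 1, with known per-point sigma.
x = np.array([0.0, 1.0, 2.0, 3.0])
sigma_y = np.full_like(x, 0.5)
y = 2 * x + 1

# The true line scores strictly better than a badly wrong line, which
# is what makes the fit "non-arbitrary": candidate parameters are
# compared by a quantity the generative model itself defines.
```

Maximizing this quantity over (m, b) recovers weighted least squares; the paper's more general cases (outliers, intrinsic scatter, covariant errors) change the generative model and hence the likelihood, not the logic.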


Summary: the most important sentences, generated by the tf-idf model

sentIndex sentText sentNum sentScore

1 Data analysis recipes: Fitting a model to data: We go through the many considerations involved in fitting a model to data, using as an example the fit of a straight line to a set of points in a two-dimensional plane. [sent-1, score-0.787]

2 Standard weighted least-squares fitting is only appropriate when there is a dimension along which the data points have negligible uncertainties, and another along which all the uncertainties can be described by Gaussians of known variance; these conditions are rarely met in practice. [sent-2, score-1.255]

3 We consider cases of general, heterogeneous, and arbitrarily covariant two-dimensional uncertainties, and situations in which there are bad data (large outliers), unknown uncertainties, and unknown but expected intrinsic scatter in the linear relationship being fit. [sent-3, score-0.88]

4 Above all we emphasize the importance of having a “generative model” for the data, even an approximate one. [sent-4, score-0.071]

5 Once there is a generative model, the subsequent fitting is non-arbitrary because the model permits direct computation of the likelihood of the parameters or the posterior probability distribution. [sent-5, score-1.311]

6 Construction of a posterior probability distribution is indispensable if there are “nuisance parameters” to marginalize away. [sent-6, score-0.613]

7 Data analysis recipes: Probability calculus for inference: In this pedagogical text aimed at those wanting to start thinking about or brush up on probabilistic inference, I review the rules by which probability distribution functions can (and cannot) be combined. [sent-7, score-1.379]

8 I connect these rules to the operations performed in probabilistic data analysis. [sent-8, score-0.653]

9 Dimensional analysis is emphasized as a valuable tool for helping to construct non-wrong probabilistic statements. [sent-9, score-0.623]

10 The applications of probability calculus in constructing likelihoods, marginalized likelihoods, posterior probabilities, and posterior predictions are all discussed. [sent-10, score-1.001]
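The marginalized likelihoods mentioned in sentence 10 can be illustrated numerically. In this toy sketch (the density and all names are invented for illustration, not taken from the papers), a nuisance parameter phi is integrated out of a joint density on a grid:

```python
import numpy as np

def joint(theta, phi):
    # Toy unnormalized joint density p(data | theta, phi), peaked at
    # theta = 1, phi = 0. Purely illustrative.
    return np.exp(-0.5 * ((theta - 1.0) ** 2 + phi ** 2))

phi_grid = np.linspace(-5.0, 5.0, 2001)
dphi = phi_grid[1] - phi_grid[0]

def marginal(theta):
    # Marginalized likelihood: integrate the nuisance parameter phi out
    # of the joint, here by a simple Riemann sum over the grid.
    return float(np.sum(joint(theta, phi_grid)) * dphi)
```

Here the phi factor integrates to sqrt(2*pi), so marginal(theta) is proportional to a Gaussian in theta alone, which is exactly the "marginalize away" operation the abstract describes, just done by brute force rather than analytically.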


similar blogs computed by the tf-idf model

tf-idf for this blog:

wordName wordTfidf (topN-words)

[('uncertainties', 0.378), ('probabilistic', 0.234), ('fitting', 0.233), ('recipes', 0.225), ('posterior', 0.205), ('probability', 0.197), ('generative', 0.189), ('calculus', 0.183), ('likelihoods', 0.183), ('unknown', 0.168), ('data', 0.135), ('intrinsic', 0.125), ('brush', 0.125), ('gaussians', 0.125), ('indispensible', 0.125), ('marginalized', 0.125), ('model', 0.123), ('rules', 0.119), ('pedagogical', 0.112), ('negligible', 0.112), ('scatter', 0.108), ('permits', 0.105), ('heterogeneous', 0.105), ('nuisance', 0.105), ('arbitrarily', 0.105), ('parameters', 0.1), ('dimensional', 0.098), ('considerations', 0.093), ('outliers', 0.09), ('operations', 0.089), ('aimed', 0.088), ('subsequent', 0.088), ('distribution', 0.086), ('constructing', 0.086), ('weighted', 0.084), ('construction', 0.083), ('emphasized', 0.082), ('along', 0.082), ('analysis', 0.08), ('construct', 0.079), ('inference', 0.079), ('dimension', 0.078), ('helping', 0.077), ('wanting', 0.076), ('connect', 0.076), ('rarely', 0.071), ('approximate', 0.071), ('valuable', 0.071), ('computation', 0.071), ('situations', 0.071)]
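The word scores above come from some tf-idf weighting. One common variant (the mining tool's exact normalization is not documented, so treat this as a generic sketch) scores each word by its in-document frequency times the log inverse document frequency:

```python
import math
from collections import Counter

def tfidf(docs):
    """Per-document word scores: term frequency times log inverse
    document frequency. One common tf-idf variant among several."""
    n = len(docs)
    df = Counter()                      # in how many docs each word occurs
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({w: (c / len(doc)) * math.log(n / df[w])
                       for w, c in tf.items()})
    return scores

docs = [["fitting", "model", "data", "data"],
        ["model", "checking"],
        ["data", "analysis"]]
s = tfidf(docs)
# "fitting" occurs in only one document, so it outscores the common
# word "model" in doc 0 -- the same effect that puts distinctive terms
# like "uncertainties" at the top of the list above.
```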

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 1401 andrew gelman stats-2012-06-30-David Hogg on statistics


2 0.16064805 1961 andrew gelman stats-2013-07-29-Postdocs in probabilistic modeling! With David Blei! And Stan!

Introduction: David Blei writes: I have two postdoc openings for basic research in probabilistic modeling . The thrusts are (a) scalable inference and (b) model checking. We will be developing new methods and implementing them in probabilistic programming systems. I am open to applicants interested in many kinds of applications and from any field. “Scalable inference” means black-box VB and related ideas, and “probabilistic programming systems” means Stan! (You might be familiar with Stan as an implementation of Nuts for posterior sampling, but Stan is also an efficient program for computing probability densities and their gradients, and as such is an ideal platform for developing scalable implementations of variational inference and related algorithms.) And you know I like model checking. Here’s the full ad: ===== POSTDOC POSITIONS IN PROBABILISTIC MODELING ===== We expect to have two postdoctoral positions available for January 2014 (or later). These positions are in D

3 0.15138176 780 andrew gelman stats-2011-06-27-Bridges between deterministic and probabilistic models for binary data

Introduction: For the analysis of binary data, various deterministic models have been proposed, which are generally simpler to fit and easier to understand than probabilistic models. We claim that corresponding to any deterministic model is an implicit stochastic model in which the deterministic model fits imperfectly, with errors occurring at random. In the context of binary data, we consider a model in which the probability of error depends on the model prediction. We show how to fit this model using a stochastic modification of deterministic optimization schemes. The advantages of fitting the stochastic model explicitly (rather than implicitly, by simply fitting a deterministic model and accepting the occurrence of errors) include quantification of uncertainty in the deterministic model’s parameter estimates, better estimation of the true model error rate, and the ability to check the fit of the model nontrivially. We illustrate this with a simple theoretical example of item response data and w

4 0.14276941 1482 andrew gelman stats-2012-09-04-Model checking and model understanding in machine learning

Introduction: Last month I wrote: Computer scientists are often brilliant but they can be unfamiliar with what is done in the worlds of data collection and analysis. This goes the other way too: statisticians such as myself can look pretty awkward, reinventing (or failing to reinvent) various wheels when we write computer programs or, even worse, try to design software. Andrew MacNamara followed up with some thoughts: I [MacNamara] had some basic statistics training through my MBA program, after having completed an undergrad degree in computer science. Since then I’ve been very interested in learning more about statistical techniques, including things like GLM and censored data analyses as well as machine learning topics like neural nets, SVMs, etc. I began following your blog after some research into Bayesian analysis topics and I am trying to dig deeper on that side of things. One thing I have noticed is that there seems to be a distinction between data analysi

5 0.13990189 1095 andrew gelman stats-2012-01-01-Martin and Liu: Probabilistic inference based on consistency of model with data

Introduction: What better way to start the new year than with some hard-core statistical theory? Ryan Martin and Chuanhai Liu send along a new paper on inferential models: Probability is a useful tool for describing uncertainty, so it is natural to strive for a system of statistical inference based on probabilities for or against various hypotheses. But existing probabilistic inference methods struggle to provide a meaningful interpretation of the probabilities across experiments in sufficient generality. In this paper we further develop a promising new approach based on what are called inferential models (IMs). The fundamental idea behind IMs is that there is an unobservable auxiliary variable that itself describes the inherent uncertainty about the parameter of interest, and that posterior probabilistic inference can be accomplished by predicting this unobserved quantity. We describe a simple and intuitive three-step construction of a random set of candidate parameter values, each being co

6 0.1383169 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

7 0.12780546 1363 andrew gelman stats-2012-06-03-Question about predictive checks

8 0.12648593 1712 andrew gelman stats-2013-02-07-Philosophy and the practice of Bayesian statistics (with all the discussions!)

9 0.12624162 774 andrew gelman stats-2011-06-20-The pervasive twoishness of statistics; in particular, the “sampling distribution” and the “likelihood” are two different models, and that’s a good thing

10 0.12385833 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging

11 0.12365258 2200 andrew gelman stats-2014-02-05-Prior distribution for a predicted probability

12 0.12332181 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis

13 0.11778508 1208 andrew gelman stats-2012-03-11-Gelman on Hennig on Gelman on Bayes

14 0.11719774 1422 andrew gelman stats-2012-07-20-Likelihood thresholds and decisions

15 0.11710822 1941 andrew gelman stats-2013-07-16-Priors

16 0.11698711 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

17 0.11519995 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

18 0.11207186 2208 andrew gelman stats-2014-02-12-How to think about “identifiability” in Bayesian inference?

19 0.11131135 779 andrew gelman stats-2011-06-25-Avoiding boundary estimates using a prior distribution as regularization

20 0.10824936 1713 andrew gelman stats-2013-02-08-P-values and statistical practice


similar blogs computed by the LSI model

LSI for this blog:

topicId topicWeight

[(0, 0.142), (1, 0.182), (2, 0.009), (3, 0.042), (4, 0.006), (5, -0.002), (6, -0.002), (7, -0.017), (8, 0.008), (9, -0.015), (10, 0.001), (11, 0.015), (12, -0.074), (13, -0.043), (14, -0.098), (15, -0.013), (16, 0.042), (17, -0.006), (18, 0.008), (19, -0.044), (20, 0.027), (21, -0.017), (22, -0.001), (23, -0.036), (24, -0.001), (25, 0.057), (26, -0.021), (27, 0.035), (28, 0.043), (29, -0.027), (30, -0.063), (31, 0.01), (32, -0.044), (33, 0.021), (34, -0.021), (35, 0.016), (36, 0.007), (37, -0.066), (38, -0.011), (39, 0.014), (40, 0.01), (41, -0.017), (42, 0.033), (43, -0.028), (44, 0.036), (45, -0.011), (46, -0.0), (47, 0.012), (48, 0.039), (49, -0.018)]
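The 50-component topic vector above is this blog's representation under a latent semantic indexing model. In miniature, LSI is a truncated SVD of a term-document matrix; the 3x3 count matrix below is invented for illustration (the real model runs on the full blog corpus):

```python
import numpy as np

# Toy term-document count matrix: rows are terms, columns are documents.
X = np.array([[2.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 2.0, 1.0]])

# LSI = truncated SVD: keep the k largest singular directions as topics.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
doc_topics = (np.diag(s[:k]) @ Vt[:k]).T   # one k-vector per document

# Each document is reduced to a short topic-weight vector, like the
# (topicId, topicWeight) list above; documents with similar word usage
# land near each other in this space.
```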

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94448608 1401 andrew gelman stats-2012-06-30-David Hogg on statistics


2 0.85915762 1363 andrew gelman stats-2012-06-03-Question about predictive checks

Introduction: Klaas Metselaar writes: I [Metselaar] am currently involved in a discussion about the use of the notion “predictive” as used in “posterior predictive check”. I would argue that the notion “predictive” should be reserved for posterior checks using information not used in the determination of the posterior. I quote from the discussion: “However, the predictive uncertainty in a Bayesian calculation requires sampling from all the random variables, and this includes both the model parameters and the residual error”. My [Metselaar's] comment: This may be exactly the point I am worried about: shouldn’t the predictive uncertainty be defined as sampling from the posterior parameter distribution + residual error + sampling from the prediction error distribution? Residual error reduces to measurement error in the case of a model which is perfect for the sample of experiments. Measurement error could be reduced to almost zero by ideal and perfect measurement instruments. I would h

3 0.85032344 2029 andrew gelman stats-2013-09-18-Understanding posterior p-values

Introduction: David Kaplan writes: I came across your paper “Understanding Posterior Predictive P-values”, and I have a question regarding your statement “If a posterior predictive p-value is 0.4, say, that means that, if we believe the model, we think there is a 40% chance that tomorrow’s value of T(y_rep) will exceed today’s T(y).” This is perfectly understandable to me and represents the idea of calibration. However, I am unsure how this relates to statements about fit. If T is the LR chi-square or Pearson chi-square, then your statement that there is a 40% chance that tomorrow’s value exceeds today’s value indicates bad fit, I think. Yet, some literature indicates that high p-values suggest good fit. Could you clarify this? My reply: I think that “fit” depends on the question being asked. In this case, I’d say the model fits for this particular purpose, even though it might not fit for other purposes. And here’s the abstract of the paper: Posterior predictive p-values do not i

4 0.84636527 1284 andrew gelman stats-2012-04-26-Modeling probability data

Introduction: Rafael Huber writes: I conducted an experiment in which subjects were asked to estimate the probability of a certain event given a number of pieces of information (like a weather forecaster or a stock-market trader). These probability estimates are the dependent variable of my experiment. My goal is to model the data with a (hierarchical) Bayesian regression. A linear equation with all the presented information (quantified as log odds) defines the mu of a normal likelihood. The tau as precision is another free parameter.

y[r] ~ dnorm( mu[r] , tau[ subj[r] ] )
mu[r] <- b0[ subj[r] ] + b1[ subj[r] ] * x1[r] + b2[ subj[r] ] * x2[r] + b3[ subj[r] ] * x3[r]

My problem is that I do not believe that the normal is the correct probability distribution to model probability data (because the error is limited). However, until now nobody was able to tell me how I can correctly model probability data. My reply: You can take the logit of the data before analyzing them. That is assuming there

5 0.8063491 774 andrew gelman stats-2011-06-20-The pervasive twoishness of statistics; in particular, the “sampling distribution” and the “likelihood” are two different models, and that’s a good thing

Introduction: Lots of good statistical methods make use of two models. For example: - Classical statistics: estimates and standard errors using the likelihood function; tests and p-values using the sampling distribution. (The sampling distribution is not equivalent to the likelihood, as has been much discussed, for example in sequential stopping problems.) - Bayesian data analysis: inference using the posterior distribution; model checking using the predictive distribution (which, again, depends on the data-generating process in a way that the likelihood does not). - Machine learning: estimation using the data; evaluation using cross-validation (which requires some rule for partitioning the data, a rule that stands outside of the data themselves). - Bootstrap, jackknife, etc: estimation using an “estimator” (which, I would argue, is based in some sense on a model for the data), uncertainties using resampling (which, I would argue, is close to the idea of a “sampling distribution” in

6 0.79280293 1983 andrew gelman stats-2013-08-15-More on AIC, WAIC, etc

7 0.79068398 1221 andrew gelman stats-2012-03-19-Whassup with deviance having a high posterior correlation with a parameter in the model?

8 0.78976327 1817 andrew gelman stats-2013-04-21-More on Bayesian model selection in high-dimensional settings

9 0.78650337 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging

10 0.78618336 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

11 0.78401637 1460 andrew gelman stats-2012-08-16-“Real data can be a pain”

12 0.78309536 398 andrew gelman stats-2010-11-06-Quote of the day

13 0.77526653 82 andrew gelman stats-2010-06-12-UnConMax – uncertainty consideration maxims 7 +/- 2

14 0.77085751 996 andrew gelman stats-2011-11-07-Chi-square FAIL when many cells have small expected values

15 0.75942057 1287 andrew gelman stats-2012-04-28-Understanding simulations in terms of predictive inference?

16 0.75632131 1141 andrew gelman stats-2012-01-28-Using predator-prey models on the Canadian lynx series

17 0.74990368 1162 andrew gelman stats-2012-02-11-Adding an error model to a deterministic model

18 0.74844033 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

19 0.74822599 1459 andrew gelman stats-2012-08-15-How I think about mixture models

20 0.74751508 1527 andrew gelman stats-2012-10-10-Another reason why you can get good inferences from a bad model


similar blogs computed by the LDA model

LDA for this blog:

topicId topicWeight

[(6, 0.021), (16, 0.028), (21, 0.292), (24, 0.198), (26, 0.013), (27, 0.029), (42, 0.021), (47, 0.011), (53, 0.01), (57, 0.012), (59, 0.011), (73, 0.034), (99, 0.21)]
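The simValue numbers in these lists are plausibly cosine similarities between per-blog topic vectors like the one above; the tool's exact measure is not documented, so the sketch below is an assumption, and the two neighbor vectors are hypothetical:

```python
import numpy as np

def to_dense(pairs, n_topics=100):
    # Expand a sparse (topicId, topicWeight) list into a dense vector.
    v = np.zeros(n_topics)
    for topic_id, weight in pairs:
        v[topic_id] = weight
    return v

def cosine(a, b):
    # Cosine similarity: 1.0 for identical directions, 0.0 for disjoint.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The heaviest topics of this blog, taken from the list above.
this_blog = to_dense([(6, 0.021), (16, 0.028), (21, 0.292),
                      (24, 0.198), (99, 0.21)])
# Hypothetical neighbors: one sharing the heavy topics, one disjoint.
near = to_dense([(21, 0.30), (24, 0.20), (99, 0.25)])
far = to_dense([(3, 0.50), (40, 0.40)])
```

Under this measure a blog concentrated on the same heavy topics scores high, one on disjoint topics scores zero, matching the ordering in the simValue column.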

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97231573 672 andrew gelman stats-2011-04-20-The R code for those time-use graphs

Introduction: By popular demand, here’s my R script for the time-use graphs:

# The data
a1 <- c(4.2,3.2,11.1,1.3,2.2,2.0)
a2 <- c(3.9,3.2,10.0,0.8,3.1,3.1)
a3 <- c(6.3,2.5,9.8,0.9,2.2,2.4)
a4 <- c(4.4,3.1,9.8,0.8,3.3,2.7)
a5 <- c(4.8,3.0,9.9,0.7,3.3,2.4)
a6 <- c(4.0,3.4,10.5,0.7,3.3,2.1)
a <- rbind(a1,a2,a3,a4,a5,a6)
avg <- colMeans (a)
avg.array <- t (array (avg, rev(dim(a))))
diff <- a - avg.array
country.name <- c("France", "Germany", "Japan", "Britain", "USA", "Turkey")

# The line plots
par (mfrow=c(2,3), mar=c(4,4,2,.5), mgp=c(2,.7,0), tck=-.02, oma=c(3,0,4,0), bg="gray96", fg="gray30")
for (i in 1:6){
  plot (c(1,6), c(-1,1.7), xlab="", ylab="", xaxt="n", yaxt="n", bty="l", type="n")
  lines (1:6, diff[i,], col="blue")
  points (1:6, diff[i,], pch=19, col="black")
  if (i>3){
    axis (1, c(1,3,5), c ("Work,\nstudy", "Eat,\nsleep", "Leisure"), mgp=c(2,1.5,0), tck=0, cex.axis=1.2)
    axis (1, c(2,4,6), c ("Unpaid\nwork", "Personal\nCare", "Other"), mgp=c(2,1.5,0),

2 0.95633698 2298 andrew gelman stats-2014-04-21-On deck this week

Introduction: Mon : Ticket to Baaaath Tues : Ticket to Baaaaarf Wed : Thinking of doing a list experiment? Here’s a list of reasons why you should think again Thurs : An open site for researchers to post and share papers Fri : Questions about “Too Good to Be True” Sat : Sleazy sock puppet can’t stop spamming our discussion of compressed sensing and promoting the work of Xiteng Liu Sun : White stripes and dead armadillos

3 0.92606825 1615 andrew gelman stats-2012-12-10-A defense of Tom Wolfe based on the impossibility of the law of small numbers in network structure

Introduction: A tall thin young man came to my office today to talk about one of my current pet topics: stories and social science. I brought up Tom Wolfe and his goal of compressing an entire city into a single novel, and how this reminded me of the psychologists Kahneman and Tversky’s concept of “the law of small numbers,” the idea that we expect any small sample to replicate all the properties of the larger population that it represents. Strictly speaking, the law of small numbers is impossible—any small sample necessarily has its own unique features—but this is even more true if we consider network properties. The average American knows about 700 people (depending on how you define “know”) and this defines a social network over the population. Now suppose you look at a few hundred people and all their connections. This mini-network will almost necessarily look much much sparser than the national network, as we’re removing the connections to the people not in the sample. Now consider how

same-blog 4 0.92504644 1401 andrew gelman stats-2012-06-30-David Hogg on statistics


5 0.92152917 151 andrew gelman stats-2010-07-16-Wanted: Probability distributions for rank orderings

Introduction: Dietrich Stoyan writes: I asked the IMS people for an expert in statistics of voting/elections and they wrote me your name. I am a statistician, but never worked in the field voting/elections. It was my son-in-law who asked me for statistical theories in that field. He posed in particular the following problem: The aim of the voting is to come to a ranking of c candidates. Every vote is a permutation of these c candidates. The problem is to have probability distributions in the set of all permutations of c elements. Are there theories for such distributions? I should be very grateful for a fast answer with hints to literature. (I confess that I do not know your books.) My reply: Rather than trying to model the ranks directly, I’d recommend modeling a latent continuous outcome which then implies a distribution on ranks, if the ranks are of interest. There are lots of distributions of c-dimensional continuous outcomes. In political science, the usual way to start is

6 0.91484094 1232 andrew gelman stats-2012-03-27-Banned in NYC school tests

7 0.9058122 432 andrew gelman stats-2010-11-27-Neumann update

8 0.90340447 894 andrew gelman stats-2011-09-07-Hipmunk FAIL: Graphics without content is not enough

9 0.89456356 1826 andrew gelman stats-2013-04-26-“A Vast Graveyard of Undead Theories: Publication Bias and Psychological Science’s Aversion to the Null”

10 0.88652122 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”

11 0.87959599 514 andrew gelman stats-2011-01-13-News coverage of statistical issues…how did I do?

12 0.87536991 1275 andrew gelman stats-2012-04-22-Please stop me before I barf again

13 0.86945462 62 andrew gelman stats-2010-06-01-Two Postdoc Positions Available on Bayesian Hierarchical Modeling

14 0.86012065 2306 andrew gelman stats-2014-04-26-Sleazy sock puppet can’t stop spamming our discussion of compressed sensing and promoting the work of Xiteng Liu

15 0.85945421 810 andrew gelman stats-2011-07-20-Adding more information can make the variance go up (depending on your model)

16 0.85368371 854 andrew gelman stats-2011-08-15-A silly paper that tries to make fun of multilevel models

17 0.84772688 1857 andrew gelman stats-2013-05-15-Does quantum uncertainty have a place in everyday applied statistics?

18 0.8381052 1728 andrew gelman stats-2013-02-19-The grasshopper wins, and Greg Mankiw’s grandmother would be “shocked and appalled” all over again

19 0.83754468 433 andrew gelman stats-2010-11-27-One way that psychology research is different than medical research

20 0.83053064 659 andrew gelman stats-2011-04-13-Jim Campbell argues that Larry Bartels’s “Unequal Democracy” findings are not robust