andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-1881 knowledge-graph by maker-knowledge-mining

1881 andrew gelman stats-2013-06-03-Boot


meta info for this blog

Source: html

Introduction: Joshua Hartshorne writes: I ran several large-N experiments (separate participants) and looked at performance against age. What we want to do is compare age-of-peak-performance across the different tasks (again, different participants). We bootstrapped age-of-peak-performance. On each iteration, we sampled (with replacement) the X scores at each age, where X=num of participants at that age, and recorded the age at which performance peaked on that task. We then recorded the age at which performance was at peak and repeated. Once we had distributions of age-of-peak-performance, we used the means and SDs to calculate t-statistics to compare the results across different tasks. For graphical presentation, we used medians, interquartile ranges, and 95% confidence intervals (based on the distributions: the range within which 75% and 95% of the bootstrapped peaks appeared). While a number of people we consulted with thought this made a lot of sense, one reviewer of the paper insist
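The resampling procedure described above can be sketched as follows. This is a minimal illustration, not the authors' code: the data layout (a dict mapping each age to that age's list of scores) and all numbers are hypothetical.

```python
import random
import statistics

def age_of_peak(scores_by_age):
    """Return the age whose mean score is highest (the raw-data peak)."""
    return max(scores_by_age, key=lambda a: statistics.mean(scores_by_age[a]))

def bootstrap_peak_ages(scores_by_age, n_boot=1000, seed=0):
    """On each iteration, resample scores within each age with replacement
    (keeping each age's N fixed) and record the age of peak performance."""
    rng = random.Random(seed)
    peaks = []
    for _ in range(n_boot):
        resampled = {age: rng.choices(scores, k=len(scores))
                     for age, scores in scores_by_age.items()}
        peaks.append(age_of_peak(resampled))
    return peaks

# Toy data: performance rises then falls with age
data = {20: [55, 60, 58], 30: [70, 72, 68, 71], 40: [66, 64, 65]}
peaks = bootstrap_peak_ages(data)
```

The resulting `peaks` list plays the role of the bootstrap distribution of age-of-peak-performance from which the means, SDs, and quantiles are then computed.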


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Joshua Hartshorne writes: I ran several large-N experiments (separate participants) and looked at performance against age. [sent-1, score-0.18]

2 What we want to do is compare age-of-peak-performance across the different tasks (again, different participants). [sent-2, score-0.47]

3 On each iteration, we sampled (with replacement) the X scores at each age, where X=num of participants at that age, and recorded the age at which performance peaked on that task. [sent-4, score-1.129]

4 We then recorded the age at which performance was at peak and repeated. [sent-5, score-0.835]

5 Once we had distributions of age-of-peak-performance, we used the means and SDs to calculate t-statistics to compare the results across different tasks. [sent-6, score-0.498]

6 For graphical presentation, we used medians, interquartile ranges, and 95% confidence intervals (based on the distributions: the range within which 75% and 95% of the bootstrapped peaks appeared). [sent-7, score-0.771]

7 My instinct is that it would be better to fit a curve to each dataset rather than to just pick the age at which the raw data average is highest. [sent-11, score-0.868]

8 You could, for example, fit a Gaussian process or even a lowess and find the age at which the fitted curve is maximized. [sent-12, score-0.785]

9 I’m guessing that will be more accurate than taking the max of the raw data. [sent-13, score-0.242]

10 Whatever data summary you use, though, getting standard errors via bootstrap seems reasonable to me. [sent-14, score-0.39]

11 With large N, the statistic should have an approximately symmetric sampling distribution if it’s not near the boundary, so you can use estimates and standard errors, shouldn’t need to bother with quantiles. [sent-15, score-0.533]
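The curve-fitting alternative in sentences 7-8 (estimate the peak from a smoothed fit rather than from the raw maximum) can be sketched with a quadratic fit standing in for the suggested lowess or Gaussian process; the ages and mean scores below are made up for illustration.

```python
import numpy as np

ages = np.array([20, 25, 30, 35, 40, 45, 50])
mean_scores = np.array([55.0, 63.0, 70.0, 71.0, 66.0, 60.0, 52.0])

# Raw-max estimate: the age with the highest observed average
raw_peak = ages[np.argmax(mean_scores)]

# Smoothed estimate: fit a quadratic and take the vertex of the parabola
b2, b1, b0 = np.polyfit(ages, mean_scores, deg=2)
smooth_peak = -b1 / (2 * b2)  # argmax of b2*x^2 + b1*x + b0 (b2 < 0 here)
```

The smoothed estimate need not coincide with any observed age, and under noisy data it is typically the more stable summary; either statistic can then be bootstrapped exactly as above.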


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('age', 0.356), ('bootstrapped', 0.25), ('bootstrap', 0.201), ('recorded', 0.198), ('participants', 0.194), ('performance', 0.18), ('curve', 0.16), ('raw', 0.146), ('intervals', 0.141), ('interquartile', 0.133), ('num', 0.133), ('medians', 0.133), ('sds', 0.133), ('confidence', 0.127), ('asymmetric', 0.125), ('compare', 0.122), ('instinct', 0.12), ('peaks', 0.12), ('distributions', 0.118), ('percentiles', 0.112), ('peaked', 0.112), ('consulted', 0.112), ('iteration', 0.112), ('symmetric', 0.112), ('lowess', 0.109), ('quantiles', 0.109), ('errors', 0.106), ('replacement', 0.102), ('peak', 0.101), ('ranges', 0.101), ('joshua', 0.099), ('reviewer', 0.098), ('max', 0.096), ('tasks', 0.096), ('boundary', 0.095), ('across', 0.094), ('statistic', 0.094), ('distribution', 0.092), ('constructing', 0.091), ('sampled', 0.089), ('citations', 0.089), ('fit', 0.086), ('calculate', 0.085), ('standard', 0.083), ('gaussian', 0.081), ('bother', 0.079), ('different', 0.079), ('corresponding', 0.075), ('fitted', 0.074), ('approximately', 0.073)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999982 1881 andrew gelman stats-2013-06-03-Boot


2 0.15305972 2128 andrew gelman stats-2013-12-09-How to model distributions that have outliers in one direction

Introduction: Shravan writes: I have a problem very similar to the one presented in chapter 6 of BDA, the speed of light example. You use the distribution of the minimum scores from the posterior predictive distribution, show that it’s not realistic given the data, and suggest that an asymmetric contaminated normal distribution or a symmetric long-tailed distribution would be better. How does one use such a distribution? My reply: You can actually use a symmetric long-tailed distribution such as t with low degrees of freedom. One striking feature of symmetric long-tailed distributions is that a small random sample from such a distribution can have outliers on one side or the other and look asymmetric. Just to see this, try the following in R: par(mfrow=c(3,3), mar=c(1,1,1,1)); for (i in 1:9) hist(rt(100, 2), xlab="", ylab="", main="") You’ll see some skewed distributions. So that’s the message (which I learned from an offhand comment of Rubin, actually): if you want to model

3 0.15261643 486 andrew gelman stats-2010-12-26-Age and happiness: The pattern isn’t as clear as you might think

Introduction: A couple people pointed me to this recent news article which discusses “why, beyond middle age, people get happier as they get older.” Here’s the story: When people start out on adult life, they are, on average, pretty cheerful. Things go downhill from youth to middle age until they reach a nadir commonly known as the mid-life crisis. So far, so familiar. The surprising part happens after that. Although as people move towards old age they lose things they treasure–vitality, mental sharpness and looks–they also gain what people spend their lives pursuing: happiness. This curious finding has emerged from a new branch of economics that seeks a more satisfactory measure than money of human well-being. Conventional economics uses money as a proxy for utility–the dismal way in which the discipline talks about happiness. But some economists, unconvinced that there is a direct relationship between money and well-being, have decided to go to the nub of the matter and measure happiness i

4 0.13641888 2201 andrew gelman stats-2014-02-06-Bootstrap averaging: Examples where it works and where it doesn’t work

Introduction: Aki and I write: The very generality of the bootstrap creates both opportunity and peril, allowing researchers to solve otherwise intractable problems but also sometimes leading to an answer with an inappropriately high level of certainty. We demonstrate with two examples from our own research: one problem where bootstrap smoothing was effective and led us to an improved method, and another case where bootstrap smoothing would not solve the underlying problem. Our point in these examples is not to disparage bootstrapping but rather to gain insight into where it will be more or less effective as a smoothing tool. An example where bootstrap smoothing works well Bayesian posterior distributions are commonly summarized using Monte Carlo simulations, and inferences for scalar parameters or quantities of interest can be summarized using 50% or 95% intervals. An interval for a continuous quantity is typically constructed either as a central probability interval (with probabili
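The central probability intervals mentioned in that excerpt are just pairs of quantiles of the simulation draws. A minimal sketch, using a normal as a stand-in for posterior draws (all numbers hypothetical):

```python
import numpy as np

rng = np.random.default_rng(42)
draws = rng.normal(loc=2.0, scale=1.0, size=10_000)  # stand-in posterior draws

# Central 50% and 95% intervals: equal probability in each tail
mid50 = np.quantile(draws, [0.25, 0.75])
mid95 = np.quantile(draws, [0.025, 0.975])
```

For a N(2, 1) the 95% interval lands near (0.04, 3.96); the same two `np.quantile` calls applied to a vector of bootstrap replicates give the percentile intervals described in the main post.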

5 0.12677272 1283 andrew gelman stats-2012-04-26-Let’s play “Guess the smoother”!

Introduction: Andre de Boer writes: In my profession as a risk manager I encountered this graph: I can’t figure out what kind of regression this is, would you be so kind to enlighten me? The points represent (maturity,yield) of bonds. My reply: That’s a fun problem, reverse-engineering a curve fit! My first guess is lowess, although it seems too flat and asymptote-y on the right side of the graph to be lowess. Maybe a Gaussian process? Looks too smooth to be a spline. I guess I’ll go with my original guess, on the theory that lowess is the most accessible smoother out there, and if someone fit something much more complicated they’d make more of a big deal about it. On the other hand, if the curve is an automatic output of some software (Excel? Stata?) then it could be just about anything. Does anyone have any ideas?

6 0.12236241 774 andrew gelman stats-2011-06-20-The pervasive twoishness of statistics; in particular, the “sampling distribution” and the “likelihood” are two different models, and that’s a good thing

7 0.12215152 1178 andrew gelman stats-2012-02-21-How many data points do you really have?

8 0.12073246 437 andrew gelman stats-2010-11-29-The mystery of the U-shaped relationship between happiness and age

9 0.11757202 1527 andrew gelman stats-2012-10-10-Another reason why you can get good inferences from a bad model

10 0.11641884 480 andrew gelman stats-2010-12-21-Instead of “confidence interval,” let’s say “uncertainty interval”

11 0.11235424 2220 andrew gelman stats-2014-02-22-Quickies

12 0.1110927 2176 andrew gelman stats-2014-01-19-Transformations for non-normal data

13 0.11008231 1206 andrew gelman stats-2012-03-10-95% intervals that I don’t believe, because they’re from a flat prior I don’t believe

14 0.10907234 870 andrew gelman stats-2011-08-25-Why it doesn’t make sense in general to form confidence intervals by inverting hypothesis tests

15 0.1074853 1913 andrew gelman stats-2013-06-24-Why it doesn’t make sense in general to form confidence intervals by inverting hypothesis tests

16 0.10347376 1543 andrew gelman stats-2012-10-21-Model complexity as a function of sample size

17 0.10172004 1605 andrew gelman stats-2012-12-04-Write This Book

18 0.098209307 678 andrew gelman stats-2011-04-25-Democrats do better among the most and least educated groups

19 0.094549671 50 andrew gelman stats-2010-05-25-Looking for Sister Right

20 0.089380637 277 andrew gelman stats-2010-09-14-In an introductory course, when does learning occur?


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.15), (1, 0.057), (2, 0.053), (3, -0.038), (4, 0.067), (5, -0.017), (6, -0.003), (7, 0.021), (8, -0.0), (9, -0.027), (10, 0.005), (11, 0.005), (12, -0.028), (13, -0.017), (14, -0.016), (15, -0.017), (16, 0.021), (17, -0.012), (18, 0.024), (19, -0.047), (20, 0.061), (21, 0.021), (22, 0.009), (23, -0.037), (24, 0.064), (25, -0.005), (26, -0.051), (27, 0.003), (28, 0.041), (29, 0.073), (30, -0.005), (31, -0.045), (32, -0.011), (33, -0.001), (34, 0.009), (35, 0.066), (36, 0.014), (37, 0.066), (38, -0.031), (39, 0.018), (40, 0.052), (41, -0.036), (42, 0.094), (43, -0.014), (44, -0.005), (45, -0.041), (46, -0.008), (47, -0.0), (48, 0.072), (49, -0.027)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96548432 1881 andrew gelman stats-2013-06-03-Boot


2 0.71570301 1672 andrew gelman stats-2013-01-14-How do you think about the values in a confidence interval?

Introduction: Philip Jones writes: As an interested reader of your blog, I wondered if you might consider a blog entry sometime on the following question I posed on CrossValidated (StackExchange). I originally posed the question based on my uncertainty about 95% CIs: “Are all values within the 95% CI equally likely (probable), or are the values at the “tails” of the 95% CI less likely than those in the middle of the CI closer to the point estimate?” I posed this question based on discordant information I found at a couple of different web sources (I posted these sources in the body of the question). I received some interesting replies, and the replies were not unanimous, in fact there is some serious disagreement there! After seeing this disagreement, I naturally thought of you, and whether you might be able to clear this up. Please note I am not referring to credible intervals, but rather to the common medical journal reporting standard of confidence intervals. My response: First

3 0.71345967 1527 andrew gelman stats-2012-10-10-Another reason why you can get good inferences from a bad model

Introduction: John Cook considers how people justify probability distribution assumptions: Sometimes distribution assumptions are not justified. Sometimes distributions can be derived from fundamental principles [or] . . . on theoretical grounds. For example, large samples and the central limit theorem together may justify assuming that something is normally distributed. Often the choice of distribution is somewhat arbitrary, chosen by intuition or for convenience, and then empirically shown to work well enough. Sometimes a distribution can be a bad fit and still work well, depending on what you’re asking of it. Cook continues: The last point is particularly interesting. It’s not hard to imagine that a poor fit would produce poor results. It’s surprising when a poor fit produces good results. And then he gives an example of an effective but inaccurate model used to model survival times in a clinical trial. Cook explains: The [poorly-fitting] method works well because of the q

4 0.69494337 1178 andrew gelman stats-2012-02-21-How many data points do you really have?

Introduction: Chris Harrison writes: I have just come across your paper in the 2009 American Scientist. Another problem that I frequently come across is when people do power spectral analyses of signals. If one has 1200 points (fairly modest in this day and age) then there are 600 power spectral estimates. People will then determine the 95% confidence limits and pick out any spectral estimate that sticks up above this, claiming that it is significant. But there will be on average 30 estimates that stick up too high or too low. So in general there will be 15 spectral estimates which are higher than the 95% confidence limit which could happen just by chance. I suppose that this means that you have to set a much higher confidence limit, which would depend on the number of data in your signal. I would also like your opinion about a paper in the Proceedings of the National Academy of Science, “The causality analysis of climate change and large-scale human crisis” by David D. Zhang, Harry F. L
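Harrison's arithmetic (600 estimates, ~30 expected outside two-sided 95% limits, ~15 above) is easy to check by simulation. This is a hedged sketch of the multiple-comparisons point only: the null estimates are drawn as standard normals for simplicity, whereas real periodogram ordinates would follow a scaled chi-squared distribution.

```python
import numpy as np

rng = np.random.default_rng(1)
n_estimates = 600

# Null case: 600 independent spectral estimates with no true signal
estimates = rng.standard_normal(n_estimates)

upper = 1.96  # upper two-sided 95% limit for a standard normal
n_above = int(np.sum(estimates > upper))  # expect ~600 * 0.025 = 15 by chance
```

So roughly 15 estimates "stick up" above the limit under pure noise, which is exactly why a per-estimate 95% threshold cannot be read as significance across 600 simultaneous comparisons.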

5 0.67547703 1913 andrew gelman stats-2013-06-24-Why it doesn’t make sense in general to form confidence intervals by inverting hypothesis tests

Introduction: I’m reposting this classic from 2011 . . . Peter Bergman pointed me to this discussion from Cyrus of a presentation by Guido Imbens on design of randomized experiments. Cyrus writes: The standard analysis that Imbens proposes includes (1) a Fisher-type permutation test of the sharp null hypothesis–what Imbens referred to as “testing”–along with a (2) Neyman-type point estimate of the sample average treatment effect and confidence interval–what Imbens referred to as “estimation.” . . . Imbens claimed that testing and estimation are separate enterprises with separate goals and that the two should not be confused. I [Cyrus] took it as a warning against proposals that use “inverted” tests in order to produce point estimates and confidence intervals. There is no reason that such confidence intervals will have accurate coverage except under rather dire assumptions, meaning that they are not “confidence intervals” in the way that we usually think of them. I agree completely. T

6 0.67435437 1346 andrew gelman stats-2012-05-27-Average predictive comparisons when changing a pair of variables

7 0.6738078 870 andrew gelman stats-2011-08-25-Why it doesn’t make sense in general to form confidence intervals by inverting hypothesis tests

8 0.67046946 996 andrew gelman stats-2011-11-07-Chi-square FAIL when many cells have small expected values

9 0.66112339 2176 andrew gelman stats-2014-01-19-Transformations for non-normal data

10 0.64643824 603 andrew gelman stats-2011-03-07-Assumptions vs. conditions, part 2

11 0.63995022 2128 andrew gelman stats-2013-12-09-How to model distributions that have outliers in one direction

12 0.63836551 410 andrew gelman stats-2010-11-12-“The Wald method has been the subject of extensive criticism by statisticians for exaggerating results”

13 0.6323781 1470 andrew gelman stats-2012-08-26-Graphs showing regression uncertainty: the code!

14 0.62866455 1918 andrew gelman stats-2013-06-29-Going negative

15 0.62435919 1509 andrew gelman stats-2012-09-24-Analyzing photon counts

16 0.62067497 2258 andrew gelman stats-2014-03-21-Random matrices in the news

17 0.6206286 480 andrew gelman stats-2010-12-21-Instead of “confidence interval,” let’s say “uncertainty interval”

18 0.61867529 2201 andrew gelman stats-2014-02-06-Bootstrap averaging: Examples where it works and where it doesn’t work

19 0.61724406 799 andrew gelman stats-2011-07-13-Hypothesis testing with multiple imputations

20 0.61187226 2046 andrew gelman stats-2013-10-01-I’ll say it again


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(12, 0.011), (16, 0.1), (18, 0.011), (20, 0.027), (21, 0.018), (22, 0.011), (24, 0.182), (35, 0.018), (36, 0.026), (40, 0.012), (41, 0.013), (42, 0.025), (45, 0.023), (47, 0.022), (48, 0.014), (53, 0.016), (59, 0.012), (62, 0.046), (63, 0.029), (65, 0.019), (68, 0.021), (78, 0.048), (82, 0.011), (86, 0.025), (99, 0.184)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94579017 1881 andrew gelman stats-2013-06-03-Boot


2 0.93031549 1206 andrew gelman stats-2012-03-10-95% intervals that I don’t believe, because they’re from a flat prior I don’t believe

Introduction: Arnaud Trolle (no relation ) writes: I have a question about the interpretation of (non-)overlapping of 95% credibility intervals. In a Bayesian ANOVA (a within-subjects one), I computed 95% credibility intervals about the main effects of a factor. I’d like to compare two by two the main effects across the different conditions of the factor. Can I directly interpret the (non-)overlapping of these credibility intervals and make the following statements: “As the 95% credibility intervals do not overlap, both conditions have significantly different main effects” or conversely “As the 95% credibility intervals overlap, the main effects of both conditions are not significantly different, i.e. equivalent”? I heard that, in the case of classical confidence intervals, the second statement is false, but what happens when working within a Bayesian framework? My reply: I think it makes more sense to directly look at inference for the difference. Also, your statements about equivalence
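Gelman's advice in that reply (look directly at inference for the difference rather than at interval overlap) can be illustrated: two 95% intervals can overlap even while the interval for the difference excludes zero. A sketch with simulated stand-in posterior draws (means and scales are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(1.0, 1.0, 10_000)  # posterior draws, condition A
b = rng.normal(4.3, 1.0, 10_000)  # posterior draws, condition B

ci_a = np.quantile(a, [0.025, 0.975])
ci_b = np.quantile(b, [0.025, 0.975])
ci_diff = np.quantile(b - a, [0.025, 0.975])  # interval for the difference

overlap = ci_a[1] > ci_b[0]          # the two marginal intervals overlap...
diff_excludes_zero = ci_diff[0] > 0  # ...yet the difference excludes zero
```

Both flags come out true here, which is why the "overlap means no difference" reading is unreliable: the comparison should be made on the distribution of `b - a` itself.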

3 0.92631775 1019 andrew gelman stats-2011-11-19-Validation of Software for Bayesian Models Using Posterior Quantiles

Introduction: I love this stuff: This article presents a simulation-based method designed to establish the computational correctness of software developed to fit a specific Bayesian model, capitalizing on properties of Bayesian posterior distributions. We illustrate the validation technique with two examples. The validation method is shown to find errors in software when they exist and, moreover, the validation output can be informative about the nature and location of such errors. We also compare our method with that of an earlier approach. I hope we can put it into Stan.

4 0.91899693 799 andrew gelman stats-2011-07-13-Hypothesis testing with multiple imputations

Introduction: Vincent Yip writes: I have read your paper [with Kobi Abayomi and Marc Levy] regarding multiple imputation application. In order to diagnostic my imputed data, I used Kolmogorov-Smirnov (K-S) tests to compare the distribution differences between the imputed and observed values of a single attribute as mentioned in your paper. My question is: For example I have this attribute X with the following data: (NA = missing) Original dataset: 1, NA, 3, 4, 1, 5, NA Imputed dataset: 1, 2 , 3, 4, 1, 5, 6 a) in order to run the KS test, will I treat the observed data as 1, 3, 4,1, 5? b) and for the observed data, will I treat 1, 2 , 3, 4, 1, 5, 6 as the imputed dataset for the K-S test? or just 2 ,6? c) if I used m=5, I will have 5 set of imputed data sets. How would I apply K-S test to 5 of them and compare to the single observed distribution? Do I combine the 5 imputed data set into one by averaging each imputed values so I get one single imputed data and compare with the ob

5 0.9178893 1293 andrew gelman stats-2012-05-01-Huff the Magic Dragon

Introduction: Upon reading this , Susan remarked, “Don’t you think it’s interesting that a guy who promotes smoking has a last name of ‘Huff’? Reminds me of the Dennis/Dentist studies.” Good point. P.S. As discussed in the linked thread, the great statistician R. A. Fisher was notorious for minimizing the risks of smoking. How does this connect to Fisher’s name, one might ask?

6 0.91780174 1871 andrew gelman stats-2013-05-27-Annals of spam

7 0.91637397 1155 andrew gelman stats-2012-02-05-What is a prior distribution?

8 0.91542572 1080 andrew gelman stats-2011-12-24-Latest in blog advertising

9 0.9117794 1851 andrew gelman stats-2013-05-11-Actually, I have no problem with this graph

10 0.91157663 1838 andrew gelman stats-2013-05-03-Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism

11 0.91040611 807 andrew gelman stats-2011-07-17-Macro causality

12 0.90996766 1240 andrew gelman stats-2012-04-02-Blogads update

13 0.90989614 1219 andrew gelman stats-2012-03-18-Tips on “great design” from . . . Microsoft!

14 0.90960503 2029 andrew gelman stats-2013-09-18-Understanding posterior p-values

15 0.90915775 639 andrew gelman stats-2011-03-31-Bayes: radical, liberal, or conservative?

16 0.90893829 898 andrew gelman stats-2011-09-10-Fourteen magic words: an update

17 0.9082793 548 andrew gelman stats-2011-02-01-What goes around . . .

18 0.90786594 953 andrew gelman stats-2011-10-11-Steve Jobs’s cancer and science-based medicine

19 0.90769666 488 andrew gelman stats-2010-12-27-Graph of the year

20 0.90614623 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics