
1511 andrew gelman stats-2012-09-26-What do statistical p-values mean when the sample = the population?


meta info for this blog

Source: html

Introduction: Felipe Nunes writes: I have many friends working with data that they claim to be considered as a ‘population’. For example, the universe of bills presented in a Congress, the roll call votes of all deputies in a legislature, a survey with all deputies in a country, the outcomes of an election, or the set of electoral institutions around the world. Because of the nature of these data, we do not know how to interpret the p-value. I have seen many arguments been made, but I have never seen a formal response to the question. So I don’t know what to say. The most common arguments among the community of young researchers in Brazil are: (1) don’t interpret p-value when you have population, but don’t infer anything either; (2) interpret the p-value because of error measurement which is also present, (3) there is no such a thing as a population, so always look at p-values, (4) don’t worry about p-value, interpret the coefficients substantively, and (5) if you are frequentist you interpret p-


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Felipe Nunes writes: I have many friends working with data that they claim to be considered as a ‘population’. [sent-1, score-0.36]

2 For example, the universe of bills presented in a Congress, the roll call votes of all deputies in a legislature, a survey with all deputies in a country, the outcomes of an election, or the set of electoral institutions around the world. [sent-2, score-1.961]

3 Because of the nature of these data, we do not know how to interpret the p-value. [sent-3, score-0.63]

4 I have seen many arguments been made, but I have never seen a formal response to the question. [sent-4, score-0.684]

5 If you have a paper or any other reference that can help with this discussion, please refer to me as well. [sent-7, score-0.34]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('interpret', 0.489), ('deputies', 0.414), ('population', 0.213), ('felipe', 0.188), ('arguments', 0.179), ('legislature', 0.178), ('brazil', 0.17), ('substantively', 0.164), ('infer', 0.146), ('universe', 0.143), ('bills', 0.139), ('roll', 0.135), ('seen', 0.133), ('electoral', 0.121), ('institutions', 0.12), ('votes', 0.108), ('congress', 0.107), ('refer', 0.107), ('frequentist', 0.106), ('formal', 0.106), ('young', 0.103), ('community', 0.095), ('coefficients', 0.095), ('reference', 0.095), ('friends', 0.093), ('measurement', 0.093), ('worry', 0.092), ('outcomes', 0.09), ('presented', 0.087), ('country', 0.086), ('election', 0.085), ('nature', 0.081), ('present', 0.077), ('please', 0.077), ('considered', 0.077), ('common', 0.073), ('call', 0.071), ('many', 0.069), ('error', 0.066), ('survey', 0.066), ('claim', 0.065), ('among', 0.065), ('response', 0.064), ('either', 0.062), ('help', 0.061), ('know', 0.06), ('researchers', 0.057), ('working', 0.056), ('reply', 0.055), ('set', 0.053)]
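The page does not say how these word weights or the similarity values in the lists that follow were computed. As a rough illustration only, here is a minimal sketch of one common tf-idf weighting with cosine similarity. The whitespace tokenisation, the `log(N / df) + 1` smoothing, the function names `tfidf_vectors` and `cosine_similarity`, and the three sample documents are all assumptions made for this example, not the method used to build this page.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalised {word: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        # Weight = term count * (log(N / df) + 1); the +1 is an assumed
        # smoothing choice so words found in every document keep some weight.
        weights = {
            word: count * (math.log(n_docs / doc_freq[word]) + 1.0)
            for word, count in term_freq.items()
        }
        # Guard against an empty document, whose norm would be zero.
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({word: w / norm for word, w in weights.items()})
    return vectors


def cosine_similarity(vec_a, vec_b):
    """Dot product of two already-normalised sparse vectors."""
    return sum(weight * vec_b.get(word, 0.0) for word, weight in vec_a.items())


# Hypothetical sample documents, loosely echoing post titles on this page.
docs = [
    "how to interpret the p-value for a population",
    "interpret standard errors for the entire population",
    "cheap flights and train tickets",
]
vectors = tfidf_vectors(docs)
for index, vector in enumerate(vectors):
    print(index, round(cosine_similarity(vectors[0], vector), 3))
```

Under this scheme a document compared with itself scores 1, and documents that share no words score 0, which is consistent with the same-blog entry sitting at the top of the tfidf list with a value of 1.0.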

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1511 andrew gelman stats-2012-09-26-What do statistical p-values mean when the sample = the population?


2 0.16024143 972 andrew gelman stats-2011-10-25-How do you interpret standard errors from a regression fit to the entire population?

Introduction: David Radwin asks a question which comes up fairly often in one form or another: How should one respond to requests for statistical hypothesis tests for population (or universe) data? I [Radwin] first encountered this issue as an undergraduate when a professor suggested a statistical significance test for my paper comparing roll call votes between freshman and veteran members of Congress. Later I learned that such tests apply only to samples because their purpose is to tell you whether the difference in the observed sample is likely to exist in the population. If you have data for the whole population, like all members of the 103rd House of Representatives, you do not need a test to discern the true difference in the population. Sometimes researchers assume some sort of superpopulation like “all possible Congresses” or “Congresses across all time” and that the members of any given Congress constitute a sample. In my current work in education research, it is sometimes asserted t

3 0.1150433 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo

Introduction: I sent Deborah Mayo a link to my paper with Cosma Shalizi on the philosophy of statistics, and she sent me the link to this conference which unfortunately already occurred. (It’s too bad, because I’d have liked to have been there.) I summarized my philosophy as follows: I am highly sympathetic to the approach of Lakatos (or of Popper, if you consider Lakatos’s “Popper_2″ to be a reasonable simulation of the true Popperism), in that (a) I view statistical models as being built within theoretical structures, and (b) I see the checking and refutation of models to be a key part of scientific progress. A big problem I have with mainstream Bayesianism is its “inductivist” view that science can operate completely smoothly with posterior updates: the idea that new data causes us to increase the posterior probability of good models and decrease the posterior probability of bad models. I don’t buy that: I see models as ever-changing entities that are flexible and can be patched and ex

4 0.10998408 764 andrew gelman stats-2011-06-14-Examining US Legislative process with “Many Bills”

Introduction: This is Many Bills, a visualization of US bills by IBM: I learned about it a few days ago from Irene Ros at Foo Camp. It definitely looks better than my own analysis of US Senate bills.

5 0.10019898 2248 andrew gelman stats-2014-03-15-Problematic interpretations of confidence intervals

Introduction: Rink Hoekstra writes: A couple of months ago, you were visiting the University of Groningen, and after the talk you gave there I spoke briefly with you about a study that I conducted with Richard Morey, Jeff Rouder and Eric-Jan Wagenmakers. In the study, we found that researchers’ knowledge of how to interpret a confidence interval (CI) was almost as limited as the knowledge of students who had had no inferential statistics course yet. Our manuscript was recently accepted for publication in Psychonomic Bulletin & Review, and it’s now available online (see e.g., here). Maybe it’s interesting to discuss on your blog, especially since CIs are often promoted (for example in the new guidelines of Psychological Science), but apparently researchers seem to have little idea how to interpret them. Given that the confidence percentage of a CI tells something about the procedure rather than about the data at hand, this might be understandable, but, according to us, it’s problematic neve

6 0.093528152 302 andrew gelman stats-2010-09-28-This is a link to a news article about a scientific paper

7 0.08910577 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

8 0.08727321 352 andrew gelman stats-2010-10-19-Analysis of survey data: Design based models vs. hierarchical modeling?

9 0.080378622 586 andrew gelman stats-2011-02-23-A statistical version of Arrow’s paradox

10 0.079210967 405 andrew gelman stats-2010-11-10-Estimation from an out-of-date census

11 0.075766891 2364 andrew gelman stats-2014-06-08-Regression and causality and variable ordering

12 0.073299885 2292 andrew gelman stats-2014-04-15-When you believe in things that you don’t understand

13 0.069218978 2351 andrew gelman stats-2014-05-28-Bayesian nonparametric weighted sampling inference

14 0.069117442 1149 andrew gelman stats-2012-02-01-Philosophy of Bayesian statistics: my reactions to Cox and Mayo

15 0.068969265 1156 andrew gelman stats-2012-02-06-Bayesian model-building by pure thought: Some principles and examples

16 0.068357572 1876 andrew gelman stats-2013-05-29-Another one of those “Psychological Science” papers (this time on biceps size and political attitudes among college students)

17 0.067016229 7 andrew gelman stats-2010-04-27-Should Mister P be allowed-encouraged to reside in counter-factual populations?

18 0.066967681 50 andrew gelman stats-2010-05-25-Looking for Sister Right

19 0.066011459 1754 andrew gelman stats-2013-03-08-Cool GSS training video! And cumulative file 1972-2012!

20 0.065798186 2312 andrew gelman stats-2014-04-29-Ken Rice presents a unifying approach to statistical inference and hypothesis testing


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.119), (1, 0.024), (2, 0.042), (3, -0.016), (4, -0.015), (5, 0.017), (6, -0.033), (7, 0.013), (8, 0.015), (9, -0.041), (10, 0.02), (11, -0.035), (12, 0.013), (13, 0.003), (14, 0.011), (15, 0.035), (16, -0.013), (17, -0.012), (18, 0.003), (19, 0.028), (20, -0.006), (21, 0.03), (22, -0.008), (23, 0.002), (24, -0.03), (25, -0.018), (26, 0.028), (27, -0.007), (28, 0.031), (29, 0.027), (30, 0.023), (31, 0.009), (32, -0.004), (33, -0.005), (34, -0.017), (35, 0.003), (36, -0.006), (37, 0.008), (38, -0.0), (39, 0.023), (40, 0.006), (41, -0.03), (42, 0.009), (43, -0.026), (44, -0.027), (45, -0.0), (46, 0.027), (47, -0.002), (48, 0.002), (49, 0.017)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96686333 1511 andrew gelman stats-2012-09-26-What do statistical p-values mean when the sample = the population?


2 0.75596714 70 andrew gelman stats-2010-06-07-Mister P goes on a date

Introduction: I recently wrote something on the much-discussed OK Cupid analysis of political attitudes of a huge sample of people in their dating database. My quick comment was that their analysis was interesting, but participants on an online dating site must certainly be far from a random sample of Americans. But suppose I want to not just criticize but also think in a positive direction. OK Cupid’s database is huge, and one thing statistical methods are good at–Bayesian methods in particular–is combining a huge amount of noisy, biased data with a smaller amount of good data. This is what we did in our radon study, using a high-quality survey of 5000 houses in 125 counties to calibrate a set of crappier surveys totaling 80,000 houses in 3000 counties. How would it work for OK Cupid? We’d want to take their data and poststratify on: Age Sex Marital/family status Education Income Partisanship Ideology Political participation Religion and religious attendance State Urban/rural/

3 0.73432672 1940 andrew gelman stats-2013-07-16-A poll that throws away data???

Introduction: Mark Blumenthal writes: What do you think about the “random rejection” method used by PPP that was attacked at some length today by a Republican pollster? Our just published post on the debate includes all the details as I know them. The Storify of Martino’s tweets has some additional data tables linked to toward the end. Also, more specifically, setting aside Martino’s suggestion of manipulation (which is also quite possible with post-stratification weights), would the PPP method introduce more potential random error than weighting? From Blumenthal’s blog: B.J. Martino, a senior vice president at the Republican polling firm The Tarrance Group, went on a 30-minute Twitter rant on Tuesday questioning the unorthodox method used by PPP [Public Policy Polling] to select samples and weight data: “Looking at @ppppolls new VA SW. Wondering how many interviews they discarded to get down to 601 completes? Because @ppppolls discards a LOT of interviews. Of 64,811 conducted

4 0.72210741 777 andrew gelman stats-2011-06-23-Combining survey data obtained using different modes of sampling

Introduction: I’m involved (with Irv Garfinkel and others) in a planned survey of New York City residents. It’s hard to reach people in the city–not everyone will answer their mail or phone, and you can’t send an interviewer door-to-door in a locked apartment building. (I think it violates IRB to have a plan of pushing all the buzzers by the entrance and hoping someone will let you in.) So the plan is to use multiple modes, including phone, in person household, random street intercepts and mail. The question then is how to combine these samples. My suggested approach is to divide the population into poststrata based on various factors (age, ethnicity, family type, housing type, etc), then to pool responses within each poststratum, then to run some regressions including poststrata and also indicators for mode, to understand how respondents from different modes differ, after controlling for the demographic/geographic adjustments. Maybe this has already been done and written up somewhere? P.

5 0.71142489 972 andrew gelman stats-2011-10-25-How do you interpret standard errors from a regression fit to the entire population?


6 0.71024007 405 andrew gelman stats-2010-11-10-Estimation from an out-of-date census

7 0.68780869 136 andrew gelman stats-2010-07-09-Using ranks as numbers

8 0.6816597 1725 andrew gelman stats-2013-02-17-“1.7%” ha ha ha

9 0.68037468 142 andrew gelman stats-2010-07-12-God, Guns, and Gaydar: The Laws of Probability Push You to Overestimate Small Groups

10 0.68003219 404 andrew gelman stats-2010-11-09-“Much of the recent reported drop in interstate migration is a statistical artifact”

11 0.6744138 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

12 0.67217374 368 andrew gelman stats-2010-10-25-Is instrumental variables analysis particularly susceptible to Type M errors?

13 0.67130369 352 andrew gelman stats-2010-10-19-Analysis of survey data: Design based models vs. hierarchical modeling?

14 0.66178334 1681 andrew gelman stats-2013-01-19-Participate in a short survey about the weight of evidence provided by statistics

15 0.66103107 1409 andrew gelman stats-2012-07-08-Is linear regression unethical in that it gives more weight to cases that are far from the average?

16 0.66093767 2167 andrew gelman stats-2014-01-10-Do you believe that “humans and other living things have evolved over time”?

17 0.65919155 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys

18 0.65358591 1289 andrew gelman stats-2012-04-29-We go to war with the data we have, not the data we want

19 0.65200323 2152 andrew gelman stats-2013-12-28-Using randomized incentives as an instrument for survey nonresponse?

20 0.65101165 2295 andrew gelman stats-2014-04-18-One-tailed or two-tailed?


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(0, 0.018), (9, 0.029), (16, 0.151), (21, 0.036), (24, 0.086), (29, 0.034), (73, 0.188), (86, 0.044), (95, 0.014), (99, 0.279)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.95406485 794 andrew gelman stats-2011-07-09-The quest for the holy graph

Introduction: Eytan Adar writes: I was just going through the latest draft of your paper with Anthony Unwin. I heard part of it at the talk you gave (remotely) here at UMich. I’m curious about your discussion of the Baby Name Voyager. The tool in itself is simple, attractive, and useful. No argument from me there. It’s an awesome demonstration of how subtle interactions can be very helpful (click and it zooms, type and it filters… falls perfectly into the Shneiderman visualization mantra). It satisfies a very common use case: finding appropriate names for children. That said, I can’t help but feeling that what you are really excited about is the very static analysis on last letters (you spend most of your time on this). This analysis, incidentally, is not possible to infer from the interactive application (which doesn’t support this type of filtering and pivoting). In a sense, the two visualizations don’t have anything to do with each other (other than a shared context/dataset).

2 0.93801039 655 andrew gelman stats-2011-04-10-“Versatile, affordable chicken has grown in popularity”

Introduction: Awhile ago I was cleaning out the closet and found some old unread magazines. Good stuff. As we’ve discussed before, lots of things are better read a few years late. Today I was reading the 18 Nov 2004 issue of the London Review of Books, which contained (among other things) the following: - A review by Jenny Diski of a biography of Stanley Milgram. Diski appears to want to debunk: Milgram was a whiz at devising sexy experiments, but barely interested in any theoretical basis for them. They all have the same instant attractiveness of style, and then an underlying emptiness. Huh? Michael Jordan couldn’t hit the curveball and he was reportedly an easy mark for golf hustlers but that doesn’t diminish his greatness on the basketball court. She also criticizes Milgram for being “no help at all” for solving international disputes. OK, fine. I haven’t solved any international disputes either. Milgram, though, . . . he conducted an imaginative experiment whose results stu

3 0.93719596 1925 andrew gelman stats-2013-07-04-“Versatile, affordable chicken has grown in popularity”

Introduction: From two years ago: Awhile ago I was cleaning out the closet and found some old unread magazines. Good stuff. As we’ve discussed before, lots of things are better read a few years late. Today I was reading the 18 Nov 2004 issue of the London Review of Books, which contained (among other things) the following: - A review by Jenny Diski of a biography of Stanley Milgram. Diski appears to want to debunk: Milgram was a whiz at devising sexy experiments, but barely interested in any theoretical basis for them. They all have the same instant attractiveness of style, and then an underlying emptiness. Huh? Michael Jordan couldn’t hit the curveball and he was reportedly an easy mark for golf hustlers but that doesn’t diminish his greatness on the basketball court. She also criticizes Milgram for being “no help at all” for solving international disputes. OK, fine. I haven’t solved any international disputes either. Milgram, though, . . . he conducted an imaginative exp

4 0.93405402 497 andrew gelman stats-2011-01-02-Hipmunk update

Introduction: Florence from customer support at Hipmunk writes: Hipmunk now includes American Airlines in our search results. Please note that users will be taken directly to AA.com to complete the booking/transaction. . . . we are steadily increasing the number of flights that we offer on Hipmunk. As you may recall, Hipmunk is a really cool flight-finder that didn’t actually work (as of 16 Sept 2010). At the time, I was a bit annoyed at the NYT columnist who plugged Hipmunk without actually telling his readers that the site didn’t actually do the job. (I discovered the problem myself because I couldn’t believe that my flight options to Raleigh-Durham were really so meager, so I checked on Expedia and found a good flight.) I do think Hipmunk’s graphics are beautiful, though, so I’m rooting for them to catch up. P.S. Apparently they include Amtrak Northeast Corridor trains, so I’ll give them a try, next time I travel. The regular Amtrak website is about as horrible as you’d expect.

same-blog 5 0.92731875 1511 andrew gelman stats-2012-09-26-What do statistical p-values mean when the sample = the population?


6 0.92509151 161 andrew gelman stats-2010-07-24-Differences in color perception by sex, also the Bechdel test for women in movies

7 0.92099988 1748 andrew gelman stats-2013-03-04-PyStan!

8 0.9209733 2238 andrew gelman stats-2014-03-09-Hipmunk worked

9 0.89982569 7 andrew gelman stats-2010-04-27-Should Mister P be allowed-encouraged to reside in counter-factual populations?

10 0.89051068 2020 andrew gelman stats-2013-09-12-Samplers for Big Science: emcee and BAT

11 0.87967914 1099 andrew gelman stats-2012-01-05-Approaching harmonic convergence

12 0.87736392 917 andrew gelman stats-2011-09-20-Last post on Hipmunk

13 0.87540805 280 andrew gelman stats-2010-09-16-Meet Hipmunk, a really cool flight-finder that doesn’t actually work

14 0.87115526 496 andrew gelman stats-2011-01-01-Tukey’s philosophy

15 0.86036623 798 andrew gelman stats-2011-07-12-Sometimes a graph really is just ugly

16 0.85748529 722 andrew gelman stats-2011-05-20-Why no Wegmania?

17 0.85522723 159 andrew gelman stats-2010-07-23-Popular governor, small state

18 0.85478717 564 andrew gelman stats-2011-02-08-Different attitudes about parenting, possibly deriving from different attitudes about self

19 0.85436761 1712 andrew gelman stats-2013-02-07-Philosophy and the practice of Bayesian statistics (with all the discussions!)

20 0.8538776 55 andrew gelman stats-2010-05-27-In Linux, use jags() to call Jags instead of using bugs() to call OpenBugs