
556 andrew gelman stats-2011-02-04-Patterns

meta infos for this blog

Source: html

Introduction: Pete Gries writes: I [Gries] am not sure if what you are suggesting by “doing data analysis in a patternless way” is a pitch for deductive over inductive approaches as a solution to the problem of reporting and publication bias. If so, I may somewhat disagree. A constant quest to prove or disprove theory in a deductive manner is one of the primary causes of both reporting and publication bias. I’m actually becoming a proponent of a remarkably non-existent species – “applied political science” – because there is so much animosity in our discipline to inductive empirical statistical work that seeks to answer real world empirical questions rather than contribute to parsimonious theory building. Anyone want to start a JAPS – Journal of Applied Political Science? Our discipline is in danger of irrelevance. My reply: By “doing data analysis in a patternless way,” I meant statistical methods such as least squares, maximum likelihood, etc., that estimate parameters independently witho

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Pete Gries writes: I [Gries] am not sure if what you are suggesting by “doing data analysis in a patternless way” is a pitch for deductive over inductive approaches as a solution to the problem of reporting and publication bias. [sent-1, score-1.528]

2 A constant quest to prove or disprove theory in a deductive manner is one of the primary causes of both reporting and publication bias. [sent-3, score-1.417]

3 My reply: By “doing data analysis in a patternless way,” I meant statistical methods such as least squares, maximum likelihood, etc. [sent-7, score-0.532]

4 , that estimate parameters independently without recognizing the constraints and relationships between them. [sent-8, score-0.585]

5 If you estimate each study on its own, without reference to all the other work being done in the same field, then you’re depriving yourself of a lot of information and inviting noisy estimates and, in particular, overestimates of small effects. [sent-9, score-0.751]
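The point in sentence 5 can be illustrated with a small simulation. This is only a minimal sketch, not the method from the post: it assumes a hypothetical normal-normal setup in which true effects are drawn around zero with a known standard deviation `tau`, each study measures its effect with a known standard error `sigma`, and the "shared information" is simply that common prior. All names and numbers here are made up for illustration.

```python
import random


def simulate(n_studies=2000, tau=0.1, sigma=0.5, seed=0):
    """Compare study-by-study estimates with shrunken estimates.

    tau   : assumed sd of the true effects across studies (hypothetical)
    sigma : assumed standard error of each study's estimate (hypothetical)
    """
    rng = random.Random(seed)
    # Small true effects, one per study.
    true_effects = [rng.gauss(0.0, tau) for _ in range(n_studies)]
    # "Patternless" analysis: each study's noisy estimate taken on its own.
    raw = [rng.gauss(t, sigma) for t in true_effects]
    # Normal-normal posterior mean with prior mean 0: shrink toward zero.
    shrink = tau**2 / (tau**2 + sigma**2)
    pooled = [shrink * y for y in raw]

    def mse(estimates):
        return sum((e - t) ** 2 for e, t in zip(estimates, true_effects)) / n_studies

    def mean_abs(values):
        return sum(abs(v) for v in values) / len(values)

    return {
        "mse_raw": mse(raw),
        "mse_pooled": mse(pooled),
        "mean_abs_true": mean_abs(true_effects),
        "mean_abs_raw": mean_abs(raw),
    }


result = simulate()
mse_raw = result["mse_raw"]
mse_pooled = result["mse_pooled"]
print(f"MSE, each study on its own: {mse_raw:.4f}")
print(f"MSE, shrunk toward shared mean: {mse_pooled:.4f}")
print(f"mean |true effect|: {result['mean_abs_true']:.3f}")
print(f"mean |raw estimate|: {result['mean_abs_raw']:.3f}")
```

Under these assumptions the stand-alone estimates are, on average, much larger in magnitude than the true effects, and shrinking them toward the shared mean reduces the error. In real work `tau` would have to be estimated, for example with a hierarchical model, rather than assumed known.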

similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('gries', 0.344), ('patternless', 0.295), ('deductive', 0.265), ('inductive', 0.258), ('discipline', 0.221), ('animosity', 0.157), ('disprove', 0.157), ('depriving', 0.148), ('reporting', 0.146), ('empirical', 0.144), ('proponent', 0.141), ('parsimonious', 0.141), ('quest', 0.141), ('publication', 0.138), ('inviting', 0.132), ('pete', 0.129), ('remarkably', 0.123), ('danger', 0.123), ('pitch', 0.123), ('overestimates', 0.123), ('seeks', 0.123), ('independently', 0.111), ('species', 0.108), ('theory', 0.106), ('applied', 0.105), ('recognizing', 0.102), ('relationships', 0.099), ('contribute', 0.098), ('manner', 0.098), ('estimate', 0.097), ('squares', 0.097), ('prove', 0.096), ('constant', 0.095), ('becoming', 0.095), ('constraints', 0.094), ('noisy', 0.09), ('causes', 0.09), ('maximum', 0.089), ('suggesting', 0.088), ('primary', 0.085), ('political', 0.084), ('without', 0.082), ('meant', 0.081), ('reference', 0.079), ('approaches', 0.077), ('science', 0.076), ('somewhat', 0.074), ('solution', 0.071), ('likelihood', 0.069), ('analysis', 0.067)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 556 andrew gelman stats-2011-02-04-Patterns

2 0.13922963 614 andrew gelman stats-2011-03-15-Induction within a model, deductive inference for model evaluation

Introduction: Jonathan Livengood writes: I have a couple of questions on your paper with Cosma Shalizi on “Philosophy and the practice of Bayesian statistics.” First, you distinguish between inductive approaches and hypothetico-deductive approaches to inference and locate statistical practice (at least, the practice of model building and checking) on the hypothetico-deductive side. Do you think that there are any interesting elements of statistical practice that are properly inductive? For example, suppose someone is playing around with a system that more or less resembles a toy model, like drawing balls from an urn or some such, and where the person has some well-defined priors. The person makes a number of draws from the urn and applies Bayes theorem to get a posterior. On your view, is that person making an induction? If so, how much space is there in statistical practice for genuine inductions like this? Second, I agree with you that one ought to distinguish induction from other kind

3 0.10711683 466 andrew gelman stats-2010-12-13-“The truth wears off: Is there something wrong with the scientific method?”

Introduction: Gur Huberman asks what I think of this magazine article by Jonah Lehrer (see also here). My reply is that it reminds me a bit of what I wrote here. Or see here for the quick powerpoint version: The short story is that if you screen for statistical significance when estimating small effects, you will necessarily overestimate the magnitudes of effects, sometimes by a huge amount. I know that Dave Krantz has thought about this issue for a while; it came up when Francis Tuerlinckx and I wrote our paper on Type S errors, ten years ago. My current thinking is that most (almost all?) research studies of the sort described by Lehrer should be accompanied by retrospective power analyses, or informative Bayesian inferences. Either of these approaches–whether classical or Bayesian, the key is that they incorporate real prior information, just as is done in a classical prospective power analysis–would, I think, moderate the tendency to overestimate the magnitude of effects. In answ

4 0.1051507 2007 andrew gelman stats-2013-09-03-Popper and Jaynes

Introduction: Deborah Mayo quotes me as saying, “Popper has argued (convincingly, in my opinion) that scientific inference is not inductive but deductive.” She then follows up with: Gelman employs significance test-type reasoning to reject a model when the data sufficiently disagree. Now, strictly speaking, a model falsification, even to inferring something as weak as “the model breaks down,” is not purely deductive, but Gelman is right to see it as about as close as one can get, in statistics, to a deductive falsification of a model. But where does that leave him as a Jaynesian? My reply: I was influenced by reading a toy example from Jaynes’s book where he sets up a model (for the probability of a die landing on each of its six sides) based on first principles, then presents some data that contradict the model, then expands the model. I’d seen very little of this sort of reasoning before in statistics! In physics it’s the standard way to go: you set up a model based on physic

5 0.099832483 1750 andrew gelman stats-2013-03-05-Watership Down, thick description, applied statistics, immutability of stories, and playing tennis with a net

Introduction: For the past several months I’ve been circling around and around some questions related to the issue of how we build trust in statistical methods and statistical results. There are lots of examples but let me start with my own career. My most cited publications are my books and my methods papers, but I think that much of my credibility as a statistical researcher comes from my applied work. It somehow matters, I think, when judging my statistical work, that I’ve done (and continue to do) real research in social and environmental science. Why is this? It’s not just that my applied work gives me good examples for my textbooks. It’s also that the applied work motivated the new methods. Most of the successful theory and methods that my collaborators and I have developed, we developed in the context of trying to solve active applied problems. We weren’t trying to shave a half a point off the predictive error in the Boston housing data; rather, we were attacking new problems that we

6 0.099468559 1096 andrew gelman stats-2012-01-02-Graphical communication for legal scholarship

7 0.098979428 1291 andrew gelman stats-2012-04-30-Systematic review of publication bias in studies on publication bias

8 0.087526344 2263 andrew gelman stats-2014-03-24-Empirical implications of Empirical Implications of Theoretical Models

9 0.082526207 998 andrew gelman stats-2011-11-08-Bayes-Godel

10 0.082104906 2097 andrew gelman stats-2013-11-11-Why ask why? Forward causal inference and reverse causal questions

11 0.076838352 779 andrew gelman stats-2011-06-25-Avoiding boundary estimates using a prior distribution as regularization

12 0.074970774 957 andrew gelman stats-2011-10-14-Questions about a study of charter schools

13 0.07457079 33 andrew gelman stats-2010-05-14-Felix Salmon wins the American Statistical Association’s Excellence in Statistical Reporting Award

14 0.073951505 247 andrew gelman stats-2010-09-01-How does Bayes do it?

15 0.07257165 1652 andrew gelman stats-2013-01-03-“The Case for Inductive Theory Building”

16 0.072409473 32 andrew gelman stats-2010-05-14-Causal inference in economics

17 0.071651824 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

18 0.071252115 1560 andrew gelman stats-2012-11-03-Statistical methods that work in some settings but not others

19 0.07044553 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?

20 0.070242211 524 andrew gelman stats-2011-01-19-Data exploration and multiple comparisons

similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.133), (1, 0.036), (2, 0.001), (3, -0.073), (4, -0.031), (5, 0.013), (6, -0.052), (7, -0.011), (8, -0.013), (9, 0.01), (10, 0.01), (11, -0.009), (12, -0.003), (13, 0.02), (14, -0.03), (15, -0.013), (16, -0.04), (17, 0.001), (18, 0.005), (19, -0.007), (20, -0.006), (21, -0.032), (22, 0.009), (23, 0.036), (24, 0.022), (25, 0.008), (26, 0.031), (27, 0.011), (28, -0.018), (29, -0.027), (30, 0.008), (31, -0.01), (32, 0.023), (33, -0.02), (34, 0.008), (35, -0.001), (36, -0.04), (37, 0.003), (38, 0.018), (39, -0.032), (40, 0.026), (41, 0.028), (42, -0.039), (43, 0.03), (44, -0.002), (45, 0.012), (46, 0.015), (47, 0.013), (48, 0.001), (49, -0.013)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96368015 556 andrew gelman stats-2011-02-04-Patterns

2 0.70929587 744 andrew gelman stats-2011-06-03-Statistical methods for healthcare regulation: rating, screening and surveillance

Introduction: Here is my discussion of a recent article by David Spiegelhalter, Christopher Sherlaw-Johnson, Martin Bardsley, Ian Blunt, Christopher Wood and Olivia Grigg, that is scheduled to appear in the Journal of the Royal Statistical Society: I applaud the authors’ use of a mix of statistical methods to attack an important real-world problem. Policymakers need results right away, and I admire the authors’ ability and willingness to combine several different modeling and significance testing ideas for the purposes of rating and surveillance. That said, I am uncomfortable with the statistical ideas here, for three reasons. First, I feel that the proposed methods, centered as they are around data manipulation and corrections for uncertainty, have serious defects compared to a more model-based approach. My problem with methods based on p-values and z-scores–however they happen to be adjusted–is that they draw discussion toward error rates, sequential analysis, and other technical statistical

3 0.69026822 32 andrew gelman stats-2010-05-14-Causal inference in economics

Introduction: Aaron Edlin points me to this issue of the Journal of Economic Perspectives that focuses on statistical methods for causal inference in economics. (Michael Bishop’s page provides some links.) To quickly summarize my reactions to Angrist and Pischke’s book: I pretty much agree with them that the potential-outcomes or natural-experiment approach is the most useful way to think about causality in economics and related fields. My main amendments to Angrist and Pischke would be to recognize that: 1. Modeling is important, especially modeling of interactions. It’s unfortunate to see a debate between experimentalists and modelers. Some experimenters (not Angrist and Pischke) make the mistake of avoiding models: Once they have their experimental data, they check their brains at the door and do nothing but simple differences, not realizing how much more can be learned. Conversely, some modelers are unduly dismissive of experiments and formal observational studies, forgetting t

4 0.69005352 789 andrew gelman stats-2011-07-07-Descriptive statistics, causal inference, and story time

Introduction: Dave Backus points me to this review by anthropologist Mike McGovern of two books by economist Paul Collier on the politics of economic development in Africa. My first reaction was that this was interesting but non-statistical so I’d have to either post it on the sister blog or wait until the 30 days of statistics was over. But then I looked more carefully and realized that this discussion is very relevant to applied statistics. Here’s McGovern’s substantive critique: Much of the fundamental intellectual work in Collier’s analyses is, in fact, ethnographic. Because it is not done very self-consciously and takes place within a larger econometric rhetoric in which such forms of knowledge are dismissed as “subjective” or worse still biased by the political (read “leftist”) agendas of the academics who create them, it is often ethnography of a low quality. . . . Despite the adoption of a Naipaulian unsentimental-dispatches-from-the-trenches rhetoric, the story told in Collier’s

5 0.68293792 769 andrew gelman stats-2011-06-15-Mr. P by another name . . . is still great!

Introduction: Brendan Nyhan points me to this from Don Taylor: Can national data be used to estimate state-level results? . . . A challenge is the fact that the sample size in many states is very small . . . Richard [Gonzales] used a regression approach to extrapolate this information to provide a state-level support for health reform: To get around the challenge presented by small sample sizes, the model presented here combines the benefits of incorporating auxiliary demographic information about the states with the hierarchical modeling approach commonly used in small area estimation. The model is designed to “shrink” estimates toward the average level of support in the region when there are few observations available, while simultaneously adjusting for the demographics and political ideology in the state. This approach therefore takes fuller advantage of all information available in the data to estimate state-level public opinion. This is a great idea, and it is already being used al

6 0.68142909 785 andrew gelman stats-2011-07-02-Experimental reasoning in social science

7 0.68104666 1750 andrew gelman stats-2013-03-05-Watership Down, thick description, applied statistics, immutability of stories, and playing tennis with a net

8 0.67945158 757 andrew gelman stats-2011-06-10-Controversy over the Christakis-Fowler findings on the contagion of obesity

9 0.67416477 756 andrew gelman stats-2011-06-10-Christakis-Fowler update

10 0.67405158 33 andrew gelman stats-2010-05-14-Felix Salmon wins the American Statistical Association’s Excellence in Statistical Reporting Award

11 0.66139024 309 andrew gelman stats-2010-10-01-Why Development Economics Needs Theory?

12 0.65669191 1878 andrew gelman stats-2013-05-31-How to fix the tabloids? Toward replicable social science research

13 0.65569222 1889 andrew gelman stats-2013-06-08-Using trends in R-squared to measure progress in criminology??

14 0.64954054 960 andrew gelman stats-2011-10-15-The bias-variance tradeoff

15 0.64935446 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses

16 0.64735472 2179 andrew gelman stats-2014-01-20-The AAA Tranche of Subprime Science

17 0.64702421 1944 andrew gelman stats-2013-07-18-You’ll get a high Type S error rate if you use classical statistical methods to analyze data from underpowered studies

18 0.64583856 511 andrew gelman stats-2011-01-11-One more time on that ESP study: The problem of overestimates and the shrinkage solution

19 0.64461702 1441 andrew gelman stats-2012-08-02-“Based on my experiences, I think you could make general progress by constructing a solution to your specific problem.”

20 0.64284521 601 andrew gelman stats-2011-03-05-Against double-blind reviewing: Political science and statistics are not like biology and physics

similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.028), (16, 0.047), (21, 0.048), (22, 0.028), (24, 0.148), (27, 0.015), (45, 0.024), (49, 0.032), (76, 0.013), (77, 0.031), (81, 0.222), (86, 0.01), (99, 0.253)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97196013 915 andrew gelman stats-2011-09-17-(Worst) graph of the year

Introduction: This (forwarded to me from Jeff, from a powerpoint by William Gawthrop) wins not on form but on content: Really this graph should stand alone but it’s so wonderful that I can’t resist pointing out a few things: - The gap between 610 and 622 A.D. seems to be about the same as the previous 600 years, and only a little less than the 1400 years before that. - “Pious and devout” Jews are portrayed as having steadily increased in nonviolence up to the present day. Been to Israel lately? - I assume the line labeled “Bible” is referring to Christians? I’m sort of amazed to see pious and devout Christians listed as being maximally violent at the beginning. Huh? I thought Christ was supposed to be a nonviolent, mellow dude. The line starts at 3 B.C., implying that baby Jesus was at the extreme of violence. Going forward, we can learn from the graph that pious and devout Christians in 1492 or 1618, say, were much more peaceful than Jesus and his crew. - Most amusingly g

2 0.95238376 552 andrew gelman stats-2011-02-03-Model Makers’ Hippocratic Oath

Introduction: Emanuel Derman and Paul Wilmott wonder how to get their fellow modelers to give up their fantasy of perfection. In a Business Week article they proposed, not entirely in jest, a model makers’ Hippocratic Oath: I will remember that I didn’t make the world and that it doesn’t satisfy my equations. Though I will use models boldly to estimate value, I will not be overly impressed by mathematics. I will never sacrifice reality for elegance without explaining why I have done so. Nor will I give the people who use my model false comfort about its accuracy. Instead, I will make explicit its assumptions and oversights. I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension. Found via Abductive Intelligence.

3 0.94429618 1762 andrew gelman stats-2013-03-13-“I have no idea who Catalina Garcia is, but she makes a decent ruler”: I don’t know if John Lee “little twerp” Anderson actually suffers from tall-person syndrome, but he is indeed tall

Introduction: I just want to share with you the best comment we’ve ever had in the nearly ten-year history of this blog. Also it has statistical content! Here’s the story. After seeing an amusing article by Tom Scocca relating how reporter John Lee Anderson called someone a “little twerp” on twitter: I conjectured that Anderson suffered from “tall person syndrome,” that problem that some people of above-average height have, that they think they’re more important than other people because they literally look down on them. But I had no idea of Anderson’s actual height. Commenter Gary responded with this impressive bit of investigative reporting: Based on this picture: he appears to be fairly tall. But the perspective makes it hard to judge. Based on this picture: he appears to be about 9-10 inches taller than Catalina Garcia. But how tall is Catalina Garcia? Not that tall – she’s shorter than the high-wire artist Phillipe Petit: And he doesn’t appear

4 0.923226 1632 andrew gelman stats-2012-12-20-Who exactly are those silly academics who aren’t as smart as a Vegas bookie?

Introduction: I get suspicious when I hear unsourced claims that unnamed experts somewhere are making foolish statements. For example, I recently came across this, from a Super Bowl-themed article from 2006 by Stephen Dubner and Steven Levitt: As it happens, there is one betting strategy that will routinely beat a bookie, and you don’t even have to be smart to use it. One of the most undervalued N.F.L. bets is the home underdog — a team favored to lose but playing in its home stadium. If you had bet $5,000 on the home underdog in every N.F.L. game over the past two decades, you would be up about $150,000 by now (a winning rate of roughly 53 percent). So far, so good. I wonder if this pattern still holds. But then Dubner and Levitt continue: This fact has led some academics to conclude that bookmakers simply aren’t very smart. If an academic researcher can find this loophole, shouldn’t a professional bookie be able to? But the fact is most bookies are doing just fine. So could it be

5 0.91484016 484 andrew gelman stats-2010-12-24-Foreign language skills as an intrinsic good; also, beware the tyranny of measurement

Introduction: This link on education reform sends me to this blog on foreign languages in Canadian public schools: The demand for French immersion education in Vancouver so far outstrips the supply that the school board allocates places by lottery. But why? Is it because French is a useful employment skill? Because learning to speak French makes you a better person? Or is it because parents know intuitively what economists can show econometrically: peer effects matter. Being with high achieving peers raises a student’s own achievement level. . . . Several studies have found that Anglophones who can speak French enjoy an earning premium. The question is: do bilingual Anglophones earn more because speaking French is a valuable skill in the workplace? Or do they earn more because they’re on average smarter and more capable people (after all, they’ve mastered two languages)? And the blog features comments like this: French immersion classes (as opposed to science, maths or any

same-blog 6 0.90611124 556 andrew gelman stats-2011-02-04-Patterns

7 0.90018725 849 andrew gelman stats-2011-08-11-The Reliability of Cluster Surveys of Conflict Mortality: Violent Deaths and Non-Violent Deaths

8 0.89830017 1962 andrew gelman stats-2013-07-30-The Roy causal model?

9 0.8904748 1129 andrew gelman stats-2012-01-20-Bugs Bunny, the governor of Massachusetts, the Dow 36,000 guy, presidential qualifications, and Peggy Noonan

10 0.8892982 1222 andrew gelman stats-2012-03-20-5 books book

11 0.88371021 1033 andrew gelman stats-2011-11-28-Greece to head statistician: Tell the truth, go to jail

12 0.87691057 1705 andrew gelman stats-2013-02-04-Recently in the sister blog

13 0.86886168 858 andrew gelman stats-2011-08-17-Jumping off the edge of the world

14 0.86522406 1321 andrew gelman stats-2012-05-15-A statistical research project: Weeding out the fraudulent citations

15 0.83962053 658 andrew gelman stats-2011-04-11-Statistics in high schools: Towards more accessible conceptions of statistical inference

16 0.83120203 1057 andrew gelman stats-2011-12-14-Hey—I didn’t know that!

17 0.81766009 1759 andrew gelman stats-2013-03-12-How tall is Jon Lee Anderson?

18 0.81367171 145 andrew gelman stats-2010-07-13-Statistical controversy regarding human rights violations in Colomnbia

19 0.81055886 2002 andrew gelman stats-2013-08-30-Blogging

20 0.81018025 2088 andrew gelman stats-2013-11-04-Recently in the sister blog