andrew_gelman_stats andrew_gelman_stats-2010 andrew_gelman_stats-2010-147 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: On statisticians and statistical software: Statisticians are particularly sensitive to default settings, which makes sense considering that statistics is, in many ways, a science based on defaults. What is a “statistical method” if not a recommended default analysis, backed up by some combination of theory and experience?
simIndex simValue blogId blogTitle
same-blog 1 1.0 147 andrew gelman stats-2010-07-15-Quote of the day: statisticians and defaults
Introduction: Statistics is the science of defaults. One of the differences between statistics and other branches of engineering is that we have a special love for default procedures, perhaps because so many statistical problems are routine (or, at least, people would like them to be). We have standard estimates for all sorts of models, books of statistical tests, and default settings for everything. Recently I’ve been working on default weakly informative priors (which are not the same as the typically noninformative “reference priors” of the Bayesian literature). From a Bayesian point of view, the appropriate default procedure could be defined as that which is appropriate for the population of problems that one might be studying. More generally, much of our job as statisticians is to come up with methods that will be used by others in routine practice. (Much of the rest of our job is to come up with methods for evaluating new and existing statistical methods, and methods for coming up wi
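The difference between a weakly informative default prior and a flat one can be sketched numerically. Below is a toy grid approximation (my own illustration, not from the post: the 5-of-5-successes data, the Cauchy(0, 2.5) scale, and the grid bounds are all assumptions) for a log-odds parameter when every trial succeeds. Under a flat prior the posterior mean is largely an artifact of where the grid is truncated, while the weakly informative Cauchy prior pulls the estimate back to a finite, moderate value.

```python
import numpy as np

def posterior_mean(log_prior, theta):
    """Grid-approximate posterior mean for a log-odds parameter
    after observing 5 successes in 5 trials."""
    log_lik = 5 * -np.log1p(np.exp(-theta))   # 5 * log(inv_logit(theta))
    log_post = log_lik + log_prior
    w = np.exp(log_post - log_post.max())     # unnormalized posterior on the grid
    return float((theta * w).sum() / w.sum())

theta = np.linspace(-20, 20, 4001)
flat_prior = np.zeros_like(theta)              # improper flat default
cauchy_prior = -np.log1p((theta / 2.5) ** 2)   # weakly informative Cauchy(0, 2.5)

m_flat = posterior_mean(flat_prior, theta)
m_cauchy = posterior_mean(cauchy_prior, theta)
```

With all-successes data the likelihood never turns over, so the flat-prior mean keeps growing as the grid is widened; the Cauchy prior gives a stable answer.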
3 0.15316403 1859 andrew gelman stats-2013-05-16-How do we choose our default methods?
Introduction: I was asked to write an article for the Committee of Presidents of Statistical Societies (COPSS) 50th anniversary volume. Here it is (it’s labeled as “Chapter 1,” which isn’t right; that’s just what came out when I used the template that was supplied). The article begins as follows: The field of statistics continues to be divided into competing schools of thought. In theory one might imagine choosing the uniquely best method for each problem as it arises, but in practice we choose for ourselves (and recommend to others) default principles, models, and methods to be used in a wide variety of settings. This article briefly considers the informal criteria we use to decide what methods to use and what principles to apply in statistics problems. And then I follow up with these sections:
Statistics: the science of defaults
Ways of knowing
The pluralist’s dilemma
And here’s the concluding paragraph: Statistics is a young science in which progress is being made in many
4 0.14486483 426 andrew gelman stats-2010-11-22-Postdoc opportunity here at Columbia — deadline soon!
Introduction: The deadline for this year’s Earth Institute postdocs is 1 Dec, so it’s time to apply right away! It’s a highly competitive interdisciplinary program, and we’ve had some statisticians in the past. We’re particularly interested in statisticians who have research interests in development and public health. It’s fine (not just fine, but ideal) if you are interested in statistical methods also.
Introduction: Leading theoretical statistician Larry Wasserman in 2008: Some of the greatest contributions of statistics to science involve adding additional randomness and leveraging that randomness. Examples are randomized experiments, permutation tests, cross-validation and data-splitting. These are unabashedly frequentist ideas and, while one can strain to fit them into a Bayesian framework, they don’t really have a place in Bayesian inference. The fact that Bayesian methods do not naturally accommodate such a powerful set of statistical ideas seems like a serious deficiency. To which I responded in the second-to-last paragraph of page 8 here. Larry Wasserman in 2013: Some people say that there is no role for randomization in Bayesian inference. In other words, the randomization mechanism plays no role in Bayes’ theorem. But this is not really true. Without randomization, we can indeed derive a posterior for theta but it is highly sensitive to the prior. This is just a restat
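The permutation test is the simplest of the "add randomness and leverage it" ideas Wasserman lists. A minimal sketch (the simulated two-sample data and the difference-in-means statistic are my own choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def permutation_test(x, y, n_perm=5_000):
    """Two-sample permutation test for a difference in means:
    re-randomize group labels and see how extreme the observed gap is."""
    observed = x.mean() - y.mean()
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = pooled[: len(x)].mean() - pooled[len(x):].mean()
        if abs(diff) >= abs(observed):
            count += 1
    return observed, (count + 1) / (n_perm + 1)   # add-one to avoid p = 0

# Simulated experiment: treatment shifts a noisy outcome by +1.
x = rng.normal(1.0, 1.0, 50)
y = rng.normal(0.0, 1.0, 50)
obs, p = permutation_test(x, y)
```

The null distribution is generated by the added randomness itself (the reshuffled labels), with no distributional assumptions on the outcome.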
6 0.12088088 801 andrew gelman stats-2011-07-13-On the half-Cauchy prior for a global scale parameter
7 0.12044996 2317 andrew gelman stats-2014-05-04-Honored oldsters write about statistics
8 0.11608952 231 andrew gelman stats-2010-08-24-Yet another Bayesian job opportunity
9 0.10813425 1469 andrew gelman stats-2012-08-25-Ways of knowing
10 0.10785279 1572 andrew gelman stats-2012-11-10-I don’t like this cartoon
11 0.10452421 1990 andrew gelman stats-2013-08-20-Job opening at an organization that promotes reproducible research!
12 0.10064958 2303 andrew gelman stats-2014-04-23-Thinking of doing a list experiment? Here’s a list of reasons why you should think again
13 0.099411115 1482 andrew gelman stats-2012-09-04-Model checking and model understanding in machine learning
14 0.097465724 846 andrew gelman stats-2011-08-09-Default priors update?
15 0.096846834 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update
16 0.094488256 1019 andrew gelman stats-2011-11-19-Validation of Software for Bayesian Models Using Posterior Quantiles
17 0.093744896 738 andrew gelman stats-2011-05-30-Works well versus well understood
18 0.092931256 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters
19 0.09256307 773 andrew gelman stats-2011-06-18-Should we always be using the t and robit instead of the normal and logit?
20 0.092115432 1979 andrew gelman stats-2013-08-13-Convincing Evidence
simIndex simValue blogId blogTitle
same-blog 1 0.97734582 147 andrew gelman stats-2010-07-15-Quote of the day: statisticians and defaults
2 0.81305671 1859 andrew gelman stats-2013-05-16-How do we choose our default methods?
3 0.72906822 1979 andrew gelman stats-2013-08-13-Convincing Evidence
Introduction: Keith O’Rourke and I wrote an article that begins: Textbooks on statistics emphasize care and precision, via concepts such as reliability and validity in measurement, random sampling and treatment assignment in data collection, and causal identification and bias in estimation. But how do researchers decide what to believe and what to trust when choosing which statistical methods to use? How do they decide the credibility of methods? Statisticians and statistical practitioners seem to rely on a sense of anecdotal evidence based on personal experience and on the attitudes of trusted colleagues. Authorship, reputation, and past experience are thus central to decisions about statistical procedures. It’s for a volume on theoretical or methodological research on authorship, functional roles, reputation, and credibility in social media, edited by Sorin Matei and Elisa Bertino.
Introduction: The official announcement: The Excellence in Statistical Reporting Award for 2010 is presented to Felix Salmon for his body of work, which exemplifies the highest standards of scientific reporting. His insightful use of statistics as a tool to understanding the world of business and economics, areas that are critical in today’s economy, sets a new standard in statistical investigative reporting. Here are some examples:
Tiger Woods
Nigerian spammers
How the government fudges job statistics
This one is important to me. The idea is that “statistical reporting” is not just traditional science reporting (journalist talks with scientists and tries to understand the consensus) or science popularization or silly feature stories about the lottery. Salmon is doing investigative reporting using statistical thinking. Also, from a political angle, Salmon’s smart and quantitatively sophisticated work (as well as that of others such as Nate Silver) is an important counterweigh
5 0.71367311 498 andrew gelman stats-2011-01-02-Theoretical vs applied statistics
Introduction: Anish Thomas writes: I was wondering if you could provide me with some guidance regarding statistical training. My background is in Industrial/Organizational Psychology, with an emphasis on Quantitative Psychology, and I am currently working in the employee selection industry. I am considering pursuing a master’s degree in Statistics. As I look through several program options, I am curious about the real difference between theoretical and applied Statistics. It would be very enlightening if you could shed some light on the difference. Specifically:
1. Is the theoretical side more mathematically oriented (i.e., theorems and proofs) than the applied?
2. Are the skills acquired in a ‘theoretical’ class difficult to transfer to the ‘applied’ side and vice versa?
3. I see theoretical statistics as the part that engages in developing the methods and applied statistics as pure application of the methods. Is this perception completely off base?
My reply: 1. The difference between theoretic
6 0.67889261 557 andrew gelman stats-2011-02-05-Call for book proposals
8 0.65778345 231 andrew gelman stats-2010-08-24-Yet another Bayesian job opportunity
9 0.65141994 1110 andrew gelman stats-2012-01-10-Jobs in statistics research! In New Jersey!
10 0.63811117 744 andrew gelman stats-2011-06-03-Statistical methods for healthcare regulation: rating, screening and surveillance
11 0.63775504 1013 andrew gelman stats-2011-11-16-My talk at Math for America on Saturday
12 0.62990284 241 andrew gelman stats-2010-08-29-Ethics and statistics in development research
13 0.62846756 2317 andrew gelman stats-2014-05-04-Honored oldsters write about statistics
14 0.62019843 1721 andrew gelman stats-2013-02-13-A must-read paper on statistical analysis of experimental data
15 0.61787122 738 andrew gelman stats-2011-05-30-Works well versus well understood
16 0.61708105 2151 andrew gelman stats-2013-12-27-Should statistics have a Nobel prize?
17 0.6077407 1594 andrew gelman stats-2012-11-28-My talk on statistical graphics at Mit this Thurs aft
18 0.60238278 816 andrew gelman stats-2011-07-22-“Information visualization” vs. “Statistical graphics”
19 0.59880263 2072 andrew gelman stats-2013-10-21-The future (and past) of statistical sciences
20 0.58349442 1909 andrew gelman stats-2013-06-21-Job openings at conservative political analytics firm!
simIndex simValue blogId blogTitle
1 0.99632394 514 andrew gelman stats-2011-01-13-News coverage of statistical issues…how did I do?
Introduction: This post is by Phil Price. A reporter once told me that the worst-kept secret of journalism is that every story has errors. And it’s true that just about every time I know about something first-hand, the news stories about it have some mistakes. Reporters aren’t subject-matter experts, they have limited time, and they generally can’t keep revisiting the things they are saying and checking them for accuracy. Many of us have published papers with errors — my most recent paper has an incorrect figure — and that’s after working on them carefully for weeks! One way that reporters can try to get things right is by quoting experts. Even then, there are problems with taking quotes out of context, or with making poor choices about what material to include or exclude, or, of course, with making a poor selection of experts. Yesterday, I was interviewed by an NPR reporter about the risks of breathing radon (a naturally occurring radioactive gas): who should test for it, how dangerous
2 0.99456626 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”
Introduction: Macartan Humphreys pointed me to this excellent guide. Here are the 10 items:
1. A causal claim is a statement about what didn’t happen.
2. There is a fundamental problem of causal inference.
3. You can estimate average causal effects even if you cannot observe any individual causal effects.
4. If you know that, on average, A causes B and that B causes C, this does not mean that you know that A causes C.
5. The counterfactual model is all about contribution, not attribution.
6. X can cause Y even if there is no “causal path” connecting X and Y.
7. Correlation is not causation.
8. X can cause Y even if X is not a necessary condition or a sufficient condition for Y.
9. Estimating average causal effects does not require that treatment and control groups are identical.
10. There is no causation without manipulation.
The article follows with crisp discussions of each point. My favorite is item #6, not because it’s the most important but because it brings in some real s
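Item 3 is easy to see in simulation: each unit reveals only one of its two potential outcomes, yet the randomized difference in means recovers the average effect. A hedged sketch (the outcome model, effect size, and sample size are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Both potential outcomes exist for every unit; individual effects vary.
y0 = rng.normal(0.0, 1.0, n)
y1 = y0 + rng.normal(1.0, 0.5, n)   # true average causal effect = 1.0

# Randomization reveals only one potential outcome per unit...
z = rng.integers(0, 2, n)
y_obs = np.where(z == 1, y1, y0)

# ...yet the difference in group means recovers the average effect.
ate_hat = y_obs[z == 1].mean() - y_obs[z == 0].mean()
```

No individual effect y1[i] - y0[i] is ever observed; randomization makes the group comparison unbiased for their average.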
Introduction: A tall thin young man came to my office today to talk about one of my current pet topics: stories and social science. I brought up Tom Wolfe and his goal of compressing an entire city into a single novel, and how this reminded me of the psychologists Kahneman and Tversky’s concept of “the law of small numbers,” the idea that we expect any small sample to replicate all the properties of the larger population that it represents. Strictly speaking, the law of small numbers is impossible—any small sample necessarily has its own unique features—but this is even more true if we consider network properties. The average American knows about 700 people (depending on how you define “know”) and this defines a social network over the population. Now suppose you look at a few hundred people and all their connections. This mini-network will almost necessarily look much much sparser than the national network, as we’re removing the connections to the people not in the sample. Now consider how
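The sparsity claim is easy to check in simulation (the symmetric random graph below is my own toy stand-in for a real social network; the sizes are scaled down from the 700-acquaintances figure):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 2000, 0.05   # toy population: each person "knows" about 100 others

# Symmetric random adjacency matrix for the full network.
upper = np.triu(rng.random((n, n)) < p, 1)
adj = upper | upper.T

full_degree = adj.sum(axis=1).mean()

# Sample 200 people and keep only the ties among them.
idx = rng.choice(n, size=200, replace=False)
sub_degree = adj[np.ix_(idx, idx)].sum(axis=1).mean()
```

The average degree inside the sample drops roughly in proportion to the sampling fraction, which is the sense in which the mini-network is much sparser than the population network it came from.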
4 0.99210078 1401 andrew gelman stats-2012-06-30-David Hogg on statistics
Introduction: Data analysis recipes: Fitting a model to data : We go through the many considerations involved in fitting a model to data, using as an example the fit of a straight line to a set of points in a two-dimensional plane. Standard weighted least-squares fitting is only appropriate when there is a dimension along which the data points have negligible uncertainties, and another along which all the uncertainties can be described by Gaussians of known variance; these conditions are rarely met in practice. We consider cases of general, heterogeneous, and arbitrarily covariant two-dimensional uncertainties, and situations in which there are bad data (large outliers), unknown uncertainties, and unknown but expected intrinsic scatter in the linear relationship being fit. Above all we emphasize the importance of having a “generative model” for the data, even an approximate one. Once there is a generative model, the subsequent fitting is non-arbitrary because the model permits direct computation
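The weighted least-squares baseline that Hogg et al. start from fits in a few lines of numpy (the simulated line and per-point sigmas below are illustrative; the paper's main point is what to do when these Gaussian, known-variance assumptions fail):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated straight line with per-point Gaussian errors of known, varying sigma.
m_true, b_true = 2.0, -1.0
x = np.linspace(0.0, 10.0, 40)
sigma = rng.uniform(0.2, 1.5, x.size)          # heterogeneous uncertainties in y
y = m_true * x + b_true + rng.normal(0.0, sigma)

# Weighted least squares: divide each row of the design matrix and each
# observation by its sigma, then solve the ordinary least-squares problem.
A = np.column_stack([x, np.ones_like(x)])
w = 1.0 / sigma
m_hat, b_hat = np.linalg.lstsq(A * w[:, None], y * w, rcond=None)[0]
```

Rescaling by 1/sigma is exactly the maximum-likelihood fit under the stated generative model, which is why the paper insists on writing that model down before fitting anything.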
5 0.99167478 810 andrew gelman stats-2011-07-20-Adding more information can make the variance go up (depending on your model)
Introduction: Andy McKenzie writes: In their March 9 “counterpoint” in Nature Biotech to the prospect that we should try to integrate more sources of data in clinical practice (see “point” arguing for this), Isaac Kohane and David Margulies claim that, “Finally, how much better is our new knowledge than older knowledge? When is the incremental benefit of a genomic variant(s) or gene expression profile relative to a family history or classic histopathology insufficient and when does it add rather than subtract variance?” Perhaps I am mistaken (thus this email), but it seems that this claim runs contra to the definition of conditional probability. That is, if you have a hierarchical model, and the family history / classical histopathology already suggests a parameter estimate with some variance, how could the new genomic info possibly increase the variance of that parameter estimate? Surely the question is how much variance the new genomic info reduces and whether it therefore justifies t
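In a conjugate normal-normal model, McKenzie's intuition holds exactly: conditioning on more data can only shrink the posterior variance, because precisions add. A minimal sketch (the numbers are hypothetical; the post's answer, per its title, is that under other models, such as mixtures or misspecified likelihoods, the realized variance can indeed go up):

```python
import numpy as np

def normal_update(prior_mu, prior_var, y, obs_var):
    """Conjugate normal-normal update for an unknown mean:
    posterior precision = prior precision + n / obs_var,
    so posterior variance can only decrease."""
    post_var = 1.0 / (1.0 / prior_var + len(y) / obs_var)
    post_mu = post_var * (prior_mu / prior_var + np.sum(y) / obs_var)
    return post_mu, post_var

# Prior summarizing family history / histopathology, then add new measurements.
mu0, v0 = 0.0, 1.0
y_new = np.array([0.8, 1.2, 0.5])   # hypothetical genomic readings
mu1, v1 = normal_update(mu0, v0, y_new, obs_var=2.0)
```

Here v1 = 1 / (1 + 3/2) = 0.4 < v0 = 1, regardless of what the new data say: in this model new information never adds variance.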
6 0.98797637 432 andrew gelman stats-2010-11-27-Neumann update
same-blog 8 0.9827069 147 andrew gelman stats-2010-07-15-Quote of the day: statisticians and defaults
11 0.97907627 1275 andrew gelman stats-2012-04-22-Please stop me before I barf again
12 0.97848386 789 andrew gelman stats-2011-07-07-Descriptive statistics, causal inference, and story time
14 0.97690833 1989 andrew gelman stats-2013-08-20-Correcting for multiple comparisons in a Bayesian regression model
15 0.97665125 2037 andrew gelman stats-2013-09-25-Classical probability does not apply to quantum systems (causal inference edition)
16 0.97626829 62 andrew gelman stats-2010-06-01-Two Postdoc Positions Available on Bayesian Hierarchical Modeling
17 0.97194958 1824 andrew gelman stats-2013-04-25-Fascinating graphs from facebook data
18 0.97108132 486 andrew gelman stats-2010-12-26-Age and happiness: The pattern isn’t as clear as you might think
19 0.96876252 1459 andrew gelman stats-2012-08-15-How I think about mixture models