andrew_gelman_stats andrew_gelman_stats-2010 andrew_gelman_stats-2010-421 knowledge-graph by maker-knowledge-mining

421 andrew gelman stats-2010-11-19-Just chaid


meta info for this blog

Source: html

Introduction: Reading somebody else’s statistics rant made me realize the inherent contradictions in much of my own statistical advice. Jeff Lax sent along this article by Philip Schrodt, along with the cryptic comment: Perhaps of interest to you. perhaps not. Not meant to be an excuse for you to rant against hypothesis testing again. In his article, Schrodt makes a reasonable and entertaining argument against the overfitting of data and the overuse of linear models. He states that his article is motivated by the quantitative papers he has been sent to review for journals or conferences, and he explicitly excludes “studies of United States voting behavior,” so at least I think Mister P is off the hook. I notice a bit of incoherence in Schrodt’s position–on one hand, he criticizes “kitchen-sink models” for overfitting and he criticizes “using complex methods without understanding the underlying assumptions” . . . but then later on he suggests that political scientists in this countr


Summary: the most important sentences generated by the tfidf model (a sketch of this kind of scoring appears after the list)

sentIndex sentText sentNum sentScore

1 Reading somebody else’s statistics rant made me realize the inherent contradictions in much of my own statistical advice. [sent-1, score-0.209]

2 Jeff Lax sent along this article by Philip Schrodt, along with the cryptic comment: Perhaps of interest to you. [sent-2, score-0.075]

3 Not meant to be an excuse for you to rant against hypothesis testing again. [sent-4, score-0.107]

4 In his article, Schrodt makes a reasonable and entertaining argument against the overfitting of data and the overuse of linear models. [sent-5, score-0.12]

5 He states that his article is motivated by the quantitative papers he has been sent to review for journals or conferences, and he explicitly excludes “studies of United States voting behavior,” so at least I think Mister P is off the hook. [sent-6, score-0.145]

6 I notice a bit of incoherence in Schrodt’s position–on one hand, he criticizes “kitchen-sink models” for overfitting and he criticizes “using complex methods without understanding the underlying assumptions” . [sent-7, score-0.634]

7 but then later on he suggests that political scientists in this country start using mysterious (to me) methods such as correspondence analysis, support vector machines, neural networks, Fourier analysis, hidden Markov models, topological clustering algorithms, and something called CHAID! [sent-10, score-0.674]

8 Not to burst anyone’s bubble here, but if you really think that multiple regression involves assumptions that are too much for the average political scientist, what do you think is going to happen with topological clustering algorithms, neural networks, and the rest? [sent-11, score-0.641]

9 As in many rants of this sort (my own not excepted), there is an inherent tension between two feelings: 1. [sent-13, score-0.167]

10 The despair that people are using methods that are too simple for the phenomena they are trying to understand. [sent-14, score-0.358]

11 The near-certain feeling that many people are using models too complicated for them to understand. [sent-16, score-0.323]

12 On one hand, I find myself telling people to go simple, simple, simple. [sent-18, score-0.143]

13 When someone gives me their regression coefficient I ask for the average, when someone gives me the average I ask for a scatterplot, when someone gives me a scatterplot I ask them to carefully describe one data point, please. [sent-19, score-0.991]

14 On the other hand, I’m always getting on people’s case about too-simple assumptions, for example analyzing state-level election results over a 50 year period and thinking that controlling for “state dummies” solves all their problems. [sent-20, score-0.163]

15 For example, suppose I suggest that someone, instead of pooling 50 years of data, do a separate analysis of each year or each decade and then plot their estimates over time. [sent-23, score-0.254]

16 This recommendation of the secret weapon actually satisfied both criteria 1 and 2 above: the model varies by year (or by decade) and is thus more flexible than the “year dummies” model that preceded it; but at the same time the new model is simpler and cleaner. [sent-24, score-0.268]

17 Still and all, there’s a bit of incoherence in telling people to go more sophisticated and simpler at the same time, and I think people who have worked with me have seen me oscillate in my advice, first suggesting very basic methods and then pulling out models that are too complicated to fit. [sent-25, score-0.855]

18 He writes that learning hierarchical models “doesn’t give you ANOVA.” [sent-28, score-0.116]

19 And I think Schrodt is mixing apples and oranges by throwing in computational methods (“genetic algorithms and simulated annealing methods”) in his list of models. [sent-31, score-0.826]

20 Genetic algorithms and simulated annealing methods can be used for optimization and other computational tasks but they’re not models in the statistical (or political science) sense of the word. [sent-32, score-0.786]
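
The sentence scores above come from a tfidf model; the exact pipeline behind this page is not documented, but a minimal sketch of this kind of extractive scoring, assuming scikit-learn (not necessarily the toolkit actually used) and a short placeholder list of sentences, looks like this. The printed scores will not match the sent-N scores shown above.

```python
# Sketch: rank sentences by the total TF-IDF weight of their terms.
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    "Reading somebody else's statistics rant made me realize the inherent "
    "contradictions in much of my own statistical advice.",
    "Jeff Lax sent along this article by Philip Schrodt, along with the "
    "cryptic comment: Perhaps of interest to you.",
    "Genetic algorithms and simulated annealing methods can be used for "
    "optimization but they're not models in the statistical sense of the word.",
]

# Treat each sentence as a "document"; higher total TF-IDF flags more
# distinctive sentences for the summary.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(sentences)
scores = tfidf.sum(axis=1).A1  # one score per sentence

for rank, idx in enumerate(scores.argsort()[::-1], start=1):
    print(f"{rank}. [sent-{idx + 1}, score={scores[idx]:.3f}] {sentences[idx][:60]}...")
```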


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('schrodt', 0.511), ('algorithms', 0.185), ('methods', 0.157), ('vermont', 0.146), ('annealing', 0.146), ('dummies', 0.146), ('topological', 0.146), ('incoherence', 0.125), ('neural', 0.122), ('clustering', 0.12), ('overfitting', 0.12), ('models', 0.116), ('criticizes', 0.116), ('scatterplot', 0.111), ('rant', 0.107), ('simulated', 0.104), ('genetic', 0.102), ('inherent', 0.102), ('assumptions', 0.101), ('year', 0.096), ('simpler', 0.094), ('decade', 0.093), ('gives', 0.091), ('networks', 0.09), ('ask', 0.088), ('someone', 0.088), ('hand', 0.082), ('average', 0.079), ('telling', 0.079), ('computational', 0.078), ('complicated', 0.078), ('oranges', 0.078), ('apples', 0.078), ('echoing', 0.078), ('oscillate', 0.078), ('preceded', 0.078), ('article', 0.075), ('burst', 0.073), ('excepted', 0.073), ('simple', 0.072), ('excludes', 0.07), ('fourier', 0.067), ('solves', 0.067), ('advice', 0.066), ('tension', 0.065), ('suggest', 0.065), ('using', 0.065), ('correspondence', 0.064), ('conferences', 0.064), ('people', 0.064)]
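
The simValue numbers in the list below are plausibly cosine similarities between tfidf vectors like the one above; that is an assumption, as is the use of scikit-learn and the placeholder post texts in this minimal sketch.

```python
# Sketch: cosine similarity between TF-IDF vectors of whole posts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

posts = {
    "421 Just chaid": "statistics rant overfitting linear models schrodt chaid neural networks",
    "417 Clutering and variance components": "clustering logit multilevel variance components lmer",
    "1392 Occam": "parsimony occam razor simple models parameters bayesian",
}

names = list(posts)
tfidf = TfidfVectorizer().fit_transform([posts[n] for n in names])
sims = cosine_similarity(tfidf[0], tfidf).ravel()  # similarity of post 421 to every post

for name, s in sorted(zip(names, sims), key=lambda t: -t[1]):
    print(f"{s:.4f}  {name}")
```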

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 421 andrew gelman stats-2010-11-19-Just chaid

Introduction: (same as the introduction at the top of this page)

2 0.14170422 417 andrew gelman stats-2010-11-17-Clutering and variance components

Introduction: Raymond Lim writes: Do you have any recommendations on clustering and binary models? My particular problem is I’m running a firm fixed effect logit and want to cluster by industry-year (every combination of industry-year). My control variable of interest is measured by industry-year and when I cluster by industry-year, the standard errors are 300x larger than when I don’t cluster. Strangely, this problem only occurs when doing logit and not OLS (linear probability). Also, clustering just by field doesn’t blow up the errors. My hunch is it has something to do with the non-nested structure of year, but I don’t understand why this is only problematic under logit and not OLS. My reply: I’d recommend including four multilevel variance parameters, one for firm, one for industry, one for year, and one for industry-year. (In lmer, that’s (1 | firm) + (1 | industry) + (1 | year) + (1 | industry.year)). No need to include (1 | firm.year) since in your data this is the error term. Try

3 0.13760933 1392 andrew gelman stats-2012-06-26-Occam

Introduction: Cosma Shalizi and Larry Wasserman discuss some papers from a conference on Ockham’s Razor. I don’t have anything new to add on this so let me link to past blog entries on the topic and repost the following from 2004 : A lot has been written in statistics about “parsimony”—that is, the desire to explain phenomena using fewer parameters–but I’ve never seen any good general justification for parsimony. (I don’t count “Occam’s Razor,” or “Ockham’s Razor,” or whatever, as a justification. You gotta do better than digging up a 700-year-old quote.) Maybe it’s because I work in social science, but my feeling is: if you can approximate reality with just a few parameters, fine. If you can use more parameters to fold in more information, that’s even better. In practice, I often use simple models—because they are less effort to fit and, especially, to understand. But I don’t kid myself that they’re better than more complicated efforts! My favorite quote on this comes from Rad

4 0.11926425 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses

Introduction: Some things I respect When it comes to meta-models of statistics, here are two philosophies that I respect: 1. (My) Bayesian approach, which I associate with E. T. Jaynes, in which you construct models with strong assumptions, ride your models hard, check their fit to data, and then scrap them and improve them as necessary. 2. At the other extreme, model-free statistical procedures that are designed to work well under very weak assumptions—for example, instead of assuming a distribution is Gaussian, you would just want the procedure to work well under some conditions on the smoothness of the second derivative of the log density function. Both the above philosophies recognize that (almost) all important assumptions will be wrong, and they resolve this concern via aggressive model checking or via robustness. And of course there are intermediate positions, such as working with Bayesian models that have been shown to be robust, and then still checking them. Or, to flip it arou

5 0.11881656 1788 andrew gelman stats-2013-04-04-When is there “hidden structure in data” to be discovered?

Introduction: Michael Collins sent along the following announcement for a talk: Fast learning algorithms for discovering the hidden structure in data Daniel Hsu, Microsoft Research 11am, Wednesday April 10th, Interschool lab, 7th floor CEPSR, Columbia University A major challenge in machine learning is to reliably and automatically discover hidden structure in data with minimal human intervention. For instance, one may be interested in understanding the stratification of a population into subgroups, the thematic make-up of a collection of documents, or the dynamical process governing a complex time series. Many of the core statistical estimation problems for these applications are, in general, provably intractable for both computational and statistical reasons; and therefore progress is made by shifting the focus to realistic instances that rule out the intractable cases. In this talk, I’ll describe a general computational approach for correctly estimating a wide class of statistical mod

6 0.11817397 1418 andrew gelman stats-2012-07-16-Long discussion about causal inference and the use of hierarchical models to bridge between different inferential settings

7 0.11264254 1482 andrew gelman stats-2012-09-04-Model checking and model understanding in machine learning

8 0.10983785 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis

9 0.10907675 1469 andrew gelman stats-2012-08-25-Ways of knowing

10 0.10628562 2255 andrew gelman stats-2014-03-19-How Americans vote

11 0.1046066 1719 andrew gelman stats-2013-02-11-Why waste time philosophizing?

12 0.10429409 1690 andrew gelman stats-2013-01-23-When are complicated models helpful in psychology research and when are they overkill?

13 0.10269571 2254 andrew gelman stats-2014-03-18-Those wacky anti-Bayesians used to be intimidating, but now they’re just pathetic

14 0.099596128 2245 andrew gelman stats-2014-03-12-More on publishing in journals

15 0.099287249 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

16 0.097343504 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo

17 0.095775694 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics

18 0.09511669 419 andrew gelman stats-2010-11-18-Derivative-based MCMC as a breakthrough technique for implementing Bayesian statistics

19 0.094250031 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes

20 0.094056219 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.224), (1, 0.056), (2, -0.017), (3, 0.034), (4, -0.008), (5, 0.016), (6, -0.048), (7, -0.023), (8, 0.077), (9, 0.047), (10, 0.025), (11, 0.003), (12, -0.026), (13, -0.015), (14, -0.002), (15, 0.028), (16, 0.01), (17, -0.016), (18, -0.017), (19, -0.006), (20, 0.006), (21, -0.023), (22, -0.008), (23, 0.008), (24, 0.001), (25, -0.017), (26, -0.013), (27, -0.019), (28, 0.001), (29, 0.011), (30, 0.013), (31, 0.013), (32, 0.037), (33, 0.01), (34, 0.037), (35, -0.037), (36, 0.009), (37, -0.004), (38, -0.0), (39, 0.013), (40, 0.001), (41, 0.027), (42, 0.005), (43, 0.032), (44, -0.004), (45, -0.041), (46, -0.056), (47, -0.002), (48, 0.026), (49, -0.031)]
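
A minimal sketch of how lsi topic weights like those above can be produced, assuming lsi here means latent semantic indexing, i.e. a truncated SVD of a tfidf term-document matrix; the scikit-learn calls and the toy corpus are assumptions, so the weights will not match the page.

```python
# Sketch: LSI-style document-topic weights via truncated SVD of TF-IDF.
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "overfitting kitchen sink models neural networks chaid",
    "multilevel models hierarchical regression anova",
    "bayesian model checking philosophy occam razor",
    "lowess scatterplot nonlinear regression",
]

tfidf = TfidfVectorizer().fit_transform(corpus)
lsi = TruncatedSVD(n_components=3, random_state=0)
doc_topics = lsi.fit_transform(tfidf)          # rows: documents, columns: topics

# Topic weights for the first document, in the (topicId, topicWeight) style above.
print([(i, round(float(w), 3)) for i, w in enumerate(doc_topics[0])])
```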

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98356348 421 andrew gelman stats-2010-11-19-Just chaid

Introduction: (same as the introduction at the top of this page)

2 0.89792907 1739 andrew gelman stats-2013-02-26-An AI can build and try out statistical models using an open-ended generative grammar

Introduction: David Duvenaud writes: I’ve been following your recent discussions about how an AI could do statistics [see also here ]. I was especially excited about your suggestion for new statistical methods using “a language-like approach to recursively creating new models from a specified list of distributions and transformations, and an automatic approach to checking model fit.” Your discussion of these ideas was exciting to me and my colleagues because we recently did some work taking a step in this direction, automatically searching through a grammar over Gaussian process regression models. Roger Grosse previously did the same thing , but over matrix decomposition models using held-out predictive likelihood to check model fit. These are both examples of automatic Bayesian model-building by a search over more and more complex models, as you suggested. One nice thing is that both grammars include lots of standard models for free, and they seem to work pretty well, although the

3 0.86557043 1690 andrew gelman stats-2013-01-23-When are complicated models helpful in psychology research and when are they overkill?

Introduction: Nick Brown is bothered by this article , “An unscented Kalman filter approach to the estimation of nonlinear dynamical systems models,” by Sy-Miin Chow, Emilio Ferrer, and John Nesselroade. The introduction of the article cites a bunch of articles in serious psych/statistics journals. The question is, are such advanced statistical techniques really needed, or even legitimate, with the kind of very rough data that is usually available in psych applications? Or is it just fishing in the hope of discovering patterns that are not really there? I wrote: It seems like a pretty innocuous literature review. I agree that many of the applications are silly (for example, they cite the work of the notorious John Gottman in fitting a predator-prey model to spousal relations (!)), but overall they just seem to be presenting very standard ideas for the mathematical-psychology audience. It’s not clear whether advanced techniques are always appropriate here, but they come in through a natura

4 0.84169292 575 andrew gelman stats-2011-02-15-What are the trickiest models to fit?

Introduction: John Salvatier writes: What do you and your readers think are the trickiest models to fit? If I had an algorithm that I claimed could fit many models with little fuss, what kinds of models would really impress you? I am interested in testing different MCMC sampling methods to evaluate their performance and I want to stretch the bounds of their abilities. I don’t know what’s the trickiest, but just about anything I work on in a serious way gives me some troubles. This reminds me that we should finish our Bayesian Benchmarks paper already.

5 0.82609963 1482 andrew gelman stats-2012-09-04-Model checking and model understanding in machine learning

Introduction: Last month I wrote: Computer scientists are often brilliant but they can be unfamiliar with what is done in the worlds of data collection and analysis. This goes the other way too: statisticians such as myself can look pretty awkward, reinventing (or failing to reinvent) various wheels when we write computer programs or, even worse, try to design software. Andrew MacNamara followed up with some thoughts: I [MacNamara] had some basic statistics training through my MBA program, after having completed an undergrad degree in computer science. Since then I’ve been very interested in learning more about statistical techniques, including things like GLM and censored data analyses as well as machine learning topics like neural nets, SVMs, etc. I began following your blog after some research into Bayesian analysis topics and I am trying to dig deeper on that side of things. One thing I have noticed is that there seems to be a distinction between data analysi

6 0.82191449 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses

7 0.81602335 1718 andrew gelman stats-2013-02-11-Toward a framework for automatic model building

8 0.81521338 244 andrew gelman stats-2010-08-30-Useful models, model checking, and external validation: a mini-discussion

9 0.80658323 496 andrew gelman stats-2011-01-01-Tukey’s philosophy

10 0.80320555 1395 andrew gelman stats-2012-06-27-Cross-validation (What is it good for?)

11 0.79662514 964 andrew gelman stats-2011-10-19-An interweaving-transformation strategy for boosting MCMC efficiency

12 0.79385489 1406 andrew gelman stats-2012-07-05-Xiao-Li Meng and Xianchao Xie rethink asymptotics

13 0.78946775 2136 andrew gelman stats-2013-12-16-Whither the “bet on sparsity principle” in a nonsparse world?

14 0.78924656 1392 andrew gelman stats-2012-06-26-Occam

15 0.78838414 72 andrew gelman stats-2010-06-07-Valencia: Summer of 1991

16 0.78725827 776 andrew gelman stats-2011-06-22-Deviance, DIC, AIC, cross-validation, etc

17 0.7802844 1425 andrew gelman stats-2012-07-23-Examples of the use of hierarchical modeling to generalize to new settings

18 0.77801764 757 andrew gelman stats-2011-06-10-Controversy over the Christakis-Fowler findings on the contagion of obesity

19 0.77499771 1070 andrew gelman stats-2011-12-19-The scope for snooping

20 0.77300137 789 andrew gelman stats-2011-07-07-Descriptive statistics, causal inference, and story time


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.015), (15, 0.036), (16, 0.079), (21, 0.038), (24, 0.13), (30, 0.012), (45, 0.018), (55, 0.01), (63, 0.144), (65, 0.01), (85, 0.031), (86, 0.023), (99, 0.302)]
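
A minimal sketch of how lda topic weights like those above can be produced, assuming lda here is latent Dirichlet allocation fit to raw term counts; the scikit-learn calls, the toy corpus, and the number of topics are all assumptions, so the proportions will not match the page.

```python
# Sketch: per-document topic proportions from LDA on term counts.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

corpus = [
    "regression models overfitting assumptions political science",
    "bayesian hierarchical models anova variance components",
    "neural networks clustering support vector machines chaid",
    "elections voting states year dummies scatterplot",
]

counts = CountVectorizer().fit_transform(corpus)   # LDA works on raw counts, not TF-IDF
lda = LatentDirichletAllocation(n_components=3, random_state=0)
doc_topics = lda.fit_transform(counts)             # rows: documents, columns: topic proportions

# Report only topics with non-trivial weight, as the listing above does.
print([(i, round(float(w), 3)) for i, w in enumerate(doc_topics[0]) if w > 0.05])
```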

similar blogs list:

simIndex simValue blogId blogTitle

1 0.98849595 1621 andrew gelman stats-2012-12-13-Puzzles of criminal justice

Introduction: Four recent news stories about crime and punishment made me realize, yet again, how little I understand all this. 1. “HSBC to Pay $1.92 Billion to Settle Charges of Money Laundering” : State and federal authorities decided against indicting HSBC in a money-laundering case over concerns that criminal charges could jeopardize one of the world’s largest banks and ultimately destabilize the global financial system. Instead, HSBC announced on Tuesday that it had agreed to a record $1.92 billion settlement with authorities. . . . I don’t understand this idea of punishing the institution. I have the same problem when the NCAA punishes a college football program. These are individual people breaking the law (or the rules), right? So why not punish them directly? Giving 40 lashes to a bunch of HSBC executives and garnisheeing their salaries for life, say, that wouldn’t destabilize the global financial system would it? From the article: “A money-laundering indictment, or a guilt

2 0.98472619 1484 andrew gelman stats-2012-09-05-Two exciting movie ideas: “Second Chance U” and “The New Dirty Dozen”

Introduction: I have a great idea for a movie. Actually two movies based on two variants of a similar idea. It all started when I saw this story: Dr. Anil Potti, the controversial cancer researcher whose work at Duke University led to lawsuits from patients, is now a medical oncologist at the Cancer Center of North Dakota in Grand Forks. When asked about Dr. Potti’s controversial appointment, his new boss said : If a guy can’t get a second chance here in North Dakota, where he trained, man, you can’t get a second chance anywhere. (Link from Retraction Watch , of course.) Potti’s boss is also quoted as saying, “Most, if not all, his patients have loved him.” On the other hand, the news article reports: “The North Carolina medical board’s website lists settlements against Potti of at least $75,000.” I guess there’s no reason you can’t love a guy and still want a juicy malpractice settlement. Second Chance U I don’t give two poops about Dr. Anil Potti. But seeing the above s

3 0.98024827 1480 andrew gelman stats-2012-09-02-“If our product is harmful . . . we’ll stop making it.”

Introduction: After our discussion of the sad case of Darrell Huff, the celebrated “How to Lie with Statistics” guy who had a lucrative side career disparaging the link between smoking and cancer, I was motivated to follow John Mashey’s recommendation and read the book, Golden Holocaust: Origins of the Cigarette Catastrophe and the Case for Abolition, by historian Robert Proctor. My first stop upon receiving the book was the index, in particular the entry for Rubin, Donald B. I followed the reference to pages 440-442 and found the description of Don’s activities to be accurate, neither diminished nor overstated, to the best of my knowledge. Rubin is the second-most-famous statistician to have been paid by the cigarette industry, but several other big and small names have been on the payroll at one time or another. Here’s a partial list . Just including the people I know or have heard of: Herbert Solomon, Stanford Richard Tweedie, Bond U Arnold Zellner, U of Chicago Paul Switzer, S

4 0.97989571 293 andrew gelman stats-2010-09-23-Lowess is great

Introduction: I came across this old blog entry that was just hilarious–but it’s from 2005 so I think most of you haven’t seen it. It’s the story of two people named Martin Voracek and Maryanne Fisher who in a published discussion criticized lowess (a justly popular nonlinear regression method). Curious, I looked up “Martin Voracek” on the web and found an article in the British Medical Journal whose title promised “trend analysis.” I was wondering what statistical methods they used–something more sophisticated than lowess, perhaps? They did have one figure, and here it is: Voracek and Fisher, the critics of lowess, fit straight lines to clearly nonlinear data! It’s most obvious in their leftmost graph. Voracek and Fisher get full credit for showing scatterplots, but hey . . . they should try lowess next time! What’s really funny in the graph are the little dotted lines indicating inferential uncertainty in the regression lines–all under the assumption of linearity,
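
For readers unfamiliar with lowess, here is a minimal sketch contrasting a straight-line (OLS) fit with a lowess fit on simulated nonlinear data; the statsmodels call and the simulated data are assumptions for illustration, not what Voracek and Fisher (or the post) used.

```python
# Sketch: straight-line fit vs. lowess on clearly nonlinear data.
import numpy as np
import statsmodels.api as sm
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 200)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)   # nonlinear trend plus noise

# Straight-line (OLS) fit: a poor summary of this trend.
ols = sm.OLS(y, sm.add_constant(x)).fit()
print("OLS slope:", round(float(ols.params[1]), 3))

# Lowess fit: locally weighted regression that follows the curvature.
smoothed = lowess(y, x, frac=0.2)                    # columns: sorted x, fitted y
print("lowess fit at the ends:",
      round(float(smoothed[0, 1]), 3), round(float(smoothed[-1, 1]), 3))
```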

5 0.97781181 782 andrew gelman stats-2011-06-29-Putting together multinomial discrete regressions by combining simple logits

Introduction: When predicting 0/1 data we can use logit (or probit or robit or some other robust model such as invlogit (0.01 + 0.98*X*beta)). Logit is simple enough and we can use bayesglm to regularize and avoid the problem of separation. What if there are more than 2 categories? If they’re ordered (1, 2, 3, etc), we can do ordered logit (and use bayespolr() to avoid separation). If the categories are unordered (vanilla, chocolate, strawberry), there are unordered multinomial logit and probit models out there. But it’s not so easy to fit these multinomial models in a multilevel setting (with coefficients that vary by group), especially if the computation is embedded in an iterative routine such as mi where you have real time constraints at each step. So this got me wondering whether we could kluge it with logits. Here’s the basic idea (in the ordered and unordered forms): - If you have a variable that goes 1, 2, 3, etc., set up a series of logits: 1 vs. 2,3,…; 2 vs. 3,…; and so forth
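
A minimal sketch of the ordered “series of logits” kluge described above, using scikit-learn on simulated data; the data, variable names, and the plain (unregularized, non-multilevel) logistic regressions are all hypothetical stand-ins for illustration.

```python
# Sketch: ordered outcome in {1, 2, 3} fit as a series of binary logits
# (1 vs. 2-3, then 2 vs. 3 among observations with y >= 2).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
latent = X[:, 0] + rng.normal(size=500)        # fake ordered outcome driven by X[:, 0]
y = np.digitize(latent, bins=[-0.5, 0.5]) + 1  # values 1, 2, 3

# Logit 1: category 1 vs. categories 2 and 3.
logit1 = LogisticRegression().fit(X, (y >= 2).astype(int))

# Logit 2: category 2 vs. category 3, restricted to observations with y >= 2.
mask = y >= 2
logit2 = LogisticRegression().fit(X[mask], (y[mask] == 3).astype(int))

# Combine the two logits into probabilities for the three categories.
p_ge2 = logit1.predict_proba(X)[:, 1]
p3_given_ge2 = logit2.predict_proba(X)[:, 1]
probs = np.column_stack([1 - p_ge2,
                         p_ge2 * (1 - p3_given_ge2),
                         p_ge2 * p3_given_ge2])
print(probs[:3].round(3))  # each row sums to 1 across the three categories
```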

6 0.97411668 313 andrew gelman stats-2010-10-03-A question for psychometricians

7 0.96790671 102 andrew gelman stats-2010-06-21-Why modern art is all in the mind

8 0.96358025 286 andrew gelman stats-2010-09-20-Are the Democrats avoiding a national campaign?

9 0.9624688 2103 andrew gelman stats-2013-11-16-Objects of the class “Objects of the class”

10 0.95799702 544 andrew gelman stats-2011-01-29-Splitting the data

11 0.95522416 1078 andrew gelman stats-2011-12-22-Tables as graphs: The Ramanujan principle

12 0.95415109 1690 andrew gelman stats-2013-01-23-When are complicated models helpful in psychology research and when are they overkill?

same-blog 13 0.9536345 421 andrew gelman stats-2010-11-19-Just chaid

14 0.95076716 1201 andrew gelman stats-2012-03-07-Inference = data + model

15 0.95059407 33 andrew gelman stats-2010-05-14-Felix Salmon wins the American Statistical Association’s Excellence in Statistical Reporting Award

16 0.94912916 1506 andrew gelman stats-2012-09-21-Building a regression model . . . with only 27 data points

17 0.94829941 291 andrew gelman stats-2010-09-22-Philosophy of Bayes and non-Bayes: A dialogue with Deborah Mayo

18 0.94019282 2148 andrew gelman stats-2013-12-25-Spam!

19 0.93796605 745 andrew gelman stats-2011-06-04-High-level intellectual discussions in the Columbia statistics department

20 0.93628752 428 andrew gelman stats-2010-11-24-Flawed visualization of U.S. voting maybe has some good features