Random matrices in the news
Andrew Gelman, March 21, 2014
From 2010: Mark Buchanan wrote a cover article for the New Scientist on random matrices, a heretofore obscure area of probability theory that his headline writer characterizes as “the deep law that shapes our reality.” It’s interesting stuff, and he gets into some statistical applications at the end, so I’ll give you my take on it. But first, some background.

About two hundred years ago, the mathematician/physicist Laplace discovered what is now called the central limit theorem, which is that, under certain conditions, the average of a large number of small random variables has an approximate normal (bell-shaped) distribution. A bit over 100 years ago, social scientists such as Galton applied this theorem to all sorts of biological and social phenomena. The central limit theorem, in its generality, is also important in the information that it indirectly conveys when it fails. For example, the distribution of the heights of adult men or women is nicely bell-shaped, but the distribution of the heights of all adults is different and more spread out.
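To see the theorem concretely, here is a minimal simulation sketch in Python/numpy. The exponential input distribution and the sample sizes are illustrative choices of mine, not anything from the post or from Buchanan’s article: averages of many skewed random variables end up nearly symmetric and bell-shaped.

```python
# A minimal sketch of the central limit theorem in simulation.  The
# exponential input distribution and the sample sizes are illustrative
# choices, not anything from the post or from Buchanan's article.
import numpy as np

rng = np.random.default_rng(0)
n_vars, n_reps = 1000, 10_000

# Each row is one "experiment": 1000 draws from a right-skewed distribution.
draws = rng.exponential(scale=1.0, size=(n_reps, n_vars))
averages = draws.mean(axis=1)

def skewness(x):
    """Sample skewness; roughly 0 for a symmetric (e.g., normal) distribution."""
    z = (x - x.mean()) / x.std()
    return (z ** 3).mean()

print("skewness of the raw draws:", skewness(draws[:, 0]))  # about 2
print("skewness of the averages: ", skewness(averages))     # near 0
print("sd of the averages:       ", averages.std())         # about 1/sqrt(1000)
```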
The central limit theorem is an example of an attractor—a mathematical model that appears as a limit as sample size gets large. (Or, for other models, such as that used to describe the distribution of incomes, the attractor might be a power-law distribution.) The beauty of an attractor is that, if you believe the model, it can be used to explain an observed pattern without needing to know the details of its components.
A random matrix is an array of numbers, where each number is drawn from some specified probability distribution. You can compute the eigenvalues of a square matrix—that’s a set of numbers summarizing the structure of the matrix—and they will have a probability distribution that is induced by the probability distribution of the individual elements of the matrix. Over the past few decades, mathematicians such as Alan Edelman have performed computer simulations and proved theorems deriving the distribution of the eigenvalues of a random matrix, as the dimension of the matrix becomes large. That is, for a broad range of different input models (distributions of the random matrices), you get the same output—the same eigenvalue distribution—as the sample size becomes large. If the eigenvalue distribution is an attractor, this means that a lot of physical and social phenomena which can be modeled by eigenvalues (including, apparently, quantum energy levels and some properties of statistical tests) might have a common structure.
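This universality is easy to check by simulation. Here is a minimal sketch (my own construction, not Edelman’s code; the matrix size and the two input distributions are arbitrary illustrative choices): symmetric random matrices with Gaussian entries and with coin-flip entries give essentially the same eigenvalue distribution, the Wigner semicircle on [-2, 2].

```python
# A minimal sketch of eigenvalue universality for symmetric random
# matrices.  The construction below is my own illustration, not
# Edelman's code; the matrix size and input distributions are arbitrary.
import numpy as np

rng = np.random.default_rng(1)
n = 500

def scaled_eigenvalues(sampler):
    """Eigenvalues of a symmetrized n x n random matrix with iid entries,
    divided by sqrt(n) so the spectrum settles onto the semicircle on [-2, 2]."""
    a = sampler((n, n))
    sym = (a + a.T) / np.sqrt(2)  # symmetrize, keeping unit off-diagonal variance
    return np.linalg.eigvalsh(sym) / np.sqrt(n)

# Two very different input distributions, both with mean 0 and variance 1 ...
gaussian_eigs = scaled_eigenvalues(lambda shape: rng.standard_normal(shape))
coinflip_eigs = scaled_eigenvalues(lambda shape: rng.choice([-1.0, 1.0], size=shape))

# ... give essentially the same output distribution of eigenvalues:
qs = [0.05, 0.25, 0.5, 0.75, 0.95]
print(np.round(np.quantile(gaussian_eigs, qs), 2))
print(np.round(np.quantile(coinflip_eigs, qs), 2))
```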
Nevertheless, Kuemmeth’s team found that random matrix theory described the measured levels very accurately. Thus, I don’t quite understand this quote: “Random matrix theory has got mathematicians like Percy Deift of New York University imagining that there might be more general patterns there too.”
While random matrix theory suggests that this is a promising approach, it also points to hidden dangers. As more and more complex data is collected, the number of variables being studied grows, and the number of apparent correlations between them grows even faster. The new idea is that mathematical theory might enable the distribution of these correlations to be understood for a general range of cases.
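The flavor of both the danger and the proposed fix can be seen in a small simulation. The sizes and the |r| > 0.2 threshold below are illustrative choices of mine: pure-noise data produce a rapidly growing number of apparently notable correlations, while one random-matrix result, the Marchenko-Pastur law, predicts how large a pure-noise eigenvalue can be, giving a baseline for separating signal from noise.

```python
# A minimal sketch of spurious correlations in pure-noise data, plus the
# random-matrix baseline.  Sizes and the |r| > 0.2 threshold are
# illustrative choices, not from the post.
import numpy as np

rng = np.random.default_rng(2)
n_cases = 100

for n_vars in (10, 50, 200):
    x = rng.standard_normal((n_cases, n_vars))   # no true correlations at all
    r = np.corrcoef(x, rowvar=False)
    off_diag = r[np.triu_indices(n_vars, k=1)]
    n_big = int((np.abs(off_diag) > 0.2).sum())
    print(f"{n_vars:4d} variables: {n_big:4d} of {off_diag.size:5d} pairs have |r| > 0.2")

# The Marchenko-Pastur law predicts that the eigenvalues of a pure-noise
# correlation matrix fall approximately in [(1 - sqrt(g))^2, (1 + sqrt(g))^2],
# where g = n_vars / n_cases; eigenvalues outside that band are candidates
# for real structure.
n_vars = 50
g = n_vars / n_cases
eigs = np.linalg.eigvalsh(
    np.corrcoef(rng.standard_normal((n_cases, n_vars)), rowvar=False))
print("observed eigenvalue range:", round(eigs.min(), 3), round(eigs.max(), 3))
print("Marchenko-Pastur band:    ", round((1 - g**0.5) ** 2, 3),
      round((1 + g**0.5) ** 2, 3))
```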
We are in fact studying the properties of hierarchical models when the number of cases and variables becomes large, and it’s a hard problem. Maybe the ideas from random matrix theory will be relevant here too. That might be an illusion, and random matrix theory could be the tool to separate what is real and what is not.