andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-815 knowledge-graph by maker-knowledge-mining

815 andrew gelman stats-2011-07-22-Statistical inference based on the minimum description length principle


meta info for this blog post

Source: html

Introduction: Tom Ball writes: Here’s another query to add to the stats backlog…Minimum Description Length (MDL). I’m attaching a 2002 Psych Rev paper on same. Basically, it’s an approach to model selection that replaces goodness of fit with generalizability or complexity. Would be great to get your response to this approach. My reply: I’ve heard about the minimum description length principle for a long time but have never really understood it. So I have nothing to say! Anyone who has anything useful to say on the topic, feel free to add in the comments. The rest of you might wonder why I posted this. I just thought it would be good for you to have some sense of the boundaries of my knowledge.


Summary: the most important sentences generated by the tf-idf model

sentIndex sentText sentNum sentScore

1 Tom Ball writes: Here’s another query to add to the stats backlog…Minimum Description Length (MDL). [sent-1, score-0.584]

2 Basically, it’s an approach to model selection that replaces goodness of fit with generalizability or complexity. [sent-3, score-0.977]

3 Would be great to get your response to this approach. [sent-4, score-0.155]

4 My reply: I’ve heard about the minimum description length principle for a long time but have never really understood it. [sent-5, score-1.424]

5 Anyone who has anything useful to say on the topic, feel free to add in the comments. [sent-7, score-0.603]

6 The rest of you might wonder why I posted this. [sent-8, score-0.359]

7 I just thought it would be good for you to have some sense of the boundaries of my knowledge. [sent-9, score-0.411]
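The extractive summary above scores each sentence with a tf-idf model. A minimal pure-Python sketch of one plausible scheme (treating each sentence as its own "document" when computing idf is an assumption about this pipeline, and the example sentences are paraphrases, not the real corpus):

```python
import math
from collections import Counter

def tfidf_sentence_scores(sentences):
    """Score each sentence by the summed tf-idf weight of its words,
    treating each sentence as a 'document' when computing idf."""
    docs = [s.lower().split() for s in sentences]
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))  # document frequency per word
    scores = []
    for d in docs:
        tf = Counter(d)
        score = sum((c / len(d)) * math.log(n / df[w]) for w, c in tf.items())
        scores.append(score)
    return scores

sentences = [
    "minimum description length replaces goodness of fit",
    "an approach to model selection",
    "feel free to add comments on the topic",
]
scores = tfidf_sentence_scores(sentences)
top = max(range(len(sentences)), key=scores.__getitem__)  # best summary sentence
```

Sentences rich in corpus-rare words score highest, which is why the distinctive MDL sentences dominate the summary above.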


similar blogs computed by the tf-idf model

tf-idf for this blog:

wordName wordTfidf (topN-words)

[('minimum', 0.324), ('length', 0.306), ('description', 0.24), ('replaces', 0.234), ('attaching', 0.223), ('rev', 0.223), ('goodness', 0.223), ('generalizability', 0.199), ('backlog', 0.195), ('psych', 0.195), ('boundaries', 0.195), ('query', 0.191), ('add', 0.191), ('ball', 0.167), ('understood', 0.148), ('tom', 0.145), ('stats', 0.143), ('principle', 0.117), ('basically', 0.115), ('rest', 0.115), ('selection', 0.113), ('posted', 0.108), ('heard', 0.105), ('knowledge', 0.104), ('anyone', 0.095), ('say', 0.095), ('wonder', 0.092), ('free', 0.091), ('response', 0.084), ('topic', 0.082), ('fit', 0.08), ('nothing', 0.079), ('approach', 0.079), ('useful', 0.079), ('feel', 0.077), ('long', 0.075), ('reply', 0.072), ('great', 0.071), ('anything', 0.07), ('never', 0.067), ('thought', 0.061), ('another', 0.059), ('sense', 0.059), ('would', 0.055), ('paper', 0.052), ('model', 0.049), ('might', 0.044), ('ve', 0.042), ('really', 0.042), ('good', 0.041)]
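Per-word weights like those above come from ranking a post's words by tf-idf against the rest of the corpus. A minimal sketch (the smoothed idf and length normalization are common choices, not necessarily what this pipeline used, and the toy corpus is hypothetical):

```python
import math
from collections import Counter

def top_tfidf_words(doc_tokens, corpus_docs, top_n=10):
    """Rank the words of one document by tf-idf against a background corpus."""
    n = len(corpus_docs)
    df = Counter(w for d in corpus_docs for w in set(d))
    tf = Counter(doc_tokens)
    weights = {
        # smoothed idf so words present in every document get weight 0
        w: (c / len(doc_tokens)) * math.log((1 + n) / (1 + df[w]))
        for w, c in tf.items()
    }
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

corpus = [
    ["minimum", "description", "length", "model", "selection"],
    ["model", "checking", "bayesian", "inference"],
    ["graphics", "visualization", "model"],
]
top = top_tfidf_words(corpus[0], corpus, top_n=3)
```

Note how a word like "model", which appears in every post, is pushed to the bottom of the ranking, matching the low weight it gets in the list above.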

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 815 andrew gelman stats-2011-07-22-Statistical inference based on the minimum description length principle


2 0.12587175 945 andrew gelman stats-2011-10-06-W’man < W’pedia, again

Introduction: Blogger Deep Climate looks at another paper by the 2002 recipient of the American Statistical Association’s Founders award. This time it’s not funny, it’s just sad. Here’s Wikipedia on simulated annealing: By analogy with this physical process, each step of the SA algorithm replaces the current solution by a random “nearby” solution, chosen with a probability that depends on the difference between the corresponding function values and on a global parameter T (called the temperature), that is gradually decreased during the process. The dependency is such that the current solution changes almost randomly when T is large, but increasingly “downhill” as T goes to zero. The allowance for “uphill” moves saves the method from becoming stuck at local minima—which are the bane of greedier methods. And here’s Wegman: During each step of the algorithm, the variable that will eventually represent the minimum is replaced by a random solution that is chosen according to a temperature

3 0.084122658 1955 andrew gelman stats-2013-07-25-Bayes-respecting experimental design and other things

Introduction: Dan Lakeland writes: I have some questions about some basic statistical ideas and would like your opinion on them: 1) Parameters that manifestly DON’T exist: It makes good sense to me to think about Bayesian statistics as narrowing in on the value of parameters based on a model and some data. But there are cases where “the parameter” simply doesn’t make sense as an actual thing. Yet, it’s not really a complete fiction, like unicorns either, it’s some kind of “effective” thing maybe. Here’s an example of what I mean. I did a simple toy experiment where we dropped crumpled up balls of paper and timed their fall times. (see here: http://models.street-artists.org/?s=falling+ball ) It was pretty instructive actually, and I did it to figure out how to in a practical way use an ODE to get a likelihood in MCMC procedures. One of the parameters in the model is the radius of the spherical ball of paper. But the ball of paper isn’t a sphere, not even approximately. There’s no single valu

4 0.084073104 2229 andrew gelman stats-2014-02-28-God-leaf-tree

Introduction: Govind Manian writes: I wanted to pass along a fragment from Lichtenberg’s Waste Books — which I am finding to be great stone soup — that reminded me of God is in Every Leaf : To the wise man nothing is great and nothing small…I believe he could write treatises on keyholes that sounded as weighty as a jus naturae and would be just as instructive. As the few adepts in such things well know, universal morality is to be found in little everyday penny-events just as much as in great ones. There is so much goodness and ingenuity in a raindrop that an apothecary wouldn’t let it go for less than half-a-crown… (Notebook B, 33)

5 0.076294787 244 andrew gelman stats-2010-08-30-Useful models, model checking, and external validation: a mini-discussion

Introduction: I sent a copy of my paper (coauthored with Cosma Shalizi) on Philosophy and the practice of Bayesian statistics in the social sciences to Richard Berk , who wrote: I read your paper this morning. I think we are pretty much on the same page about all models being wrong. I like very much the way you handle this in the paper. Yes, Newton’s work is wrong, but surely useful. I also like your twist on Bayesian methods. Makes good sense to me. Perhaps most important, your paper raises some difficult issues I have been trying to think more carefully about. 1. If the goal of a model is to be useful, surely we need to explore that “useful” means. At the very least, usefulness will depend on use. So a model that is useful for forecasting may or may not be useful for causal inference. 2. Usefulness will be a matter of degree. So that for each use we will need one or more metrics to represent how useful the model is. In what looks at first to be simple example, if the use is forecasting,

6 0.076256499 1678 andrew gelman stats-2013-01-17-Wanted: 365 stories of statistics

7 0.07565929 754 andrew gelman stats-2011-06-09-Difficulties with Bayesian model averaging

8 0.074370757 2023 andrew gelman stats-2013-09-14-On blogging

9 0.071717843 2036 andrew gelman stats-2013-09-24-“Instead of the intended message that being poor is hard, the takeaway is that rich people aren’t very good with money.”

10 0.069009304 1036 andrew gelman stats-2011-11-30-Stan uses Nuts!

11 0.06860435 395 andrew gelman stats-2010-11-05-Consulting: how do you figure out what to charge?

12 0.06711296 761 andrew gelman stats-2011-06-13-A survey’s not a survey if they don’t tell you how they did it

13 0.066806413 1392 andrew gelman stats-2012-06-26-Occam

14 0.065929458 464 andrew gelman stats-2010-12-12-Finite-population standard deviation in a hierarchical model

15 0.065759718 1241 andrew gelman stats-2012-04-02-Fixed effects and identification

16 0.064269796 1290 andrew gelman stats-2012-04-30-I suppose it’s too late to add Turing’s run-around-the-house-chess to the 2012 London Olympics?

17 0.06207183 829 andrew gelman stats-2011-07-29-Infovis vs. statgraphics: A clear example of their different goals

18 0.06124267 1299 andrew gelman stats-2012-05-04-Models, assumptions, and data summaries

19 0.059299227 1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?

20 0.058863379 1691 andrew gelman stats-2013-01-25-Extreem p-values!
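The simValue column is consistent with cosine similarity between sparse tf-idf vectors: the same-blog entry scores 0.99999988, i.e. a vector compared with (a near-copy of) itself. A sketch with sparse vectors as dicts; the toy weights below are taken from the tf-idf list above, truncated:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as word -> weight dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)

a = {"minimum": 0.324, "length": 0.306, "description": 0.240}
b = {"minimum": 0.100, "model": 0.500}   # a hypothetical other post
sim_self = cosine_similarity(a, a)       # ~1.0, like the same-blog row
sim = cosine_similarity(a, b)
```

Posts sharing few high-weight words, as in most of the list above, land well below 0.2.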


similar blogs computed by the LSI model

LSI for this blog:

topicId topicWeight

[(0, 0.116), (1, 0.008), (2, -0.018), (3, 0.013), (4, 0.012), (5, -0.001), (6, 0.041), (7, -0.014), (8, 0.055), (9, 0.003), (10, 0.035), (11, -0.004), (12, 0.002), (13, -0.002), (14, -0.022), (15, -0.014), (16, 0.006), (17, -0.009), (18, -0.002), (19, 0.008), (20, 0.005), (21, -0.025), (22, 0.005), (23, -0.005), (24, -0.019), (25, 0.007), (26, 0.003), (27, 0.016), (28, -0.017), (29, 0.018), (30, 0.015), (31, -0.029), (32, -0.002), (33, -0.003), (34, -0.034), (35, -0.012), (36, 0.03), (37, -0.0), (38, -0.0), (39, 0.033), (40, 0.039), (41, 0.022), (42, 0.012), (43, 0.0), (44, 0.011), (45, 0.023), (46, -0.003), (47, 0.002), (48, -0.024), (49, -0.024)]
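The topicId/topicWeight pairs above are the post's coordinates in a latent semantic (truncated-SVD) space. A minimal sketch of the projection step, assuming a precomputed term-to-topic loading matrix; the loadings and weights below are made up for illustration:

```python
def lsi_project(doc_vec, term_topics):
    """Project a tf-idf document vector onto LSI topic axes.

    term_topics maps each term to its loading on every topic
    (rows of the term-topic matrix from a truncated SVD)."""
    n_topics = len(next(iter(term_topics.values())))
    weights = [0.0] * n_topics
    for term, w in doc_vec.items():
        loadings = term_topics.get(term)
        if loadings:
            for t, loading in enumerate(loadings):
                weights[t] += w * loading
    return weights

term_topics = {          # hypothetical loadings for 3 topics
    "minimum":     [0.40, 0.10, -0.05],
    "description": [0.35, 0.05,  0.20],
    "model":       [0.10, 0.50,  0.15],
}
doc = {"minimum": 0.324, "description": 0.240, "model": 0.049}
topic_weights = lsi_project(doc, term_topics)
```

Unlike the raw tf-idf comparison, similarity in this low-dimensional space can link posts that share topics without sharing exact words.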

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97329336 815 andrew gelman stats-2011-07-22-Statistical inference based on the minimum description length principle


2 0.74069005 865 andrew gelman stats-2011-08-22-Blogging is “destroying the business model for quality”?

Introduction: Journalist Jonathan Rauch writes that the internet is Sturgeon squared: This is the blogosphere. I’m not getting paid to be here. I’m here to get incredibly famous (in my case, even more incredibly famous) so that I can get paid somewhere else. . . . The average quality of newspapers and (published) novels is far, far better than the average quality of blog posts (and–ugh!–comments). This is because people pay for newspapers and novels. What distinguishes newspapers and novels is how much does not get published in them, because people won’t pay for it. Payment is a filter, and a pretty good one. Imperfect, of course. But pointing out the defects of the old model is merely changing the subject if the new model is worse. . . . Yes, the new model is bringing a lot of new content into being. But most of it is bad. And it’s displacing a lot of better content, by destroying the business model for quality. Even in the information economy, there’s no free lunch. . . . Yes, there’s g

3 0.73919785 363 andrew gelman stats-2010-10-22-Graphing Likert scale responses

Introduction: Alex Hoffman writes: I am reviewing a article with a whole bunch of tables with likert scale responses. You know, the standard thing with each question on its own line, followed by 5 columns of numbers. Is there a good way to display this data graphically? OK, there’s no one best way, but can you point your readers to a few good examples? My reply: Some sort of small multiples. I’m thinking of lineplots. Maybe a grid of plots, each with three colored and labeled lines. For example, it might be a grid with 10 rows and 5 columns. To really know what to do, I’d have to have more sense of what’s being plotted. Feel free to contribute your ideas in the comments.

4 0.73199463 1704 andrew gelman stats-2013-02-03-Heuristics for identifying ecological fallacies?

Introduction: Greg Laughlin writes: My company just wrote a blog post about the ecological fallacy. There’s a discussion about it on the Hacker News message board. Someone asks, “How do you know [if a group-level finding shouldn't be used to describe individual level behavior]?” The best answer I had was “you can never tell without the individual-level data, you should always be suspicious of group-level findings applied to individuals.” Am I missing anything? Are there any situations in which you can look at group-level qualities being ascribed to individuals and not have to fear the ecological fallacy? My reply: I think that’s right. To put it another way, consider the larger model with separate coefficients for individual-level and group-level effects. If you want, you can make an assumption that they’re equal, but that’s an assumption that needs to be justified on substantive grounds. We discuss these issues a bit in this paper from 2001. (I just reread that paper. It’s pre

5 0.73024189 459 andrew gelman stats-2010-12-09-Solve mazes by starting at the exit

Introduction: It worked on this one . Good maze designers know this trick and are careful to design multiple branches in each direction. Back when I was in junior high, I used to make huge mazes, and the basic idea was to anticipate what the solver might try to do and to make the maze difficult by postponing the point at which he would realize a path was going nowhere. For example, you might have 6 branches: one dead end, two pairs that form loops going back to the start, and one that is the correct solution. You do this from both directions and add some twists and turns, and there you are. But the maze designer aiming for the naive solver–the sap who starts from the entrance and goes toward the exit–can simplify matters by just having 6 branches: five dead ends and one winner. This sort of thing is easy to solve in the reverse direction. I’m surprised the Times didn’t do better for their special puzzle issue.

6 0.72175008 482 andrew gelman stats-2010-12-23-Capitalism as a form of voluntarism

7 0.71860641 2123 andrew gelman stats-2013-12-04-Tesla fires!

8 0.71824819 2018 andrew gelman stats-2013-09-12-Do you ever have that I-just-fit-a-model feeling?

9 0.70298469 1796 andrew gelman stats-2013-04-09-The guy behind me on line for the train . . .

10 0.70267779 99 andrew gelman stats-2010-06-19-Paired comparisons

11 0.70003575 54 andrew gelman stats-2010-05-27-Hype about conditional probability puzzles

12 0.69637984 1410 andrew gelman stats-2012-07-09-Experimental work on market-based or non-market-based incentives

13 0.69594014 1597 andrew gelman stats-2012-11-29-What is expected of a consultant

14 0.69492418 910 andrew gelman stats-2011-09-15-Google Refine

15 0.69375128 415 andrew gelman stats-2010-11-15-The two faces of Erving Goffman: Subtle observer of human interactions, and Smug organzation man

16 0.6931178 505 andrew gelman stats-2011-01-05-Wacky interview questions: An exploration into the nature of evidence on the internet

17 0.69276297 2045 andrew gelman stats-2013-09-30-Using the aggregate of the outcome variable as a group-level predictor in a hierarchical model

18 0.69143099 295 andrew gelman stats-2010-09-25-Clusters with very small numbers of observations

19 0.68884164 640 andrew gelman stats-2011-03-31-Why Edit Wikipedia?

20 0.68707097 1666 andrew gelman stats-2013-01-10-They’d rather be rigorous than right


similar blogs computed by the LDA model

LDA for this blog:

topicId topicWeight

[(13, 0.021), (15, 0.047), (16, 0.013), (18, 0.05), (24, 0.224), (58, 0.174), (68, 0.021), (72, 0.028), (84, 0.051), (99, 0.246)]
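Under the LDA model, each post gets a sparse mixture of topic weights that sums to roughly one, as the list above does. The sketch below estimates such a mixture by softly assigning each word to topics in proportion to fitted per-topic word probabilities; this one-pass normalization is a simplification of real LDA inference, and the probabilities are hypothetical:

```python
def topic_mixture(doc_words, topic_word_probs, alpha=0.1):
    """Estimate a document's topic weights from fitted per-topic word
    probabilities (a one-pass simplification of LDA inference)."""
    n_topics = len(topic_word_probs)
    counts = [alpha] * n_topics          # Dirichlet-style smoothing
    for w in doc_words:
        probs = [t.get(w, 1e-9) for t in topic_word_probs]
        z = sum(probs)
        for k in range(n_topics):
            counts[k] += probs[k] / z    # soft assignment of this word
    total = sum(counts)
    return [c / total for c in counts]

topics = [                               # hypothetical fitted topics
    {"model": 0.3, "selection": 0.2, "inference": 0.1},
    {"blog": 0.4, "comments": 0.3},
]
weights = topic_mixture(["model", "selection", "comments"], topics)
```

Two posts can then be compared by the distance between their mixtures, which is what the simValue ranking below plausibly reflects.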

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.92965978 815 andrew gelman stats-2011-07-22-Statistical inference based on the minimum description length principle


2 0.91235906 119 andrew gelman stats-2010-06-30-Why is George Apley overrated?

Introduction: A comment by Mark Palko reminded me that, while I’m a huge Marquand fan, I think The Late George Apley is way overrated. My theory is that Marquand’s best books don’t fit into the modernist way of looking at literature, and that the gatekeepers of the 1930s and 1940s, when judging Marquand by these standards, conveniently labeled Apley as his best book because it had a form–Edith-Wharton-style satire–that they could handle. In contrast, Point of No Return and all the other classics are a mixture of seriousness and satire that left critics uneasy. Perhaps there’s a way to study this sort of thing more systematically?

3 0.89274865 574 andrew gelman stats-2011-02-14-“The best data visualizations should stand on their own”? I don’t think so.

Introduction: Jimmy pointed me to this blog by Drew Conway on word clouds. I don’t have much to say about Conway’s specifics–word clouds aren’t really my thing, but I’m glad that people are thinking about how to do them better–but I did notice one phrase of his that I’ll dispute. Conway writes The best data visualizations should stand on their own . . . I disagree. I prefer the saying, “A picture plus 1000 words is better than two pictures or 2000 words.” That is, I see a positive interaction between words and pictures or, to put it another way, diminishing returns for words or pictures on their own. I don’t have any big theory for this, but I think, when expressed as a joint value function, my idea makes sense. Also, I live this suggestion in my own work. I typically accompany my graphs with long captions and I try to accompany my words with pictures (although I’m not doing it here, because with the software I use, it’s much easier to type more words than to find, scale, and insert i

4 0.88406062 1966 andrew gelman stats-2013-08-03-Uncertainty in parameter estimates using multilevel models

Introduction: David Hsu writes: I have a (perhaps) simple question about uncertainty in parameter estimates using multilevel models — what is an appropriate threshold for measure parameter uncertainty in a multilevel model? The reason why I ask is that I set out to do a crossed two-way model with two varying intercepts, similar to your flight simulator example in your 2007 book. The difference is that I have a lot of predictors specific to each cell (I think equivalent to airport and pilot in your example), and I find after modeling this in JAGS, I happily find that the predictors are much less important than the variability by cell (airport and pilot effects). Happily because this is what I am writing a paper about. However, I then went to check subsets of predictors using lm() and lmer(). I understand that they all use different estimation methods, but what I can’t figure out is why the errors on all of the coefficient estimates are *so* different. For example, using JAGS, and th

5 0.87853253 1167 andrew gelman stats-2012-02-14-Extra babies on Valentine’s Day, fewer on Halloween?

Introduction: Just in time for the holiday, X pointed me to an article by Becca Levy, Pil Chung, and Martin Slade reporting that, during a recent eleven-year period, more babies were born on Valentine’s Day and fewer on Halloween compared to neighboring days: What I’d really like to see is a graph with all 366 days of the year. It would be easy enough to make. That way we could put the Valentine’s and Halloween data in the context of other possible patterns. While they’re at it, they could also graph births by day of the week and show Thanksgiving, Easter, and other holidays that don’t have fixed dates. It’s so frustrating when people only show part of the story. The data are publicly available, so maybe someone could make those graphs? If the Valentine’s/Halloween data are worth publishing, I think more comprehensive graphs should be publishable as well. I’d post them here, that’s for sure.

6 0.8734799 1886 andrew gelman stats-2013-06-07-Robust logistic regression

7 0.86549962 1087 andrew gelman stats-2011-12-27-“Keeping things unridiculous”: Berger, O’Hagan, and me on weakly informative priors

8 0.86055982 197 andrew gelman stats-2010-08-10-The last great essayist?

9 0.858284 847 andrew gelman stats-2011-08-10-Using a “pure infographic” to explore differences between information visualization and statistical graphics

10 0.85725367 414 andrew gelman stats-2010-11-14-“Like a group of teenagers on a bus, they behave in public as if they were in private”

11 0.85577929 2312 andrew gelman stats-2014-04-29-Ken Rice presents a unifying approach to statistical inference and hypothesis testing

12 0.8551023 1224 andrew gelman stats-2012-03-21-Teaching velocity and acceleration

13 0.85466427 278 andrew gelman stats-2010-09-15-Advice that might make sense for individuals but is negative-sum overall

14 0.85456514 1062 andrew gelman stats-2011-12-16-Mr. Pearson, meet Mr. Mandelbrot: Detecting Novel Associations in Large Data Sets

15 0.85415339 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

16 0.85351825 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters

17 0.85345006 1208 andrew gelman stats-2012-03-11-Gelman on Hennig on Gelman on Bayes

18 0.85337991 2247 andrew gelman stats-2014-03-14-The maximal information coefficient

19 0.85333306 1240 andrew gelman stats-2012-04-02-Blogads update

20 0.85290337 1838 andrew gelman stats-2013-05-03-Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism