1542 andrew gelman stats-2012-10-20-A statistical model for underdispersion


meta information for this blog

Source: html

Introduction: We have lots of models for overdispersed count data but we rarely see underdispersed data. But now I know what example I’ll be giving when this next comes up in class. From a book review by Theo Tait: A number of shark species go in for oophagy, or uterine cannibalism. Sand tiger foetuses ‘eat each other in utero, acting out the harshest form of sibling rivalry imaginable’. Only two babies emerge, one from each of the mother shark’s uteruses: the survivors have eaten everything else. ‘A female sand tiger gives birth to a baby that’s already a metre long and an experienced killer,’ explains Demian Chapman, an expert on the subject. That’s what I call underdispersion. E(y)=2, var(y)=0. Take that, M. Poisson!
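To make the contrast with the Poisson concrete, here is a minimal sketch that compares the variance-to-mean ratio (the dispersion index) of a constant litter size of two with that of simulated Poisson counts of the same mean. The names `dispersion_index`, `poisson_draw`, `shark_litters`, and `poisson_litters` are made up for illustration, the litter data are hypothetical (every litter is simply set to 2, as the quoted review describes), and the sample sizes and seed are arbitrary.

```python
import math
import random
from statistics import mean, pvariance


def dispersion_index(counts):
    """Variance-to-mean ratio: about 1 for Poisson counts, below 1 if underdispersed."""
    return pvariance(counts) / mean(counts)


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
shark_litters = [2] * 1000  # hypothetical data: every litter has exactly two pups
poisson_litters = [poisson_draw(rng, 2.0) for _ in range(10_000)]

print(dispersion_index(shark_litters))    # 0.0, since E(y)=2 and var(y)=0
print(dispersion_index(poisson_litters))  # close to 1 for Poisson counts
```

A Poisson model with mean 2 forces the variance to be 2 as well, so an index near 0 is as far from Poisson as count data can get in the underdispersed direction.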


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 We have lots of models for overdispersed count data but we rarely see underdispersed data. [sent-1, score-0.573]

2 But now I know what example I’ll be giving when this next comes up in class. [sent-2, score-0.282]

3 From a book review by Theo Tait: A number of shark species go in for oophagy, or uterine cannibalism. [sent-3, score-0.735]

4 Sand tiger foetuses ‘eat each other in utero, acting out the harshest form of sibling rivalry imaginable’. [sent-4, score-1.144]

5 Only two babies emerge, one from each of the mother shark’s uteruses: the survivors have eaten everything else. [sent-5, score-0.591]

6 ‘A female sand tiger gives birth to a baby that’s already a metre long and an experienced killer,’ explains Demian Chapman, an expert on the subject. [sent-6, score-1.595]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('shark', 0.364), ('sand', 0.351), ('tiger', 0.324), ('imaginable', 0.202), ('harshest', 0.202), ('rivalry', 0.202), ('sibling', 0.202), ('overdispersed', 0.182), ('eaten', 0.182), ('emerge', 0.175), ('chapman', 0.17), ('var', 0.166), ('killer', 0.142), ('mother', 0.142), ('acting', 0.139), ('species', 0.139), ('poisson', 0.137), ('eat', 0.136), ('birth', 0.135), ('babies', 0.133), ('experienced', 0.126), ('female', 0.126), ('baby', 0.125), ('rarely', 0.116), ('count', 0.112), ('explains', 0.109), ('expert', 0.095), ('giving', 0.084), ('gives', 0.079), ('call', 0.076), ('everything', 0.076), ('form', 0.075), ('review', 0.075), ('next', 0.067), ('comes', 0.066), ('already', 0.064), ('long', 0.061), ('number', 0.059), ('lots', 0.059), ('book', 0.053), ('take', 0.052), ('models', 0.05), ('ll', 0.046), ('go', 0.045), ('two', 0.038), ('example', 0.033), ('know', 0.032), ('data', 0.027), ('see', 0.027), ('one', 0.02)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 1542 andrew gelman stats-2012-10-20-A statistical model for underdispersion

2 0.11432963 144 andrew gelman stats-2010-07-13-Hey! Here’s a referee report for you!

Introduction: I just wrote this, and I realized it might be useful more generally: The article looks reasonable to me–but I just did a shallow read and didn’t try to judge whether the conclusions are correct. My main comment is that if they’re doing a Poisson regression, they should really be doing an overdispersed Poisson regression. I don’t know if I’ve ever seen data in my life where the non-overdispersed Poisson is appropriate. Also, I’d like to see a before-after plot with dots for control cases and open circles for treatment cases and fitted regression lines drawn in. Whenever there’s a regression I like to see this scatterplot. The scatterplot isn’t a replacement for the regression, but at the very least it gives me intuition as to the scale of the estimated effect. Finally, all their numbers should be rounded appropriately. Feel free to cut-and-paste this into your own referee reports (and to apply these recommendations in your own applied research).

3 0.080515265 1734 andrew gelman stats-2013-02-23-Life in the C-suite: A graph that is both ugly and bad, and an unrelated story

Introduction: James Keirstead sends along this infographic: He hates it: First we’ve got an hourglass metaphor wrecked by the fact that “now” (i.e. the pinch point in the glass) is actually 3-5 years in the future and the past sand includes “up to three years” in the future. Then there are the percentages, which appear to represent a vertical distance, not volume of sand or width of the hourglass. Add to that a strange color scheme in which green goes from dark to light to dark again. I know January’s not even finished yet, but surely a competitor for worst infographic of 2013? Keirstead doesn’t even comment on what I see as the worst aspect of the graph, which is that the “3-5 years” band is the narrowest on the graph, but expressed as a per-year rate it is actually the highest of all the percentages. The hourglass visualization does the astounding feat of taking the period where the executives expect the highest rate of change and presenting it as a minimum in the graph.

4 0.07736782 1494 andrew gelman stats-2012-09-13-Watching the sharks jump

Introduction: Recently in the sister blog: Niall Ferguson is a hack. Niall Ferguson is not always a hack, sometimes he just makes silly mistakes. Paul Krugman is not a hack, but sometimes he goes over the top. Reflections on hacks. P.S. Yes, technically I’m misusing the expression, it should really be something like, “Watching the sharks get jumped.” But I liked the image of the jumping shark.

5 0.070653498 1294 andrew gelman stats-2012-05-01-Modeling y = a + b + c

Introduction: Brandon Behlendorf writes: I [Behlendorf] am replicating some previous research using OLS [he's talking about what we call "linear regression"---ed.] to regress a logged rate (to reduce skew) of Y on a number of predictors (Xs). Y is the count of a phenomena divided by the population of the unit of the analysis. The problem that I am encountering is that Y is composite count of a number of distinct phenomena [A+B+C], and these phenomena are not uniformly distributed across the sample. Most of the research in this area has conducted regressions either with Y or with individual phenomena [A or B or C] as the dependent variable. Yet it seems that if [A, B, C] are not uniformly distributed across the sample of units in the same proportion, then the use of Y would be biased, since as a count of [A+B+C] divided by the population, it would treat as equivalent units both [2+0.5+1.5] and [4+0+0]. My goal is trying to find a methodology which allows a researcher to regress Y on a

6 0.069639787 1387 andrew gelman stats-2012-06-21-Will Tiger Woods catch Jack Nicklaus? And a discussion of the virtues of using continuous data even if your goal is discrete prediction

7 0.069000259 203 andrew gelman stats-2010-08-12-John McPhee, the Anti-Malcolm

8 0.064496115 1249 andrew gelman stats-2012-04-06-Thinking seriously about social science research

9 0.057559542 922 andrew gelman stats-2011-09-24-Economists don’t think like accountants—but maybe they should

10 0.054865841 1521 andrew gelman stats-2012-10-04-Columbo does posterior predictive checks

11 0.053968243 2333 andrew gelman stats-2014-05-13-Personally, I’d rather go with Teragram

12 0.053680193 2043 andrew gelman stats-2013-09-29-The difficulties of measuring just about anything

13 0.053678326 1102 andrew gelman stats-2012-01-06-Bayesian Anova found useful in ecology

14 0.05362048 2139 andrew gelman stats-2013-12-19-Happy birthday

15 0.053410873 4 andrew gelman stats-2010-04-26-Prolefeed

16 0.052437358 1369 andrew gelman stats-2012-06-06-Your conclusion is only as good as your data

17 0.05066919 770 andrew gelman stats-2011-06-15-Still more Mr. P in public health

18 0.050247595 1664 andrew gelman stats-2013-01-10-Recently in the sister blog: Brussels sprouts, ugly graphs, and switched at birth

19 0.050073378 859 andrew gelman stats-2011-08-18-Misunderstanding analysis of covariance

20 0.049929231 1819 andrew gelman stats-2013-04-23-Charles Murray’s “Coming Apart” and the measurement of social and political divisions


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.063), (1, -0.01), (2, -0.01), (3, 0.022), (4, 0.02), (5, 0.009), (6, 0.018), (7, -0.008), (8, 0.025), (9, 0.014), (10, 0.006), (11, 0.002), (12, -0.001), (13, -0.006), (14, 0.034), (15, 0.001), (16, 0.01), (17, 0.0), (18, 0.019), (19, -0.028), (20, -0.01), (21, 0.02), (22, -0.006), (23, -0.007), (24, 0.001), (25, 0.021), (26, -0.003), (27, -0.01), (28, 0.002), (29, -0.008), (30, -0.022), (31, -0.0), (32, -0.007), (33, -0.001), (34, 0.014), (35, 0.016), (36, -0.001), (37, -0.003), (38, 0.002), (39, 0.003), (40, -0.01), (41, -0.009), (42, -0.005), (43, -0.003), (44, 0.014), (45, 0.029), (46, 0.003), (47, -0.019), (48, 0.014), (49, 0.025)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.92126822 1542 andrew gelman stats-2012-10-20-A statistical model for underdispersion

2 0.74610603 102 andrew gelman stats-2010-06-21-Why modern art is all in the mind

Introduction: This looks cool: Ten years ago researchers in America took two groups of three-year-olds and showed them a blob of paint on a canvas. Children who were told that the marks were the result of an accidental spillage showed little interest. The others, who had been told that the splodge of colour had been carefully created for them, started to refer to it as “a painting”. Now that experiment . . . has gone on to form part of the foundation of an influential new book that questions the way in which we respond to art. . . . The book, which is subtitled The New Science of Why We Like What We Like, is not an attack on modern or contemporary art and Bloom says fans of more traditional art are not capable of making purely aesthetic judgments either. “I don’t have a strong position about the art itself,” he said this weekend. “But I do have a strong position about why we actually like it.” This sounds fascinating. But I’m skeptical about this part: Humans are incapable of just getti

3 0.72098595 1179 andrew gelman stats-2012-02-21-“Readability” as freedom from the actual sensation of reading

Introduction: In her essay on Margaret Mitchell and Gone With the Wind, Claudia Roth Pierpoint writes: The much remarked “readability” of the book must have played a part in this smooth passage from the page to the screen, since “readability” has to do not only with freedom from obscurity but, paradoxically, with freedom from the actual sensation of reading [emphasis added]—of the tug and traction of words as they move thoughts into place in the mind. Requiring, in fact, the least reading, the most “readable” book allows its characters to slip easily through nets of words and into other forms. Popular art has been well defined by just this effortless movement from medium to medium, which is carried out, as Leslie Fiedler observed in relation to Uncle Tom’s Cabin, “without loss of intensity or alteration of meaning.” Isabel Archer rises from the page only in the hanging garments of Henry James’s prose, but Scarlett O’Hara is a free woman. Well put. I wish Pierpoint would come out with ano

4 0.71794486 2164 andrew gelman stats-2014-01-09-Hermann Goering and Jane Jacobs, together at last!

Introduction: Hermann Goering is famous for two things: 1. Being an air force general, and 2. Being a really bad air force general. What does this have to do, you may ask, with Jane Jacobs, who is famous for a book she wrote in the early 1960s advocating small, mixed-use street-level city development, in contrast to the mega-projects that were advocated by many influential planners at the time. The connection is, as a London-based friend pointed out to me the other day, that the German bombing of London in WW2 knocked out random sections all over the city, which were then often replaced by various public developments. The knocked-out portions were often small, so there was not always room for megablocks to replace them, and they were scattered—so the new housing was also distributed haphazardly all over the city. Thus, Goering helped in two ways, corresponding to the two numbered points listed above: 1. His air force dropped bombs and destroyed buildings all over London. 2. His at

5 0.71396095 203 andrew gelman stats-2010-08-12-John McPhee, the Anti-Malcolm

Introduction: This blog is threatening to turn into Statistical Modeling, Causal Inference, Social Science, and Literature Criticism, but I’m just going to go with the conversational flow, so here’s another post about an essayist. I’m not a big fan of Janet Malcolm’s essays — and I don’t mean I don’t like her attitude or her pro-murderer attitude, I mean I don’t like them all that much as writing. They’re fine, I read them, they don’t bore me, but I certainly don’t think she’s “our” best essayist. But that’s not a debate I want to have right now, and if I did I’m quite sure most of you wouldn’t want to read it anyway. So instead, I’ll just say something about John McPhee. As all right-thinking people agree, in McPhee’s long career he has written two kinds of books: good, short books, and bad, long books. (He has also written many New Yorker essays, and perhaps other essays for other magazines too; most of these are good, although I haven’t seen any really good recent work from him, and so

6 0.71060097 1790 andrew gelman stats-2013-04-06-Calling Jenny Davidson . . .

7 0.70953637 2021 andrew gelman stats-2013-09-13-Swiss Jonah Lehrer

8 0.70909894 115 andrew gelman stats-2010-06-28-Whassup with those crappy thrillers?

9 0.70166904 655 andrew gelman stats-2011-04-10-“Versatile, affordable chicken has grown in popularity”

10 0.69973153 2168 andrew gelman stats-2014-01-12-Things that I like that almost nobody else is interested in

11 0.69828713 2189 andrew gelman stats-2014-01-28-History is too important to be left to the history professors

12 0.69819164 1390 andrew gelman stats-2012-06-23-Traditionalist claims that modern art could just as well be replaced by a “paint-throwing chimp”

13 0.69758058 1783 andrew gelman stats-2013-03-31-He’s getting ready to write a book

14 0.69177318 285 andrew gelman stats-2010-09-18-Fiction is not for tirades? Tell that to Saul Bellow!

15 0.69081128 1641 andrew gelman stats-2012-12-27-The Möbius strip, or, marketing that is impervious to criticism

16 0.68927467 1925 andrew gelman stats-2013-07-04-“Versatile, affordable chicken has grown in popularity”

17 0.68871701 1970 andrew gelman stats-2013-08-06-New words of 1917

18 0.67991406 499 andrew gelman stats-2011-01-03-5 books

19 0.67867744 893 andrew gelman stats-2011-09-06-Julian Symons on Frances Newman

20 0.67851764 16 andrew gelman stats-2010-05-04-Burgess on Kipling


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.049), (16, 0.14), (21, 0.017), (24, 0.035), (41, 0.021), (57, 0.411), (65, 0.02), (82, 0.016), (84, 0.022), (98, 0.02), (99, 0.121)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.91432929 891 andrew gelman stats-2011-09-05-World Bank data now online

Introduction: Wayne Folta writes that the World Bank is opening up some of its data for researchers.

same-blog 2 0.89623272 1542 andrew gelman stats-2012-10-20-A statistical model for underdispersion

3 0.77494419 1146 andrew gelman stats-2012-01-30-Convenient page of data sources from the Washington Post

Introduction: Wayne Folta points us to this list.

4 0.72992969 2039 andrew gelman stats-2013-09-25-Harmonic convergence

Introduction: Diederik Stapel gives a Ted talk. Sometimes, reality truly is a parody of reality.

5 0.6376996 1043 andrew gelman stats-2011-12-06-Krugman disses Hayek as “being almost entirely about politics rather than economics”

Introduction: That’s ok, Krugman earlier slammed Galbraith. (I wonder if Krugman is as big a fan of “tough choices” now as he was in 1996.) Given Krugman’s politicization in recent years, I’m surprised he’s so dismissive of the political (rather than technical-economic) nature of Hayek’s influence. (I don’t know if he’s changed his views on Galbraith in recent years.) P.S. Greg Mankiw, in contrast, labels Galbraith and Hayek as “two of the great economists of the 20th century” and writes, “even though their most famous works were written many decades ago, they are still well worth reading today.”

6 0.5933187 1101 andrew gelman stats-2012-01-05-What are the standards for reliability in experimental psychology?

7 0.58381492 1485 andrew gelman stats-2012-09-06-One reason New York isn’t as rich as it used to be: Redistribution of federal tax money to other states

8 0.56576812 1044 andrew gelman stats-2011-12-06-The K Foundation burns Cosma’s turkey

9 0.55843931 1018 andrew gelman stats-2011-11-19-Tempering and modes

10 0.5510897 215 andrew gelman stats-2010-08-18-DataMarket

11 0.54872334 861 andrew gelman stats-2011-08-19-Will Stan work well with 40×40 matrices?

12 0.50572699 1460 andrew gelman stats-2012-08-16-“Real data can be a pain”

13 0.5057227 1870 andrew gelman stats-2013-05-26-How to understand coefficients that reverse sign when you start controlling for things?

14 0.49973339 1821 andrew gelman stats-2013-04-24-My talk midtown this Friday noon (and at Columbia Monday afternoon)

15 0.49870741 1120 andrew gelman stats-2012-01-15-Fun fight over the Grover search algorithm

16 0.49186563 1036 andrew gelman stats-2011-11-30-Stan uses Nuts!

17 0.4893434 306 andrew gelman stats-2010-09-29-Statistics and the end of time

18 0.48636699 989 andrew gelman stats-2011-11-03-This post does not mention Wegman

19 0.48387581 2318 andrew gelman stats-2014-05-04-Stan (& JAGS) Tutorial on Linear Mixed Models

20 0.4744935 816 andrew gelman stats-2011-07-22-“Information visualization” vs. “Statistical graphics”