andrew_gelman_stats andrew_gelman_stats-2010 andrew_gelman_stats-2010-318 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Very freakonomic (and I mean that in the best sense of the word).
sentIndex sentText sentNum sentScore
1 Very freakonomic (and I mean that in the best sense of the word). [sent-1, score-1.222]
wordName wordTfidf (topN-words)
[('word', 0.707), ('mean', 0.439), ('best', 0.416), ('sense', 0.367)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 318 andrew gelman stats-2010-10-04-U-Haul statistics
Introduction: Very freakonomic (and I mean that in the best sense of the word).
2 0.24407907 1191 andrew gelman stats-2012-03-01-Hoe noem je?
Introduction: Gerrit Storms reports on an interesting linguistic research project in which you can participate! Here’s the description: Over the past few weeks, we have been trying to set up a scientific study that is important for many researchers interested in words, word meaning, semantics, and cognitive science in general. It is a huge word association project, in which people are asked to participate in a small task that doesn’t last longer than 5 minutes. Our goal is to build a global word association network that contains connections between about 40,000 words, the size of the lexicon of an average adult. Setting up such a network might learn us a lot about semantic memory, how it develops, and maybe also about how it can deteriorate (like in Alzheimer’s disease). Most people enjoy doing the task, but we need thousands of participants to succeed. Up till today, we found about 53,000 participants willing to do the little task, but we need more subjects. That is why we address you. Would
3 0.2344057 77 andrew gelman stats-2010-06-09-Sof[t]
Introduction: Joe Fruehwald writes: I’m working with linguistic data, specifically binomial hits and misses of a certain variable for certain words (specifically whether or not the “t” sound was pronounced at the end of words like “soft”). Word frequency follows a power law, with most words appearing just once, and with some words being hyperfrequent. I’m not interested in specific word effects, but I am interested in the effect of word frequency. A logistic model fit is going to be heavily influenced by the effect of the hyperfrequent words which constitute only one type. To control for the item effect, I would fit a multilevel model with a random intercept by word, but like I said, most of the words appear only once. Is there a principled approach to this problem? My response: It’s ok to fit a multilevel model even if most groups only have one observation each. You’ll want to throw in some word-level predictors too. Think of the multilevel model not as a substitute for the usual thoug
4 0.16849473 476 andrew gelman stats-2010-12-19-Google’s word count statistics viewer
Introduction: Word count stats from the Google books database prove that Bayesianism is expanding faster than the universe. An n-gram is a tuple of n words.
5 0.15739523 742 andrew gelman stats-2011-06-02-Grouponomics, counterfactuals, and opportunity cost
Introduction: I keep encountering the word “Groupon”–I think it’s some sort of pets.com-style commercial endeavor where people can buy coupons? I don’t really care, and I’ve avoided googling the word out of a general animosity toward our society’s current glorification of get-rich-quick schemes. (As you can tell, I’m still bitter about that whole stock market thing.) Anyway, even without knowing what Groupon actually is, I enjoyed this blog by Kaiser Fung in which he tries to work out some of its economic consequences. He connects the statistical notion of counterfactuals to the concept of opportunity cost from economics. The comments are interesting too.
6 0.14350152 255 andrew gelman stats-2010-09-04-How does multilevel modeling affect the estimate of the grand mean?
7 0.12971792 2292 andrew gelman stats-2014-04-15-When you believe in things that you don’t understand
8 0.12817807 574 andrew gelman stats-2011-02-14-“The best data visualizations should stand on their own”? I don’t think so.
9 0.12737486 1796 andrew gelman stats-2013-04-09-The guy behind me on line for the train . . .
10 0.11035375 727 andrew gelman stats-2011-05-23-My new writing strategy
11 0.10601071 810 andrew gelman stats-2011-07-20-Adding more information can make the variance go up (depending on your model)
12 0.1059058 503 andrew gelman stats-2011-01-04-Clarity on my email policy
13 0.099066868 312 andrew gelman stats-2010-10-02-“Regression to the mean” is fine. But what’s the “mean”?
14 0.088621438 938 andrew gelman stats-2011-10-03-Comparing prediction errors
15 0.087593555 1808 andrew gelman stats-2013-04-17-Excel-bashing
16 0.086996496 2234 andrew gelman stats-2014-03-05-Plagiarism, Arizona style
17 0.084047832 1725 andrew gelman stats-2013-02-17-“1.7%” ha ha ha
18 0.081449792 738 andrew gelman stats-2011-05-30-Works well versus well understood
19 0.080053307 1317 andrew gelman stats-2012-05-13-Question 3 of my final exam for Design and Analysis of Sample Surveys
20 0.076783068 1933 andrew gelman stats-2013-07-10-Please send all comments to -dev-ripley
topicId topicWeight
[(0, 0.062), (1, 0.004), (2, 0.008), (3, 0.005), (4, 0.002), (5, -0.013), (6, 0.041), (7, 0.008), (8, 0.024), (9, 0.005), (10, -0.014), (11, -0.002), (12, 0.02), (13, -0.015), (14, -0.027), (15, -0.001), (16, 0.013), (17, -0.006), (18, 0.015), (19, 0.008), (20, 0.026), (21, -0.027), (22, 0.031), (23, 0.03), (24, -0.013), (25, -0.001), (26, 0.028), (27, 0.022), (28, -0.012), (29, 0.008), (30, -0.027), (31, 0.033), (32, -0.025), (33, 0.012), (34, -0.001), (35, 0.013), (36, -0.002), (37, -0.025), (38, -0.014), (39, 0.007), (40, 0.056), (41, -0.002), (42, -0.023), (43, -0.016), (44, -0.035), (45, 0.004), (46, 0.033), (47, 0.012), (48, -0.007), (49, -0.007)]
simIndex simValue blogId blogTitle
same-blog 1 0.99075276 318 andrew gelman stats-2010-10-04-U-Haul statistics
Introduction: Very freakonomic (and I mean that in the best sense of the word).
2 0.61599565 668 andrew gelman stats-2011-04-19-The free cup and the extra dollar: A speculation in philosophy
Introduction: The following is an essay into a topic I know next to nothing about. As part of our endless discussion of Dilbert and Charlie Sheen, commenter Fraac linked to a blog by philosopher Edouard Machery, who tells a fascinating story: How do we think about the intentional nature of actions? And how do people with an impaired mindreading capacity think about it? Consider the following probes: The Free-Cup Case Joe was feeling quite dehydrated, so he stopped by the local smoothie shop to buy the largest sized drink available. Before ordering, the cashier told him that if he bought a Mega-Sized Smoothie he would get it in a special commemorative cup. Joe replied, ‘I don’t care about a commemorative cup, I just want the biggest smoothie you have.’ Sure enough, Joe received the Mega-Sized Smoothie in a commemorative cup. Did Joe intentionally obtain the commemorative cup? The Extra-Dollar Case Joe was feeling quite dehydrated, so he stopped by the local smoothie shop to buy
3 0.59135348 157 andrew gelman stats-2010-07-21-Roller coasters, charity, profit, hmmm
Introduction: Dan Kahan writes: Here is a very interesting article from Science that reports result of experiment that looked at whether people bought a product (picture of themselves screaming or vomiting on roller coaster) or paid more for it when told “1/2 to charity.” Answer was “buy more” but “pay lots less” than when alternative was fixed price w/ or w/o charity; and “buy more” & “pay more” if consumer could name own price & 1/2 went to charity than if none went to charity. Pretty interesting. But . . . What’s odd, I [Kahan] think, is the measure used to report the result. The paper (written by some really amazingly good social psychologists; I know this from other studies) goes on & on, w/ figures & tables, about how the amusement park’s “revenue,” “revenue per ride” & “profit” went up by large amount when it used “name your own price & 1/2 to charity.” Yet that result is dominated by random effects — the marginal cost & volume of sales are peculiar to the product being sold &
4 0.58866262 138 andrew gelman stats-2010-07-10-Creating a good wager based on probability estimates
Introduction: Suppose you and I agree on a probability estimate…perhaps we both agree there is a 2/3 chance Spain will beat Netherlands in tomorrow’s World Cup. In this case, we could agree on a wager: if Spain beats Netherlands, I pay you $x. If Netherlands beats Spain, you pay me $2x. It is easy to see that my expected loss (or win) is $0, and that the same is true for you. Either of us should be indifferent to taking this bet, and to which side of the bet we are on. We might make this bet just to increase our interest in watching the game, but neither of us would see a money-making opportunity here. By the way, the relationship between “odds” and the event probability — a 1/3 chance of winning turning into a bet at 2:1 odds — is that if the event probability is p, then a fair bet has odds of (1/p – 1):1. More interesting, and more relevant to many real-world situations, is the case that we disagree on the probability of an event. If we disagree on the probability, then there should be
5 0.58806264 1759 andrew gelman stats-2013-03-12-How tall is Jon Lee Anderson?
Introduction: The second best thing about this story (from Tom Scocca) is that Anderson spells “Tweets” with a capital T. But the best thing is that Scocca is numerate—he compares numbers on the logarithmic scale: Reminding Lake that he only had 169 Twitter followers was the saddest gambit of all. Jon Lee Anderson has 17,866 followers. And Kim Kardashian has, as I write this, 17,489,892 followers. That is: Jon Lee Anderson is 1/1,000 as important on Twitter, by his own standard, as Kim Kardashian. He is 10 times closer to Mitch Lake than he is to Kim Kardashian. How often do we see a popular journalist who understands orders of magnitude? Good job, Tom Scocca! P.S. Based on his “little twerp” comment, I also wonder if Anderson suffers from tall person syndrome—that’s the problem that some people of above-average height have, that they think they’re more important than other people because they literally look down on them. Don’t get me wrong—I have lots of tall friends who are complete
6 0.57311106 1089 andrew gelman stats-2011-12-28-Path sampling for models of varying dimension
7 0.5725857 1105 andrew gelman stats-2012-01-08-Econ debate about prices at a fancy restaurant
8 0.56097811 2338 andrew gelman stats-2014-05-19-My short career as a Freud expert
9 0.55967295 607 andrew gelman stats-2011-03-11-Rajiv Sethi on the interpretation of prediction market data
10 0.55501544 2070 andrew gelman stats-2013-10-20-The institution of tenure
11 0.55467099 2128 andrew gelman stats-2013-12-09-How to model distributions that have outliers in one direction
12 0.55169326 767 andrew gelman stats-2011-06-15-Error in an attribution of an error
13 0.54487234 1252 andrew gelman stats-2012-04-08-Jagdish Bhagwati’s definition of feminist sincerity
14 0.54270518 574 andrew gelman stats-2011-02-14-“The best data visualizations should stand on their own”? I don’t think so.
15 0.54259324 719 andrew gelman stats-2011-05-19-Everything is Obvious (once you know the answer)
16 0.54166383 343 andrew gelman stats-2010-10-15-?
17 0.54138607 1453 andrew gelman stats-2012-08-10-Quotes from me!
18 0.53469527 1359 andrew gelman stats-2012-06-02-Another retraction
19 0.5337885 1424 andrew gelman stats-2012-07-22-Extreme events as evidence for differences in distributions
20 0.53300059 341 andrew gelman stats-2010-10-14-Confusion about continuous probability densities
topicId topicWeight
[(24, 0.251), (99, 0.415)]
simIndex simValue blogId blogTitle
1 0.99999905 1733 andrew gelman stats-2013-02-22-Krugman sets the bar too high
Introduction: If being cantankerous and potty-mouthed is a bad thing, I’m in big trouble!
Introduction: From 2.5 years ago. Read all the comments; the discussion is helpful.
3 0.99753284 2283 andrew gelman stats-2014-04-06-An old discussion of food deserts
Introduction: I happened to be reading an old comment thread from 2012 (follow the link from here) and came across this amusing exchange: Perhaps this is the paper Jonathan was talking about? Here’s more from the thread: Anyway, I don’t have anything to add right now, I just thought it was an interesting discussion.
4 0.99734741 408 andrew gelman stats-2010-11-11-Incumbency advantage in 2010
Introduction: See here for the full story.
Introduction: Here’s the story. P.S. Some sociologists discuss the case here.
6 0.9953779 2200 andrew gelman stats-2014-02-05-Prior distribution for a predicted probability
8 0.99157691 963 andrew gelman stats-2011-10-18-Question on Type M errors
9 0.991243 1941 andrew gelman stats-2013-07-16-Priors
10 0.99091899 259 andrew gelman stats-2010-09-06-Inbox zero. Really.
11 0.9907589 1150 andrew gelman stats-2012-02-02-The inevitable problems with statistical significance and 95% intervals
12 0.99053407 970 andrew gelman stats-2011-10-24-Bell Labs
13 0.9902122 86 andrew gelman stats-2010-06-14-“Too much data”?
14 0.98913157 2295 andrew gelman stats-2014-04-18-One-tailed or two-tailed?
15 0.98911691 63 andrew gelman stats-2010-06-02-The problem of overestimation of group-level variance parameters
16 0.98904991 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters
17 0.98876238 1363 andrew gelman stats-2012-06-03-Question about predictive checks
18 0.98859572 669 andrew gelman stats-2011-04-19-The mysterious Gamma (1.4, 0.4)
19 0.9884845 77 andrew gelman stats-2010-06-09-Sof[t]
20 0.98771477 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors