andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1174 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Kaiser asks the interesting question: How do you measure what restaurants are “overrated”? You can’t just ask people, right? There’s some sort of social element here, that “overrated” implies that someone’s out there doing the rating.
sentIndex sentText sentNum sentScore
1 Kaiser asks the interesting question: How do you measure what restaurants are “overrated”? [sent-1, score-0.83]
2 There’s some sort of social element here, that “overrated” implies that someone’s out there doing the rating. [sent-3, score-0.75]
wordName wordTfidf (topN-words)
[('overrated', 0.665), ('restaurants', 0.364), ('element', 0.312), ('rating', 0.267), ('implies', 0.23), ('kaiser', 0.217), ('asks', 0.196), ('measure', 0.167), ('ask', 0.154), ('social', 0.116), ('someone', 0.114), ('interesting', 0.103), ('question', 0.096), ('sort', 0.092), ('right', 0.091), ('people', 0.055)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 1174 andrew gelman stats-2012-02-18-Not as ugly as you look
2 0.13059008 2365 andrew gelman stats-2014-06-09-I hate polynomials
Introduction: A recent discussion with Mark Palko [scroll down to the comments at this link] reminds me that I think that polynomials are way way overrated, and I think a lot of damage has arisen from the old-time approach of introducing polynomial functions as a canonical example of linear regressions (for example). There are very few settings I can think of where it makes sense to fit a general polynomial of degree higher than 2. I think that millions of students have been brainwashed into thinking of these as the canonical functions and that this has caused endless trouble later on. I’m not sure how I’d change the high school math curriculum to deal with this, but I do think it’s an issue.
3 0.110734 648 andrew gelman stats-2011-04-04-The Case for More False Positives in Anti-doping Testing
Introduction: No joke. See here (from Kaiser Fung). At the Statistics Forum.
4 0.099718675 388 andrew gelman stats-2010-11-01-The placebo effect in pharma
Introduction: Bruce McCullough writes: The Sept 2009 issue of Wired had a big article on the increase in the placebo effect, and why it’s been getting bigger. Kaiser Fung has a synopsis. As if you don’t have enough to do, I thought you might be interested in blogging on this. My reply: I thought Kaiser’s discussion was good, especially this point: Effect on treatment group = effect of the drug + effect of belief in being treated; effect on placebo group = effect of belief in being treated. Thus, the difference between the two groups = effect of the drug, since the effect of belief in being treated affects both groups of patients. Thus, as Kaiser puts it, if the treatment isn’t doing better than placebo, it doesn’t say that the placebo effect is big (let alone “too big”) but that the treatment isn’t showing any additional effect. It’s “treatment + placebo” vs. placebo, not treatment vs. placebo. That said, I’d prefer for Kaiser to make it clear that the additivity he’s assu
5 0.099192582 2248 andrew gelman stats-2014-03-15-Problematic interpretations of confidence intervals
Introduction: Rink Hoekstra writes: A couple of months ago, you were visiting the University of Groningen, and after the talk you gave there I spoke briefly with you about a study that I conducted with Richard Morey, Jeff Rouder and Eric-Jan Wagenmakers. In the study, we found that researchers’ knowledge of how to interpret a confidence interval (CI) was almost as limited as the knowledge of students who had had no inferential statistics course yet. Our manuscript was recently accepted for publication in Psychonomic Bulletin & Review, and it’s now available online (see e.g., here). Maybe it’s interesting to discuss on your blog, especially since CIs are often promoted (for example in the new guidelines of Psychological Science), but apparently researchers seem to have little idea how to interpret them. Given that the confidence percentage of a CI tells something about the procedure rather than about the data at hand, this might be understandable, but, according to us, it’s problematic neve
6 0.098277584 2031 andrew gelman stats-2013-09-19-What makes a statistician look like a hero?
7 0.097825423 543 andrew gelman stats-2011-01-28-NYT shills for personal DNA tests
8 0.092634909 2186 andrew gelman stats-2014-01-26-Infoviz on top of stat graphic on top of spreadsheet
9 0.088772558 230 andrew gelman stats-2010-08-24-Kaggle forcasting update
10 0.085314676 1001 andrew gelman stats-2011-11-10-Three hours in the life of a statistician
11 0.084991537 204 andrew gelman stats-2010-08-12-Sloppily-written slam on moderately celebrated writers is amusing nonetheless
12 0.082758747 1132 andrew gelman stats-2012-01-21-A counterfeit data graphic
13 0.079699226 461 andrew gelman stats-2010-12-09-“‘Why work?’”
14 0.076261036 2060 andrew gelman stats-2013-10-13-New issue of Symposium magazine
15 0.067847379 1245 andrew gelman stats-2012-04-03-Redundancy and efficiency: In praise of Penn Station
16 0.065445915 1076 andrew gelman stats-2011-12-21-Derman, Rodrik and the nature of statistical models
17 0.062071897 1678 andrew gelman stats-2013-01-17-Wanted: 365 stories of statistics
18 0.058536161 982 andrew gelman stats-2011-10-30-“There’s at least as much as an 80 percent chance . . .”
19 0.057633795 209 andrew gelman stats-2010-08-16-EdLab at Columbia’s Teachers’ College
topicId topicWeight
[(0, 0.052), (1, -0.027), (2, -0.002), (3, -0.002), (4, 0.013), (5, -0.005), (6, -0.002), (7, 0.018), (8, 0.029), (9, 0.028), (10, -0.024), (11, -0.018), (12, 0.002), (13, 0.006), (14, -0.057), (15, -0.016), (16, -0.033), (17, 0.105), (18, 0.018), (19, -0.01), (20, 0.008), (21, 0.005), (22, -0.002), (23, -0.061), (24, 0.006), (25, 0.013), (26, 0.015), (27, -0.021), (28, -0.01), (29, 0.043), (30, 0.002), (31, -0.042), (32, 0.043), (33, 0.031), (34, 0.043), (35, -0.069), (36, 0.033), (37, -0.024), (38, -0.004), (39, -0.042), (40, -0.012), (41, 0.07), (42, 0.02), (43, 0.001), (44, -0.034), (45, 0.026), (46, 0.035), (47, -0.003), (48, -0.015), (49, -0.029)]
simIndex simValue blogId blogTitle
same-blog 1 0.95719916 1174 andrew gelman stats-2012-02-18-Not as ugly as you look
2 0.90686816 2031 andrew gelman stats-2013-09-19-What makes a statistician look like a hero?
Introduction: Answer here (courtesy of Kaiser Fung).
3 0.84828281 1256 andrew gelman stats-2012-04-10-Our data visualization panel at the New York Public Library
Introduction: In case you couldn’t come to our panel (with Kaiser Fung, Mark Hansen, Tahir Hemphill, Manuel Lima, and Jonathan Stray, and organized by Isabel Draves), here’s the video:
4 0.84512782 1001 andrew gelman stats-2011-11-10-Three hours in the life of a statistician
Introduction: Kaiser Fung tells what it’s really like. Here’s a sample: As soon as I [Kaiser] put the substring-concatenate expression together with two lines of code that generate data tables, it choked. Sorta like Dashiell Hammett without the broads and the heaters. And here’s another take, from a slightly different perspective.
5 0.83160275 543 andrew gelman stats-2011-01-28-NYT shills for personal DNA tests
Introduction: Kaiser nails it. The offending article, by John Tierney, somehow ended up in the Science section rather than the Opinion section. As an opinion piece (or, for that matter, a blog), Tierney’s article would be nothing special. But I agree with Kaiser that it doesn’t work as a newspaper article. As Kaiser notes, this story involves a bunch of statistical and empirical claims that are not well resolved by P.R. and rhetoric.
6 0.7853716 1132 andrew gelman stats-2012-01-21-A counterfeit data graphic
7 0.74461228 742 andrew gelman stats-2011-06-02-Grouponomics, counterfactuals, and opportunity cost
8 0.71013558 461 andrew gelman stats-2010-12-09-“‘Why work?’”
9 0.68479198 388 andrew gelman stats-2010-11-01-The placebo effect in pharma
10 0.68096524 2186 andrew gelman stats-2014-01-26-Infoviz on top of stat graphic on top of spreadsheet
11 0.66341984 985 andrew gelman stats-2011-11-01-Doug Schoen has 2 poll reports
12 0.65833092 982 andrew gelman stats-2011-10-30-“There’s at least as much as an 80 percent chance . . .”
13 0.62546766 2121 andrew gelman stats-2013-12-02-Should personal genetic testing be regulated? Battle of the blogroll
14 0.62544721 648 andrew gelman stats-2011-04-04-The Case for More False Positives in Anti-doping Testing
15 0.60164058 1090 andrew gelman stats-2011-12-28-“. . . extending for dozens of pages”
16 0.57653409 1246 andrew gelman stats-2012-04-04-Data visualization panel at the New York Public Library this evening!
17 0.57427824 238 andrew gelman stats-2010-08-27-No radon lobby
18 0.57407457 1612 andrew gelman stats-2012-12-08-The Case for More False Positives in Anti-doping Testing
19 0.5718013 344 andrew gelman stats-2010-10-15-Story time
20 0.57152939 209 andrew gelman stats-2010-08-16-EdLab at Columbia’s Teachers’ College
topicId topicWeight
[(16, 0.058), (24, 0.105), (34, 0.054), (76, 0.065), (88, 0.319), (99, 0.178)]
simIndex simValue blogId blogTitle
same-blog 1 0.87138915 1174 andrew gelman stats-2012-02-18-Not as ugly as you look
2 0.75070393 1098 andrew gelman stats-2012-01-04-Bayesian Page Rank?
Introduction: Loren Maxwell writes: I am trying to do some studies on the PageRank algorithm with applying a Bayesian technique. If you are not familiar with PageRank, it is the basis for how Google ranks their pages. It basically treats the internet as a large social network with each link conferring some value onto the page it links to. For example, if I had a webpage that had only one link to it, say from my friend’s webpage, then its PageRank would be dependent on my friend’s PageRank, presumably quite low. However, if the one link to my page was off the Google search page, then my PageRank would be quite high since there are undoubtedly millions of pages linking to Google and few pages that Google links to. The end result of the algorithm, however, is that all the PageRank values of the nodes in the network sum to one and the PageRank of a specific node is the probability that a “random surfer” will end up on that node. For example, in the attached spreadsheet, Column D shows e
3 0.72278601 290 andrew gelman stats-2010-09-22-Data Thief
Introduction: John Transue sends along a link to this software for extracting data from graphs. I haven’t tried it out but it could be useful to somebody out there?
4 0.71200037 1992 andrew gelman stats-2013-08-21-Workshop for Women in Machine Learning
Introduction: This might interest some of you: CALL FOR ABSTRACTS Workshop for Women in Machine Learning Co-located with NIPS 2013, Lake Tahoe, Nevada, USA December 5, 2013 http://www.wimlworkshop.org Deadline for abstract submissions: September 16, 2013 WORKSHOP DESCRIPTION The Workshop for Women in Machine Learning is a day-long event taking place on the first day of NIPS. The workshop aims to showcase the research of women in machine learning and to strengthen their community. The event brings together female faculty, graduate students, and research scientists for an opportunity to connect, exchange ideas, and learn from each other. Underrepresented minorities and undergraduates interested in pursuing machine learning research are encouraged to participate. While all presenters will be female, all genders are invited to attend. Scholarships will be provided to female students and postdoctoral attendees with accepted abstracts to partially offset travel costs. Workshop
5 0.68929052 136 andrew gelman stats-2010-07-09-Using ranks as numbers
Introduction: David Shor writes: I’m dealing with a situation where I have two datasets, one that assigns each participant a discrete score out of five for a set of particular traits (Dog behavior characteristics by breed), and another from an independent source that ranks each breed by each characteristic. It’s also possible to obtain the results of a survey, where experts were asked to rank 7 randomly picked breeds by characteristics. I’m interested in obtaining estimates for each trait, and intuitively, it seems clear that the second and third datasets provide a lot of information. But it’s unclear how to incorporate them to infer latent variables, since only sample ranks are observed. This seems like it is a common problem; do you have any suggestions? My quick answer is that you can treat ranks as numbers (a point we make somewhere in Bayesian Data Analysis, I believe) and just fit an item-response model from there. Val Johnson wrote an article on this in JASA a few years ago, “Bayesia
6 0.68259478 569 andrew gelman stats-2011-02-12-Get the Data
7 0.66897887 1507 andrew gelman stats-2012-09-22-Grade inflation: why weren’t the instructors all giving all A’s already??
9 0.65457821 2095 andrew gelman stats-2013-11-09-Typo in Ghitza and Gelman MRP paper
10 0.65068221 825 andrew gelman stats-2011-07-27-Grade inflation: why weren’t the instructors all giving all A’s already??
11 0.63465059 603 andrew gelman stats-2011-03-07-Assumptions vs. conditions, part 2
12 0.6344229 1403 andrew gelman stats-2012-07-02-Moving beyond hopeless graphics
13 0.62356848 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys
14 0.62133372 1866 andrew gelman stats-2013-05-21-Recently in the sister blog
15 0.61958116 400 andrew gelman stats-2010-11-08-Poli sci plagiarism update, and a note about the benefits of not caring
16 0.61871833 1930 andrew gelman stats-2013-07-09-Symposium Magazine
17 0.61319947 1633 andrew gelman stats-2012-12-21-Kahan on Pinker on politics
18 0.58195555 1087 andrew gelman stats-2011-12-27-“Keeping things unridiculous”: Berger, O’Hagan, and me on weakly informative priors
19 0.58120167 2365 andrew gelman stats-2014-06-09-I hate polynomials
20 0.57821727 1414 andrew gelman stats-2012-07-12-Steven Pinker’s unconvincing debunking of group selection
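The PageRank description in the Loren Maxwell excerpt above (each link confers value on the page it points to, the values sum to one, and a page's value is the probability that a "random surfer" ends up there) can be illustrated with a small power-iteration sketch. This is only a minimal illustration under stated assumptions, not the method used in the spreadsheet mentioned in the excerpt: the function name pagerank, the damping value of 0.85, the rule that pages with no outgoing links spread their rank evenly over all pages, and the toy graph are all hypothetical choices made here.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Power-iteration PageRank (illustrative sketch).

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its rank; the ranks sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(max_iter):
        # Assumption: rank held by pages without outgoing links is
        # redistributed evenly, so the total stays equal to 1.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}

        # Each page passes an equal share of its damped rank along each link.
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share

        change = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical toy graph: "hub" is linked from several pages,
# while "me" is linked only from "friend".
toy_links = {
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "hub": ["a"],
    "friend": ["me"],
    "me": [],
}
ranks = pagerank(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy graph the page with the most incoming links gets the largest share of the total, and a page linked only from a low-ranked page stays low, which matches the intuition in the excerpt.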