1174 andrew gelman stats-2012-02-18-Not as ugly as you look

meta info for this blog

Source: html

Introduction: Kaiser asks the interesting question: How do you measure what restaurants are “overrated”? You can’t just ask people, right? There’s some sort of social element here, that “overrated” implies that someone’s out there doing the rating.

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Kaiser asks the interesting question: How do you measure what restaurants are “overrated”? [sent-1, score-0.83]

2 There’s some sort of social element here, that “overrated” implies that someone’s out there doing the rating. [sent-3, score-0.75]
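
The page does not say how sentScore is computed. As a rough illustration only, the sketch below scores each sentence of the introduction by the mean tfidf weight of its words, using the weights listed for this post; the averaging rule, the tokenizer, and the names `score` and `top_two` are assumptions, and the sketch is not expected to reproduce the scores shown above.

```python
import re

# Word weights copied from this post's tfidf list; unlisted words count as 0.
tfidf = {
    "overrated": 0.665, "restaurants": 0.364, "element": 0.312,
    "rating": 0.267, "implies": 0.23, "kaiser": 0.217, "asks": 0.196,
    "measure": 0.167, "ask": 0.154, "social": 0.116, "someone": 0.114,
    "interesting": 0.103, "question": 0.096, "sort": 0.092,
    "right": 0.091, "people": 0.055,
}

# The three sentences of the introduction, keyed by sentence number.
sentences = {
    1: "Kaiser asks the interesting question: How do you measure what restaurants are “overrated”?",
    2: "You can’t just ask people, right?",
    3: "There’s some sort of social element here, that “overrated” implies that someone’s out there doing the rating.",
}


def score(sentence, weights):
    """Mean tfidf weight over the lowercase alphabetic tokens (assumed rule)."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    if not tokens:
        return 0.0
    return sum(weights.get(token, 0.0) for token in tokens) / len(tokens)


scores = {num: score(text, tfidf) for num, text in sentences.items()}
# Sentence numbers ordered from highest to lowest score, top two kept.
top_two = sorted(scores, key=scores.get, reverse=True)[:2]
print(top_two)
```

Under this assumed rule, sentences 1 and 3 rank highest, which agrees with the two sentences selected above even though the numeric scores differ.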

similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('overrated', 0.665), ('restaurants', 0.364), ('element', 0.312), ('rating', 0.267), ('implies', 0.23), ('kaiser', 0.217), ('asks', 0.196), ('measure', 0.167), ('ask', 0.154), ('social', 0.116), ('someone', 0.114), ('interesting', 0.103), ('question', 0.096), ('sort', 0.092), ('right', 0.091), ('people', 0.055)]
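
The page does not state how simValue is derived from these weights. Cosine similarity between sparse tfidf vectors is a common choice, so the sketch below assumes it; `other_post` is a made-up vector for illustration and `cosine_similarity` is a hypothetical helper, not part of the original pipeline.

```python
import math

# Weights copied from the (wordName, wordTfidf) list for this post.
post_1174 = {
    "overrated": 0.665, "restaurants": 0.364, "element": 0.312,
    "rating": 0.267, "implies": 0.23, "kaiser": 0.217, "asks": 0.196,
    "measure": 0.167, "ask": 0.154, "social": 0.116, "someone": 0.114,
    "interesting": 0.103, "question": 0.096, "sort": 0.092,
    "right": 0.091, "people": 0.055,
}

# Hypothetical weights for another post, invented for illustration only.
other_post = {"overrated": 0.2, "polynomials": 0.9, "kaiser": 0.1}


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as word -> weight dicts."""
    dot = sum(weight * b[word] for word, weight in a.items() if word in b)
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


self_similarity = cosine_similarity(post_1174, post_1174)
cross_similarity = cosine_similarity(post_1174, other_post)
# Squared length of the listed vector; close to 1 if the weights are L2-normalized.
squared_norm = sum(weight * weight for weight in post_1174.values())
print(self_similarity, cross_similarity, squared_norm)
```

A post compared with itself scores 1 under this measure, which is consistent with the same-blog row in the list below. The listed weights also have a squared sum close to 1, which suggests (but does not prove) that the vectors were length-normalized before comparison.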

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1174 andrew gelman stats-2012-02-18-Not as ugly as you look

2 0.13059008 2365 andrew gelman stats-2014-06-09-I hate polynomials

Introduction: A recent discussion with Mark Palko [scroll down to the comments at this link] reminds me that I think that polynomials are way way overrated, and I think a lot of damage has arisen from the old-time approach of introducing polynomial functions as a canonical example of linear regressions (for example). There are very few settings I can think of where it makes sense to fit a general polynomial of degree higher than 2. I think that millions of students have been brainwashed into thinking of these as the canonical functions and that this has caused endless trouble later on. I’m not sure how I’d change the high school math curriculum to deal with this, but I do think it’s an issue.

3 0.110734 648 andrew gelman stats-2011-04-04-The Case for More False Positives in Anti-doping Testing

Introduction: No joke. See here (from Kaiser Fung). At the Statistics Forum.

4 0.099718675 388 andrew gelman stats-2010-11-01-The placebo effect in pharma

Introduction: Bruce McCullough writes: The Sept 2009 issue of Wired had a big article on the increase in the placebo effect, and why it’s been getting bigger. Kaiser Fung has a synopsis. As if you don’t have enough to do, I thought you might be interested in blogging on this. My reply: I thought Kaiser’s discussion was good, especially this point:
Effect on treatment group = Effect of the drug + effect of belief in being treated
Effect on placebo group = Effect of belief in being treated
Thus, the difference between the two groups = effect of the drug, since the effect of belief in being treated affects both groups of patients. Thus, as Kaiser puts it, if the treatment isn’t doing better than placebo, it doesn’t say that the placebo effect is big (let alone “too big”) but that the treatment isn’t showing any additional effect. It’s “treatment + placebo” vs. placebo, not treatment vs. placebo. That said, I’d prefer for Kaiser to make it clear that the additivity he’s assu

5 0.099192582 2248 andrew gelman stats-2014-03-15-Problematic interpretations of confidence intervals

Introduction: Rink Hoekstra writes: A couple of months ago, you were visiting the University of Groningen, and after the talk you gave there I spoke briefly with you about a study that I conducted with Richard Morey, Jeff Rouder and Eric-Jan Wagenmakers. In the study, we found that researchers’ knowledge of how to interpret a confidence interval (CI), was almost as limited as the knowledge of students who had had no inferential statistics course yet. Our manuscript was recently accepted for publication in Psychonomic Bulletin & Review, and it’s now available online (see e.g., here). Maybe it’s interesting to discuss on your blog, especially since CIs are often promoted (for example in the new guidelines of Psychological Science), but apparently researchers seem to have little idea how to interpret them. Given that the confidence percentage of a CI tells something about the procedure rather than about the data at hand, this might be understandable, but, according to us, it’s problematic neve

6 0.098277584 2031 andrew gelman stats-2013-09-19-What makes a statistician look like a hero?

7 0.097825423 543 andrew gelman stats-2011-01-28-NYT shills for personal DNA tests

8 0.092634909 2186 andrew gelman stats-2014-01-26-Infoviz on top of stat graphic on top of spreadsheet

9 0.088772558 230 andrew gelman stats-2010-08-24-Kaggle forcasting update

10 0.085314676 1001 andrew gelman stats-2011-11-10-Three hours in the life of a statistician

11 0.084991537 204 andrew gelman stats-2010-08-12-Sloppily-written slam on moderately celebrated writers is amusing nonetheless

12 0.082758747 1132 andrew gelman stats-2012-01-21-A counterfeit data graphic

13 0.079699226 461 andrew gelman stats-2010-12-09-“‘Why work?’”

14 0.076261036 2060 andrew gelman stats-2013-10-13-New issue of Symposium magazine

15 0.067847379 1245 andrew gelman stats-2012-04-03-Redundancy and efficiency: In praise of Penn Station

16 0.065445915 1076 andrew gelman stats-2011-12-21-Derman, Rodrik and the nature of statistical models

17 0.062071897 1678 andrew gelman stats-2013-01-17-Wanted: 365 stories of statistics

18 0.058536161 982 andrew gelman stats-2011-10-30-“There’s at least as much as an 80 percent chance . . .”

19 0.057633795 209 andrew gelman stats-2010-08-16-EdLab at Columbia’s Teachers’ College

20 0.057292145 1834 andrew gelman stats-2013-05-01-A graph at war with its caption. Also, how to visualize the same numbers without giving the display a misleading causal feel?

similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.052), (1, -0.027), (2, -0.002), (3, -0.002), (4, 0.013), (5, -0.005), (6, -0.002), (7, 0.018), (8, 0.029), (9, 0.028), (10, -0.024), (11, -0.018), (12, 0.002), (13, 0.006), (14, -0.057), (15, -0.016), (16, -0.033), (17, 0.105), (18, 0.018), (19, -0.01), (20, 0.008), (21, 0.005), (22, -0.002), (23, -0.061), (24, 0.006), (25, 0.013), (26, 0.015), (27, -0.021), (28, -0.01), (29, 0.043), (30, 0.002), (31, -0.042), (32, 0.043), (33, 0.031), (34, 0.043), (35, -0.069), (36, 0.033), (37, -0.024), (38, -0.004), (39, -0.042), (40, -0.012), (41, 0.07), (42, 0.02), (43, 0.001), (44, -0.034), (45, 0.026), (46, 0.035), (47, -0.003), (48, -0.015), (49, -0.029)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95719916 1174 andrew gelman stats-2012-02-18-Not as ugly as you look

2 0.90686816 2031 andrew gelman stats-2013-09-19-What makes a statistician look like a hero?

Introduction: Answer here (courtesy of Kaiser Fung).

3 0.84828281 1256 andrew gelman stats-2012-04-10-Our data visualization panel at the New York Public Library

Introduction: In case you couldn’t come to our panel (with Kaiser Fung, Mark Hansen, Tahir Hemphill, Manuel Lima, and Jonathan Stray, and organized by Isabel Draves), here’s the video:

4 0.84512782 1001 andrew gelman stats-2011-11-10-Three hours in the life of a statistician

Introduction: Kaiser Fung tells what it’s really like. Here’s a sample: As soon as I [Kaiser] put the substring-concatenate expression together with two lines of code that generate data tables, it choked. Sorta like Dashiell Hammett without the broads and the heaters. And here’s another take, from a slightly different perspective.

5 0.83160275 543 andrew gelman stats-2011-01-28-NYT shills for personal DNA tests

Introduction: Kaiser nails it. The offending article, by John Tierney, somehow ended up in the Science section rather than the Opinion section. As an opinion piece (or, for that matter, a blog), Tierney’s article would be nothing special. But I agree with Kaiser that it doesn’t work as a newspaper article. As Kaiser notes, this story involves a bunch of statistical and empirical claims that are not well resolved by P.R. and rhetoric.

6 0.7853716 1132 andrew gelman stats-2012-01-21-A counterfeit data graphic

7 0.74461228 742 andrew gelman stats-2011-06-02-Grouponomics, counterfactuals, and opportunity cost

8 0.71013558 461 andrew gelman stats-2010-12-09-“‘Why work?’”

9 0.68479198 388 andrew gelman stats-2010-11-01-The placebo effect in pharma

10 0.68096524 2186 andrew gelman stats-2014-01-26-Infoviz on top of stat graphic on top of spreadsheet

11 0.66341984 985 andrew gelman stats-2011-11-01-Doug Schoen has 2 poll reports

12 0.65833092 982 andrew gelman stats-2011-10-30-“There’s at least as much as an 80 percent chance . . .”

13 0.62546766 2121 andrew gelman stats-2013-12-02-Should personal genetic testing be regulated? Battle of the blogroll

14 0.62544721 648 andrew gelman stats-2011-04-04-The Case for More False Positives in Anti-doping Testing

15 0.60164058 1090 andrew gelman stats-2011-12-28-“. . . extending for dozens of pages”

16 0.57653409 1246 andrew gelman stats-2012-04-04-Data visualization panel at the New York Public Library this evening!

17 0.57427824 238 andrew gelman stats-2010-08-27-No radon lobby

18 0.57407457 1612 andrew gelman stats-2012-12-08-The Case for More False Positives in Anti-doping Testing

19 0.5718013 344 andrew gelman stats-2010-10-15-Story time

20 0.57152939 209 andrew gelman stats-2010-08-16-EdLab at Columbia’s Teachers’ College

similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(16, 0.058), (24, 0.105), (34, 0.054), (76, 0.065), (88, 0.319), (99, 0.178)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.87138915 1174 andrew gelman stats-2012-02-18-Not as ugly as you look

2 0.75070393 1098 andrew gelman stats-2012-01-04-Bayesian Page Rank?

Introduction: Loren Maxwell writes: I am trying to do some studies on the PageRank algorithm with applying a Bayesian technique. If you are not familiar with PageRank, it is the basis for how Google ranks their pages. It basically treats the internet as a large social network with each link conferring some value onto the page it links to. For example, if I had a webpage that had only one link to it, say from my friend’s webpage, then its PageRank would be dependent on my friend’s PageRank, presumably quite low. However, if the one link to my page was off the Google search page, then my PageRank would be quite high since there are undoubtedly millions of pages linking to Google and few pages that Google links to. The end result of the algorithm, however, is that all the PageRank values of the nodes in the network sum to one and the PageRank of a specific node is the probability that a “random surfer” will end up on that node. For example, in the attached spreadsheet, Column D shows e

3 0.72278601 290 andrew gelman stats-2010-09-22-Data Thief

Introduction: John Transue sends along a link to this software for extracting data from graphs. I haven’t tried it out but it could be useful to somebody out there?

4 0.71200037 1992 andrew gelman stats-2013-08-21-Workshop for Women in Machine Learning

Introduction: This might interest some of you: CALL FOR ABSTRACTS Workshop for Women in Machine Learning Co-located with NIPS 2013, Lake Tahoe, Nevada, USA December 5, 2013 http://www.wimlworkshop.org Deadline for abstract submissions: September 16, 2013 WORKSHOP DESCRIPTION The Workshop for Women in Machine Learning is a day-long event taking place on the first day of NIPS. The workshop aims to showcase the research of women in machine learning and to strengthen their community. The event brings together female faculty, graduate students, and research scientists for an opportunity to connect, exchange ideas, and learn from each other. Underrepresented minorities and undergraduates interested in pursuing machine learning research are encouraged to participate. While all presenters will be female, all genders are invited to attend. Scholarships will be provided to female students and postdoctoral attendees with accepted abstracts to partially offset travel costs. Workshop

5 0.68929052 136 andrew gelman stats-2010-07-09-Using ranks as numbers

Introduction: David Shor writes: I’m dealing with a situation where I have two datasets, one that assigns each participant a discrete score out of five for a set of particular traits (Dog behavior characteristics by breed), and another from an independent source that ranks each breed by each characteristic. It’s also possible to obtain the results of a survey, where experts were asked to rank 7 randomly picked breeds by characteristics. I’m interested in obtaining estimates for each trait, and intuitively, it seems clear that the second and third dataset provide a lot of information. But it’s unclear how to incorporate them to infer latent variables, since only sample ranks are observed. This seems like it is a common problem, do you have any suggestions? My quick answer is that you can treat ranks as numbers (a point we make somewhere in Bayesian Data Analysis, I believe) and just fit an item-response model from there. Val Johnson wrote an article on this in Jasa a few years ago, “Bayesia

6 0.68259478 569 andrew gelman stats-2011-02-12-Get the Data

7 0.66897887 1507 andrew gelman stats-2012-09-22-Grade inflation: why weren’t the instructors all giving all A’s already??

8 0.6547209 629 andrew gelman stats-2011-03-26-Is it plausible that 1% of people pick a career based on their first name?

9 0.65457821 2095 andrew gelman stats-2013-11-09-Typo in Ghitza and Gelman MRP paper

10 0.65068221 825 andrew gelman stats-2011-07-27-Grade inflation: why weren’t the instructors all giving all A’s already??

11 0.63465059 603 andrew gelman stats-2011-03-07-Assumptions vs. conditions, part 2

12 0.6344229 1403 andrew gelman stats-2012-07-02-Moving beyond hopeless graphics

13 0.62356848 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys

14 0.62133372 1866 andrew gelman stats-2013-05-21-Recently in the sister blog

15 0.61958116 400 andrew gelman stats-2010-11-08-Poli sci plagiarism update, and a note about the benefits of not caring

16 0.61871833 1930 andrew gelman stats-2013-07-09-Symposium Magazine

17 0.61319947 1633 andrew gelman stats-2012-12-21-Kahan on Pinker on politics

18 0.58195555 1087 andrew gelman stats-2011-12-27-“Keeping things unridiculous”: Berger, O’Hagan, and me on weakly informative priors

19 0.58120167 2365 andrew gelman stats-2014-06-09-I hate polynomials

20 0.57821727 1414 andrew gelman stats-2012-07-12-Steven Pinker’s unconvincing debunking of group selection