andrew_gelman_stats andrew_gelman_stats-2014 andrew_gelman_stats-2014-2222 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Mon: “Edlin’s rule” for routinely scaling down published estimates Tues: Basketball Stats: Don’t model the probability of win, model the expected score differential Wed: A good comment on one of my papers Thurs: “What Can we Learn from the Many Labs Replication Project?” Fri: God/leaf/tree Sat: “We are moving from an era of private data and public analyses to one of public data and private analyses. Just as we have learned to be cautious about data that are missing, we may have to be cautious about missing analyses also.”
sentIndex sentText sentNum sentScore
1 Mon: “Edlin’s rule” for routinely scaling down published estimates Tues: Basketball Stats: Don’t model the probability of win, model the expected score differential Wed: A good comment on one of my papers Thurs: “What Can we Learn from the Many Labs Replication Project? [sent-1, score-1.469]
2 ” Fri: God/leaf/tree Sat: “We are moving from an era of private data and public analyses to one of public data and private analyses. [sent-2, score-1.681]
3 Just as we have learned to be cautious about data that are missing, we may have to be cautious about missing analyses also. [sent-3, score-1.602]
wordName wordTfidf (topN-words)
[('cautious', 0.441), ('private', 0.298), ('analyses', 0.219), ('missing', 0.214), ('edlin', 0.204), ('scaling', 0.186), ('differential', 0.175), ('basketball', 0.173), ('fri', 0.173), ('labs', 0.171), ('mon', 0.169), ('tues', 0.169), ('thurs', 0.165), ('wed', 0.165), ('public', 0.163), ('routinely', 0.16), ('era', 0.157), ('stats', 0.146), ('replication', 0.144), ('sat', 0.14), ('score', 0.135), ('win', 0.128), ('moving', 0.127), ('rule', 0.118), ('learned', 0.117), ('expected', 0.109), ('project', 0.108), ('data', 0.103), ('model', 0.1), ('learn', 0.096), ('estimates', 0.087), ('papers', 0.086), ('comment', 0.086), ('probability', 0.08), ('published', 0.073), ('may', 0.067), ('one', 0.05), ('many', 0.046), ('good', 0.042)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999988 2222 andrew gelman stats-2014-02-24-On deck this week
Introduction: Mon: “Edlin’s rule” for routinely scaling down published estimates Tues: Basketball Stats: Don’t model the probability of win, model the expected score differential Wed: A good comment on one of my papers Thurs: “What Can we Learn from the Many Labs Replication Project?” Fri: God/leaf/tree Sat: “We are moving from an era of private data and public analyses to one of public data and private analyses. Just as we have learned to be cautious about data that are missing, we may have to be cautious about missing analyses also.”
Introduction: This is an echo of yesterday’s post, Basketball Stats: Don’t model the probability of win, model the expected score differential. As with basketball, so with baseball: as the great Bill James wrote, if you want to predict a pitcher’s win-loss record, it’s better to use last year’s ERA than last year’s W-L. As with basketball and baseball, so with epidemiology: as Joseph Delaney points out in my favorite blog that nobody reads, you will see much better prediction if you first model change in the parameter (e.g. blood pressure) and then convert that to the binary disease state (e.g. hypertension) than if you just develop a logistic model for prob(hypertension). As with basketball, baseball, and epidemiology, so with political science: instead of modeling election winners, better to model vote differential, a point that I made back in 1993 (see page 120 here) but which seems to continually need repeating. A forecasting method should get essentially no credit for correctl
Introduction: Someone who wants to remain anonymous writes: I am working to create a more accurate in-game win probability model for basketball games. My idea is for each timestep in a game (a second, 5 seconds, etc), use the Vegas line, the current score differential, who has the ball, and the number of possessions played already (to account for differences in pace) to create a point estimate probability of the home team winning. This problem would seem to fit a multi-level model structure well. It seems silly to estimate 2,000 regressions (one for each timestep), but the coefficients should vary at each timestep. Do you have suggestions for what type of model this could/would be? Additionally, I believe this needs to be some form of logit/probit given the binary dependent variable (win or loss). Finally, do you have suggestions for what package could accomplish this in Stata or R? To answer the questions in reverse order: 3. I’d hope this could be done in Stan (which can be run from R)
4 0.20511475 2366 andrew gelman stats-2014-06-09-On deck this week
Introduction: Mon: I hate polynomials Tues: Spring forward, fall back, drop dead? Wed: Bayes in the research conversation Thurs: The health policy innovation center: how best to move from pilot studies to large-scale practice? Fri: Stroopy names Sat: He’s not so great in math but wants to do statistics and machine learning Sun: Comparing the full model to the partial model
5 0.20279419 2240 andrew gelman stats-2014-03-10-On deck this week: Things people sent me
Introduction: Mon: Preregistration: what’s in it for you? Tues: What if I were to stop publishing in journals? Wed: Empirical implications of Empirical Implications of Theoretical Models Thurs: An Economist’s Guide to Visualizing Data Fri: The maximal information coefficient Sat: Problematic interpretations of confidence intervals Sun: The more you look, the more you find
6 0.20201935 2290 andrew gelman stats-2014-04-14-On deck this week
7 0.19381374 2348 andrew gelman stats-2014-05-26-On deck this week
8 0.18785203 2321 andrew gelman stats-2014-05-05-On deck this week
9 0.1823049 2276 andrew gelman stats-2014-03-31-On deck this week
10 0.17758989 2331 andrew gelman stats-2014-05-12-On deck this week
11 0.17621845 2339 andrew gelman stats-2014-05-19-On deck this week
12 0.17515181 2214 andrew gelman stats-2014-02-17-On deck this week
13 0.17171538 2206 andrew gelman stats-2014-02-10-On deck this week
14 0.15713707 2298 andrew gelman stats-2014-04-21-On deck this week
15 0.14501485 2356 andrew gelman stats-2014-06-02-On deck this week
16 0.14403306 2253 andrew gelman stats-2014-03-17-On deck this week: Revisitings
17 0.1424157 2262 andrew gelman stats-2014-03-23-Win probabilities during a sporting event
18 0.14061609 2285 andrew gelman stats-2014-04-07-On deck this week
19 0.13128734 93 andrew gelman stats-2010-06-17-My proposal for making college admissions fairer
20 0.12863044 2310 andrew gelman stats-2014-04-28-On deck this week
topicId topicWeight
[(0, 0.105), (1, 0.037), (2, -0.007), (3, -0.015), (4, 0.025), (5, 0.036), (6, -0.046), (7, -0.051), (8, 0.018), (9, -0.049), (10, -0.006), (11, 0.302), (12, 0.043), (13, 0.149), (14, -0.08), (15, -0.025), (16, 0.079), (17, 0.005), (18, 0.066), (19, -0.084), (20, -0.016), (21, 0.067), (22, -0.008), (23, 0.033), (24, 0.0), (25, -0.026), (26, 0.039), (27, 0.066), (28, -0.002), (29, -0.041), (30, -0.048), (31, -0.047), (32, 0.047), (33, 0.06), (34, 0.005), (35, 0.004), (36, 0.022), (37, -0.007), (38, -0.005), (39, 0.002), (40, 0.058), (41, -0.044), (42, -0.006), (43, -0.009), (44, 0.036), (45, 0.028), (46, -0.001), (47, 0.015), (48, -0.056), (49, 0.031)]
simIndex simValue blogId blogTitle
same-blog 1 0.90664583 2222 andrew gelman stats-2014-02-24-On deck this week
2 0.83236033 2366 andrew gelman stats-2014-06-09-On deck this week
Introduction: Mon: I hate polynomials Tues: Spring forward, fall back, drop dead? Wed: Bayes in the research conversation Thurs: The health policy innovation center: how best to move from pilot studies to large-scale practice? Fri: Stroopy names Sat: He’s not so great in math but wants to do statistics and machine learning Sun: Comparing the full model to the partial model
3 0.82673424 2240 andrew gelman stats-2014-03-10-On deck this week: Things people sent me
Introduction: Mon: Preregistration: what’s in it for you? Tues: What if I were to stop publishing in journals? Wed: Empirical implications of Empirical Implications of Theoretical Models Thurs: An Economist’s Guide to Visualizing Data Fri: The maximal information coefficient Sat: Problematic interpretations of confidence intervals Sun: The more you look, the more you find
4 0.82643479 2298 andrew gelman stats-2014-04-21-On deck this week
Introduction: Mon: Ticket to Baaaath Tues: Ticket to Baaaaarf Wed: Thinking of doing a list experiment? Here’s a list of reasons why you should think again Thurs: An open site for researchers to post and share papers Fri: Questions about “Too Good to Be True” Sat: Sleazy sock puppet can’t stop spamming our discussion of compressed sensing and promoting the work of Xiteng Liu Sun: White stripes and dead armadillos
5 0.80024987 2331 andrew gelman stats-2014-05-12-On deck this week
Introduction: Mon: “The results (not shown) . . .” Tues: Personally, I’d rather go with Teragram Wed: How much can we learn about individual-level causal claims from state-level correlations? Thurs: Bill Easterly vs. Jeff Sachs: What percentage of the recipients didn’t use the free malaria bed nets in Zambia? Fri: Models with constraints Sat: Forum in Ecology on p-values and model selection Sun: Never back down: The culture of poverty and the culture of journalism
6 0.78139544 2276 andrew gelman stats-2014-03-31-On deck this week
7 0.77709562 2321 andrew gelman stats-2014-05-05-On deck this week
8 0.77005446 2310 andrew gelman stats-2014-04-28-On deck this week
9 0.76420557 2290 andrew gelman stats-2014-04-14-On deck this week
10 0.75387305 2214 andrew gelman stats-2014-02-17-On deck this week
11 0.75319761 2348 andrew gelman stats-2014-05-26-On deck this week
12 0.72720766 2339 andrew gelman stats-2014-05-19-On deck this week
13 0.72297084 2253 andrew gelman stats-2014-03-17-On deck this week: Revisitings
14 0.71999091 2265 andrew gelman stats-2014-03-24-On deck this week
15 0.69526201 2356 andrew gelman stats-2014-06-02-On deck this week
16 0.69215214 2285 andrew gelman stats-2014-04-07-On deck this week
17 0.66097599 2206 andrew gelman stats-2014-02-10-On deck this week
18 0.61375409 2264 andrew gelman stats-2014-03-24-On deck this month
19 0.57908511 2320 andrew gelman stats-2014-05-05-On deck this month
20 0.54035473 165 andrew gelman stats-2010-07-27-Nothing is Linear, Nothing is Additive: Bayesian Models for Interactions in Social Science
topicId topicWeight
[(15, 0.026), (24, 0.094), (27, 0.03), (41, 0.087), (43, 0.027), (44, 0.05), (47, 0.07), (68, 0.091), (71, 0.051), (77, 0.034), (98, 0.028), (99, 0.277)]
simIndex simValue blogId blogTitle
same-blog 1 0.96151376 2222 andrew gelman stats-2014-02-24-On deck this week
2 0.89350241 622 andrew gelman stats-2011-03-21-A possible resolution of the albedo mystery!
Introduction: Remember that bizarre episode in Freakonomics 2, where Levitt and Dubner went to the Batcave-like lair of a genius billionaire who told them that “the problem with solar panels is that they’re black.” I’m not the only one who wondered at the time: of all the issues to bring up about solar power, why that one? Well, I think I’ve found the answer in this article by John Lanchester: In 2004, Nathan Myhrvold, who had, five years earlier, at the advanced age of forty, retired from his job as Microsoft’s chief technology officer, began to contribute to the culinary discussion board egullet.org . . . At the time he grew interested in sous vide, there was no book in English on the subject, and he resolved to write one. . . . broadened it further to include information about the basic physics of heating processes, then to include the physics and chemistry of traditional cooking techniques, and then to include the science and practical application of the highly inventive new techniq
3 0.88756603 1669 andrew gelman stats-2013-01-12-The power of the puzzlegraph
Introduction: The Organisation for Economic Co-operation and Development reports that the following project from Krisztina Szucs and Mate Cziner has won their visualization challenge, “launched in September 2012 to solicit visualisations based on the OECD’s data-rich Education at a Glance report”: (The graph is interactive. Click on the above image and click again to see the full version.) From the press release: Entries from around the world focused on data related to the economic costs and return on investment in education . . . [The winning entry] takes a detailed look at public vs. private and men vs. women for selected countries . . . The judges were particularly impressed by the angled slope format of the visualisation, which encourages comparison between the upper-secondary and tertiary benefits of education. Szucs and Cziner were also lauded for their striking visual design, which draws users into exploring their piece [emphasis added]. I used boldface to highlight a p
4 0.88368213 516 andrew gelman stats-2011-01-14-A new idea for a science core course based entirely on computer simulation
Introduction: Columbia College has for many years had a Core Curriculum, in which students read classics such as Plato (in translation) etc. A few years ago they created a Science core course. There was always some confusion about this idea: On one hand, how much would college freshmen really learn about science by reading the classic writings of Galileo, Laplace, Darwin, Einstein, etc.? And they certainly wouldn’t get much out by puzzling over the latest issues of Nature, Cell, and Physical Review Letters. On the other hand, what’s the point of having them read Dawkins, Gould, or even Brian Greene? These sorts of popularizations give you a sense of modern science (even to the extent of conveying some of the debates in these fields), but reading them might not give the same intellectual engagement that you’d get from wrestling with the Bible or Shakespeare. I have a different idea. What about structuring the entire course around computer programming and simulation? Start with a few weeks t
5 0.8816877 36 andrew gelman stats-2010-05-16-Female Mass Murderers: Babes Behind Bars
Introduction: Around the time I was finishing up my Ph.D. thesis, I was trying to come up with a good title–something more grabby than “Topics in Image Reconstruction for Emission Tomography”–and one of the other students said: How about something like, Female Mass Murderers: Babes Behind Bars? That sounded good to me, and I was all set to use it. I had a plan: I’d first submit the boring title–that’s how it would be recorded in all the official paperwork–but then at the last minute I’d substitute in the new title page before submitting to the library. (This was in the days of hard copies.) Nobody would look at the time, then later on, if anyone went into the library to find my thesis, they’d have a pleasant surprise. Anyway, as I said, I was all set to do this, but a friend warned me off. He said that at some point, someone might find it, and the rumor would spread that I’m a sexist pig. So I didn’t. I was thinking about this after hearing this report based on a reading of Sup
6 0.88163877 958 andrew gelman stats-2011-10-14-The General Social Survey is a great resource
7 0.88114262 913 andrew gelman stats-2011-09-16-Groundhog day in August?
8 0.880705 454 andrew gelman stats-2010-12-07-Diabetes stops at the state line?
9 0.88011175 1050 andrew gelman stats-2011-12-10-Presenting at the econ seminar
10 0.87987179 1143 andrew gelman stats-2012-01-29-G+ > Skype
11 0.87982661 1261 andrew gelman stats-2012-04-12-The Naval Research Lab
12 0.87919414 303 andrew gelman stats-2010-09-28-“Genomics” vs. genetics
13 0.8779909 924 andrew gelman stats-2011-09-24-“Income can’t be used to predict political opinion”
14 0.87793392 2068 andrew gelman stats-2013-10-18-G+ hangout for Bayesian Data Analysis course now! (actually, in 5 minutes)
15 0.87668294 877 andrew gelman stats-2011-08-29-Applying quantum probability to political science
16 0.87666148 1816 andrew gelman stats-2013-04-21-Exponential increase in the number of stat majors
17 0.87531787 1284 andrew gelman stats-2012-04-26-Modeling probability data
18 0.87424284 875 andrew gelman stats-2011-08-28-Better than Dennis the dentist or Laura the lawyer
19 0.87421167 2206 andrew gelman stats-2014-02-10-On deck this week
20 0.87419707 1114 andrew gelman stats-2012-01-12-Controversy about average personality differences between men and women