andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-677 knowledge-graph by maker-knowledge-mining
Introduction: I recently learned we have some readers at the National Oceanic and Atmospheric Administration so I thought I’d share an old story. About 35 years ago my brother worked briefly as a clerk at NOAA in their D.C. (or maybe it was D.C.-area) office. His job was to enter the weather numbers that came in. He had a boss who was very orderly. At one point there was a hurricane that wiped out some weather station in the Caribbean, and his boss told him to put in the numbers anyway. My brother protested that they didn’t have the data, to which his boss replied: “I know what the numbers are.” Nowadays we call this sort of thing “imputation” and we like it. But not in the raw data! I bet nowadays they have an NA code.
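The distinction the post draws, imputation is fine but not in the raw data, can be sketched in a few lines. This is only an illustration under stated assumptions: the readings, the use of `None` as the NA code, and the helper name `impute_with_mean` are all hypothetical, and mean-filling stands in for whatever imputation method one would actually prefer.

```python
from statistics import mean

# Hypothetical daily readings from one station; None is the NA code
# for days on which nothing was observed (e.g. the station was down).
raw_readings = [27.1, 26.8, None, None, 27.4]


def impute_with_mean(values):
    """Return (imputed copy, per-value flags) without touching the input.

    Mean-filling is an assumption made for brevity, not a recommendation.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    imputed = [fill if v is None else v for v in values]
    was_imputed = [v is None for v in values]
    return imputed, was_imputed


imputed_readings, imputed_flags = impute_with_mean(raw_readings)
```

The raw list keeps its NA codes, and the filled-in values live in a separate, flagged copy, so anyone downstream can tell which numbers were observed and which were "known" only to the boss.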
sentIndex sentText sentNum sentScore
1 I recently learned we have some readers at the National Oceanic and Atmospheric Administration so I thought I’d share an old story. [sent-1, score-0.43]
2 About 35 years ago my brother worked briefly as a clerk at NOAA in their D. [sent-2, score-0.839]
3 His job was to enter the weather numbers that came in. [sent-7, score-0.743]
4 At one point there was a hurricane that wiped out some weather station in the Caribbean, and his boss told him to put in the numbers anyway. [sent-9, score-1.636]
5 My brother protested that they didn’t have the data, to which his boss replied: “I know what the numbers are.” [sent-10, score-1.251]
6 Nowadays we call this sort of thing “imputation” and we like it. [sent-11, score-0.182]
wordName wordTfidf (topN-words)
[('boss', 0.459), ('brother', 0.366), ('weather', 0.278), ('nowadays', 0.217), ('numbers', 0.201), ('wiped', 0.194), ('protested', 0.194), ('noaa', 0.183), ('oceanic', 0.183), ('caribbean', 0.183), ('clerk', 0.183), ('atmospheric', 0.175), ('hurricane', 0.169), ('station', 0.153), ('enter', 0.132), ('administration', 0.123), ('imputation', 0.122), ('bet', 0.117), ('briefly', 0.113), ('na', 0.108), ('raw', 0.107), ('replied', 0.091), ('learned', 0.089), ('share', 0.088), ('code', 0.085), ('worked', 0.079), ('told', 0.079), ('national', 0.078), ('old', 0.074), ('call', 0.073), ('job', 0.071), ('readers', 0.07), ('recently', 0.061), ('came', 0.061), ('ago', 0.056), ('data', 0.053), ('didn', 0.052), ('put', 0.049), ('thought', 0.048), ('sort', 0.044), ('thing', 0.044), ('maybe', 0.043), ('years', 0.042), ('point', 0.035), ('know', 0.031), ('like', 0.021), ('one', 0.019)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 677 andrew gelman stats-2011-04-24-My NOAA story
2 0.10390529 2070 andrew gelman stats-2013-10-20-The institution of tenure
Introduction: Rohin Dhar writes: The Priceonomics blog is doing a feature where we ask a few economists what they think of the institution of tenure. If you’d be interested in participating, I’d love to get your response. As an economist, what do you think of tenure? Should it be abolished / kept / modified? My reply: Just to be clear, I’m assuming that when you say “tenure,” you’re talking about lifetime employment for college professors such as myself. I’m actually a political scientist, not an economist. So rather than giving my opinion, I’ll say what I think an economist might say. I think an economist could say one of two things: Economist as anthropologist would say: Tenure is decided by independent institutions acting freely. If they choose to offer tenure, they will have good reasons, and it is not part of an economist’s job to second-guess individual decisions. Economist as McKinsey consultant would say: Tenure can be evaluated based on a cost-benefit analysis. How
3 0.080005214 1501 andrew gelman stats-2012-09-18-More studies on the economic effects of climate change
Introduction: After writing yesterday’s post, I was going through Solomon Hsiang’s blog and found a post pointing to three studies from researchers at business schools: Severe Weather and Automobile Assembly Productivity Gérard P. Cachon, Santiago Gallino and Marcelo Olivares Abstract: It is expected that climate change could lead to an increased frequency of severe weather. In turn, severe weather intuitively should hamper the productivity of work that occurs outside. But what is the effect of rain, snow, fog, heat and wind on work that occurs indoors, such as the production of automobiles? Using weekly production data from 64 automobile plants in the United States over a ten-year period, we find that adverse weather conditions lead to a significant reduction in production. For example, one additional day of high wind advisory by the National Weather Service (i.e., maximum winds generally in excess of 44 miles per hour) reduces production by 26%, which is comparable in order of magnitude t
4 0.074193723 2361 andrew gelman stats-2014-06-06-Hurricanes vs. Himmicanes
Introduction: The story’s on the sister blog and I quote liberally from Jeremy Freese, who wrote: The authors have issued a statement that argues against some criticisms of their study that others have offered. These are irrelevant to the above observations, as I [Freese] am taking everything about the measurement and model specification at their word–my starting point is the model that fully replicates the analyses that they themselves published. A qualification is that one of their comments is that they deny they are making any claims about the importance of other factors that kill people in hurricanes. But they are. If you claim that 27 out of the 42 deaths in Hurricane Eloise would have been prevented if it was named Hurricane Charley, that is indeed a claim that diminishes the potential importance of other causes of deaths in that hurricane. Freese also raises an important general issue in science communication: The authors’ university issued a press release with a dramatic prese
5 0.070372485 1245 andrew gelman stats-2012-04-03-Redundancy and efficiency: In praise of Penn Station
Introduction: In reaction to this news article by Michael Kimmelman, I’d like to repost this from four years ago: Walking through Penn Station in New York, I remembered how much I love its open structure. By “open,” I don’t mean bright and airy. I mean “open” in a topological sense. The station has three below-ground levels–the uppermost has ticket counters (and, what is more relevant nowadays, ticket machines), some crappy stores and restaurants, and a crappy waiting area. The middle level has Long Island Rail Road ticket counters, some more crappy stores and restaurants, and entrances to the 7th and 8th Avenue subway lines. The lower level has train tracks and platforms. There are stairs, escalators, and elevators going everywhere. As a result, it’s easy to get around, there are lots of shortcuts, and the train loads fast–some people come down the escalators and elevators from the top level, others take the stairs from the middle level. The powers-that-be keep threatening to spend a coupl
6 0.06897632 2181 andrew gelman stats-2014-01-21-The Commissar for Traffic presents the latest Five-Year Plan
7 0.067849331 180 andrew gelman stats-2010-08-03-Climate Change News
8 0.066538729 135 andrew gelman stats-2010-07-09-Rasmussen sez: “108% of Respondents Say . . .”
9 0.061679892 628 andrew gelman stats-2011-03-25-100-year floods
10 0.060175285 976 andrew gelman stats-2011-10-27-Geophysicist Discovers Modeling Error (in Economics)
11 0.058730841 608 andrew gelman stats-2011-03-12-Single or multiple imputation?
12 0.058358669 1447 andrew gelman stats-2012-08-07-Reproducible science FAIL (so far): What’s stoppin people from sharin data and code?
15 0.05659207 138 andrew gelman stats-2010-07-10-Creating a good wager based on probability estimates
16 0.056096498 1083 andrew gelman stats-2011-12-26-The quals and the quants
17 0.056084681 1484 andrew gelman stats-2012-09-05-Two exciting movie ideas: “Second Chance U” and “The New Dirty Dozen”
18 0.054058358 2018 andrew gelman stats-2013-09-12-Do you ever have that I-just-fit-a-model feeling?
20 0.049201954 731 andrew gelman stats-2011-05-26-Lottery probability update
topicId topicWeight
[(0, 0.069), (1, -0.034), (2, -0.002), (3, 0.028), (4, 0.019), (5, -0.012), (6, 0.016), (7, -0.002), (8, 0.003), (9, -0.014), (10, 0.002), (11, 0.002), (12, -0.002), (13, -0.012), (14, -0.023), (15, 0.049), (16, 0.01), (17, -0.005), (18, 0.021), (19, 0.003), (20, -0.003), (21, 0.045), (22, -0.019), (23, 0.022), (24, -0.01), (25, -0.004), (26, -0.023), (27, -0.011), (28, 0.027), (29, 0.036), (30, 0.006), (31, -0.004), (32, -0.006), (33, 0.005), (34, -0.001), (35, -0.001), (36, 0.014), (37, 0.019), (38, -0.009), (39, -0.007), (40, -0.009), (41, 0.006), (42, -0.004), (43, 0.017), (44, -0.019), (45, -0.031), (46, -0.017), (47, -0.017), (48, -0.007), (49, -0.01)]
simIndex simValue blogId blogTitle
same-blog 1 0.9397698 677 andrew gelman stats-2011-04-24-My NOAA story
Introduction: I recently learned we have some readers at the National Oceanic and Atmospheric Administration so I thought I’d share an old story. About 35 years ago my brother worked briefly as a clerk at NOAA in their D.C. (or maybe it was D.C.-area) office. His job was to enter the weather numbers that came in. He had a boss who was very orderly. At one point there was a hurricane that wiped out some weather station in the Caribbean, and his boss told him to put in the numbers anyway. My brother protested that they didn’t have the data, to which his boss replied: “I know what the numbers are.” Nowadays we call this sort of thing “imputation” and we like it. But not in the raw data! I bet nowadays they have an NA code.
Introduction: Solomon Hsiang shares some bad news: Persistently reduced labor productivity may be one of the largest economic impacts of anthropogenic climate change. . . . Two percent per degree Celsius . . . That’s the magic number for how worker productivity responds to warm/hot temperatures. In my 2010 PNAS paper, I [Hsiang] found that labor-intensive sectors of national economies decreased output by roughly 2.4% per degree C and argued that this looked suspiciously like it came from reductions in worker output. Using a totally different method and dataset, Matt Neidell and Josh Graff Zivin found that labor supply in micro data fell by 1.8% per degree C. Both responses kicked in at around 26C. Chris Sheehan just sent me this NYT article on air conditioning, where they mention this neat natural experiment: [I]n the past year, [Japan] became an unwitting laboratory to study even more extreme air-conditioning abstinence, and the results have not been encouraging. After th
3 0.70107108 68 andrew gelman stats-2010-06-03-…pretty soon you’re talking real money.
Introduction: A New York Times article reports the opening of a half-mile section of bike path, recently built along the west side of Manhattan at a cost of $16M, or roughly $30 million per mile. That’s about $5700 per linear foot. Kinda sounds like a lot, doesn’t it? Well, $30 million per mile for about one car-lane mile is a lot, but it’s not out of line compared to other urban highway construction costs. The Doyle Drive project in San Francisco — a freeway to replace the current old and deteriorating freeway approach to the Golden Gate Bridge — is currently under way at $1 billion for 1.6 miles…but hey, it will have six lanes each way, so that isn’t so bad, at $50 million per lane-mile. And there are other components to the project, too, not just building the highway (there will also be bike paths, landscaping, on- and off-ramps, and so on). All in all it seems roughly in line with the New York bike lane project. Speaking of the Doyle Drive project, one expense was the cost of movin
4 0.68761402 731 andrew gelman stats-2011-05-26-Lottery probability update
Introduction: It was reported last year that the national lottery of Israel featured the exact same 6 numbers (out of 45) twice in the same month, and statistics professor Isaac Meilijson of Tel Aviv University was quoted as saying that “the incident of six numbers repeating themselves within a month is an event of once in 10,000 years.” I shouldn’t mock when it comes to mathematics–after all, I proved a false theorem once! (Or, to be precise, my collaborator and I published a false claim which we thought we’d proved, thus we thought was a theorem.) So let me retract the mockery and move, first to the mathematics and then to the statistics. First, how many possibilities are there in pick 6 out of 45? It’s (45*44*43*42*41*40)/6! = 8,145,060. Let’s call this number N. Second, what’s the probability that the same numbers repeat in a single calendar month? I’ve been told that the Israeli lottery has 2 draws per week. That’s 104/12=8.67 draws per month. Or maybe they skip some holiday
5 0.68425685 1897 andrew gelman stats-2013-06-13-When’s that next gamma-ray blast gonna come, already?
Introduction: Phil Plait writes: Earth May Have Been Hit by a Cosmic Blast 1200 Years Ago . . . this is nothing to panic about. If it happened at all, it was a long time ago, and unlikely to happen again for hundreds of thousands of years. This left me confused. If it really did happen 1200 years ago, basic statistics would suggest it would occur approximately once every 1200 years or so (within half an order of magnitude). So where does “hundreds of thousands of years” come from? I emailed astronomer David Hogg to see if I was missing something here, and he replied: Yeah, if we think this hit us 1200 years ago, we should imagine that this happens every few thousand years at least. Now that said, if there are *other* reasons for thinking it is exceedingly rare, then that would be a strong a priori argument against believing in the result. So you should either believe that it didn’t happen 1200 years ago, or else you should believe it will happen again in the next few thousan
6 0.68358117 1187 andrew gelman stats-2012-02-27-“Apple confronts the law of large numbers” . . . huh?
7 0.68037498 1342 andrew gelman stats-2012-05-24-The Used TV Price is Too Damn High
9 0.6744225 1905 andrew gelman stats-2013-06-18-There are no fat sprinters
10 0.66319579 513 andrew gelman stats-2011-01-12-“Tied for Warmest Year On Record”
11 0.65677333 1549 andrew gelman stats-2012-10-26-My talk at the Larchmont public library this Sunday
12 0.65549123 2341 andrew gelman stats-2014-05-20-plus ça change, plus c’est la même chose
13 0.65497708 1640 andrew gelman stats-2012-12-26-What do people do wrong? WSJ columnist is looking for examples!
14 0.65409839 404 andrew gelman stats-2010-11-09-“Much of the recent reported drop in interstate migration is a statistical artifact”
15 0.65402675 137 andrew gelman stats-2010-07-10-Cost of communicating numbers
16 0.65379214 1245 andrew gelman stats-2012-04-03-Redundancy and efficiency: In praise of Penn Station
17 0.65312916 628 andrew gelman stats-2011-03-25-100-year floods
18 0.652749 970 andrew gelman stats-2011-10-24-Bell Labs
19 0.64887375 2352 andrew gelman stats-2014-05-29-When you believe in things that you don’t understand
20 0.64867973 1623 andrew gelman stats-2012-12-14-GiveWell charity recommendations
topicId topicWeight
[(9, 0.036), (12, 0.377), (16, 0.041), (24, 0.076), (34, 0.044), (40, 0.025), (53, 0.015), (99, 0.244)]
simIndex simValue blogId blogTitle
same-blog 1 0.91410738 677 andrew gelman stats-2011-04-24-My NOAA story
2 0.81603807 211 andrew gelman stats-2010-08-17-Deducer update
Introduction: A year ago we blogged about Ian Fellows’s R GUI called Deducer (oops, my bad, I meant to link to this). Fellows sends in this update: Since version 0.1, I [Fellows] have added: 1. A nice plug-in interface, so that people can extend Deducer’s capability without leaving the comfort of R. (see: http://www.deducer.org/pmwiki/pmwiki.php?n=Main.Development) 2. Several new dialogs. 3. A one-step installer for Windows. 4. A plug-in package (DeducerExtras) which extends the scope of analyses covered. 5. A plotting GUI that can create anything from simple histograms to complex custom graphics. Deducer is designed to be a free, easy-to-use alternative to proprietary data analysis software such as SPSS, JMP, and Minitab. It has a menu system to do common data manipulation and analysis tasks, and an Excel-like spreadsheet in which to view and edit data frames. The goal of the project is twofold. Provide an intuitive interface so that non-technical users can learn and p
3 0.80223274 1119 andrew gelman stats-2012-01-15-Excellence in Statistical Reporting Award
Introduction: The American Statistical Association is seeking nominations for its annual Excellence in Statistical Reporting Award. The award was created in 2004 to encourage and recognize members of the communications media who have best displayed an informed interest in the science of statistics and its role in public life. The award can be given for a single statistical article or for a body of work. Former winners of the award include: Felix Salmon, financial blogger, 2010; Sharon Begley, Newsweek, 2009; Mark Buchanan, New York Times, 2008; John Berry, Bloomberg News, 2005; and Gina Kolata, New York Times, 2004. If anyone has any suggestions for the 2012 award, feel free to post in the comments or email me.
4 0.73597646 189 andrew gelman stats-2010-08-06-Proposal for a moratorium on the use of the words “fashionable” and “trendy”
Introduction: Tyler Cowen links to an interesting article by Terry Teachout on David Mamet’s political conservatism. I don’t think of playwrights as gurus, but I do find it interesting to consider the political orientations of authors and celebrities. I have only one problem with Teachout’s thought-provoking article. He writes: As early as 2002 . . . Arguing that “the Western press [had] embraced antisemitism as the new black,” Mamet drew a sharp contrast between that trendy distaste for Jews and the harsh realities of daily life in Israel . . . In 2006, Mamet published a collection of essays called The Wicked Son: Anti-Semitism, Jewish Self-Hatred and the Jews that made the point even more bluntly. “The Jewish State,” he wrote, “has offered the Arab world peace since 1948; it has received war, and slaughter, and the rhetoric of annihilation.” He went on to argue that secularized Jews who “reject their birthright of ‘connection to the Divine’” succumb in time to a self-hatred tha
5 0.70214176 1282 andrew gelman stats-2012-04-26-Bad news about (some) statisticians
Introduction: Sociologist Fabio Rojas reports on “a conversation I [Rojas] have had a few times with statisticians”: Rojas: “What does your research tell us about a sample of, say, a few hundred cases?” Statistician: “That’s not important. My result works as n–> 00.” Rojas: “Sure, that’s a fine mathematical result, but I have to estimate the model with, like, totally finite data. I need inference, not limits. Maybe the estimate doesn’t work out so well for small n.” Statistician: “Sure, but if you have a few million cases, it’ll work in the limit.” Rojas: “Whoa. Have you ever collected, like, real world network data? A million cases is hard to get.” The conversation continues in this frustrating vein. Rojas writes: This illustrates a fundamental issue in statistics (and other sciences). Once you formalize a model and work mathematically, you are tempted to focus on what is mathematically interesting instead of the underlying problem motivating the science. . . . We have the sam
6 0.69357121 372 andrew gelman stats-2010-10-27-A use for tables (really)
7 0.68318367 434 andrew gelman stats-2010-11-28-When Small Numbers Lead to Big Errors
8 0.67719316 840 andrew gelman stats-2011-08-05-An example of Bayesian model averaging
9 0.67329162 1660 andrew gelman stats-2013-01-08-Bayesian, Permutable Symmetries
10 0.65856206 239 andrew gelman stats-2010-08-28-The mathematics of democracy
11 0.65136355 1871 andrew gelman stats-2013-05-27-Annals of spam
12 0.64838743 1597 andrew gelman stats-2012-11-29-What is expected of a consultant
13 0.63167799 1858 andrew gelman stats-2013-05-15-Reputations changeable, situations tolerable
14 0.62527049 1777 andrew gelman stats-2013-03-26-Data Science for Social Good summer fellowship program
15 0.62466007 1348 andrew gelman stats-2012-05-27-Question 17 of my final exam for Design and Analysis of Sample Surveys
16 0.60817665 244 andrew gelman stats-2010-08-30-Useful models, model checking, and external validation: a mini-discussion
17 0.5964877 2203 andrew gelman stats-2014-02-08-“Guys who do more housework get less sex”
18 0.59508353 2287 andrew gelman stats-2014-04-09-Advice: positive-sum, zero-sum, or negative-sum
19 0.59304088 2361 andrew gelman stats-2014-06-06-Hurricanes vs. Himmicanes
20 0.59246719 1564 andrew gelman stats-2012-11-06-Choose your default, or your default will choose you (election forecasting edition)