andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1109 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: John Eppley asks what I make of this : Eppley is guessing the negative spikes are searches getting swamped by holiday season shoppers.
sentIndex sentText sentNum sentScore
1 John Eppley asks what I make of this : Eppley is guessing the negative spikes are searches getting swamped by holiday season shoppers. [sent-1, score-1.697]
wordName wordTfidf (topN-words)
[('eppley', 0.709), ('shoppers', 0.323), ('searches', 0.304), ('swamped', 0.291), ('holiday', 0.254), ('season', 0.228), ('guessing', 0.173), ('asks', 0.157), ('negative', 0.14), ('john', 0.114), ('getting', 0.097), ('make', 0.053)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 1109 andrew gelman stats-2012-01-09-Google correlate links statistics with minorities
2 0.14523004 752 andrew gelman stats-2011-06-08-Traffic Prediction
Introduction: I always thought traffic for a particular day and time could be easily predicted from historical data with regression. Google Maps now has this feature: It would be good to actually include season, holiday and similar information: the predictions would be better. I wonder if one can find this data easily, or if others have done this work before.
3 0.04522315 737 andrew gelman stats-2011-05-30-Memorial Day question
Introduction: When I was a kid they shifted a bunch of holidays to Monday. (Not all the holidays: they kept New Year’s, Christmas, and July 4th on fixed dates, they kept Thanksgiving on a Thursday, and for some reason the shifted Veterans Day didn’t stick. But they successfully moved Washington’s Birthday, Memorial Day, and Columbus Day.) It makes sense to give people a 3-day weekend. I have no idea why they picked Monday rather than Friday, but either one would do, I suppose. My question is: if this Monday holiday thing was such a good idea, why did it take them so long to do it?
4 0.044965945 2270 andrew gelman stats-2014-03-28-Creating a Lenin-style democracy
Introduction: Mark Palko explains why a penalty for getting the wrong answer on a test (the SAT, which is used in college admissions and which is used in the famous 8 schools example) is not a “penalty for guessing.” Then the very next day he catches this from Todd Balf in the New York Times Magazine: Students were docked one-quarter point for every multiple-choice question they got wrong, requiring a time-consuming risk analysis to determine which questions to answer and which to leave blank. Ugh! That just makes me want to . . . ok, I won’t go there. Anyway, Palko goes to the trouble to explain: While time management for a test like the SAT can be complicated, the rule for guessing is embarrassingly simple: give your best guess for questions you read; don’t waste time guessing on questions that you didn’t have time to read. The risk analysis actually becomes much more complicated when you take away the penalty for guessing. On the ACT (or the new SAT), there is a positive
5 0.042109121 1167 andrew gelman stats-2012-02-14-Extra babies on Valentine’s Day, fewer on Halloween?
Introduction: Just in time for the holiday, X pointed me to an article by Becca Levy, Pil Chung, and Martin Slade reporting that, during a recent eleven-year period, more babies were born on Valentine’s Day and fewer on Halloween compared to neighboring days: What I’d really like to see is a graph with all 366 days of the year. It would be easy enough to make. That way we could put the Valentine’s and Halloween data in the context of other possible patterns. While they’re at it, they could also graph births by day of the week and show Thanksgiving, Easter, and other holidays that don’t have fixed dates. It’s so frustrating when people only show part of the story. The data are publicly available, so maybe someone could make those graphs? If the Valentine’s/Halloween data are worth publishing, I think more comprehensive graphs should be publishable as well. I’d post them here, that’s for sure.
6 0.041067105 703 andrew gelman stats-2011-05-10-Bringing Causal Models Into the Mainstream
7 0.040360272 2054 andrew gelman stats-2013-10-07-Bing is preferred to Google by people who aren’t like me
8 0.039131016 1671 andrew gelman stats-2013-01-13-Preregistration of Studies and Mock Reports
9 0.037398741 2068 andrew gelman stats-2013-10-18-G+ hangout for Bayesian Data Analysis course now! (actually, in 5 minutes)
10 0.0366829 949 andrew gelman stats-2011-10-10-Grrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
11 0.035739467 1098 andrew gelman stats-2012-01-04-Bayesian Page Rank?
12 0.033259716 1851 andrew gelman stats-2013-05-11-Actually, I have no problem with this graph
13 0.032827709 1918 andrew gelman stats-2013-06-29-Going negative
14 0.032433197 1933 andrew gelman stats-2013-07-10-Please send all comments to -dev-ripley
15 0.032073263 183 andrew gelman stats-2010-08-04-Bayesian models for simultaneous equation systems?
16 0.032070547 185 andrew gelman stats-2010-08-04-Why does anyone support private macroeconomic forecasts?
17 0.030803232 1174 andrew gelman stats-2012-02-18-Not as ugly as you look
18 0.030410176 2233 andrew gelman stats-2014-03-04-Literal vs. rhetorical
19 0.029277626 1101 andrew gelman stats-2012-01-05-What are the standards for reliability in experimental psychology?
20 0.028247112 1406 andrew gelman stats-2012-07-05-Xiao-Li Meng and Xianchao Xie rethink asymptotics
topicId topicWeight
[(0, 0.018), (1, -0.01), (2, -0.0), (3, 0.004), (4, 0.004), (5, -0.004), (6, 0.007), (7, 0.0), (8, 0.006), (9, 0.001), (10, -0.001), (11, -0.001), (12, 0.007), (13, -0.009), (14, -0.011), (15, 0.007), (16, -0.004), (17, 0.01), (18, 0.007), (19, -0.006), (20, -0.001), (21, 0.011), (22, -0.001), (23, -0.005), (24, -0.001), (25, 0.0), (26, 0.005), (27, 0.003), (28, -0.008), (29, -0.007), (30, 0.003), (31, -0.003), (32, 0.02), (33, -0.014), (34, 0.013), (35, 0.009), (36, -0.003), (37, -0.003), (38, -0.008), (39, 0.005), (40, 0.001), (41, 0.002), (42, -0.014), (43, 0.005), (44, -0.015), (45, 0.012), (46, 0.004), (47, -0.008), (48, 0.034), (49, -0.005)]
simIndex simValue blogId blogTitle
same-blog 1 0.93281668 1109 andrew gelman stats-2012-01-09-Google correlate links statistics with minorities
2 0.53366959 1255 andrew gelman stats-2012-04-10-Amtrak sucks
Introduction: Couldn’t they at least let me buy my tickets from Amazon so I wouldn’t have to re-enter the credit card information each time? Yeah, yeah, I know it’s no big deal. It just seems so silly.
3 0.49086401 131 andrew gelman stats-2010-07-07-A note to John
Introduction: Jeff the Productivity Sapper points me to this insulting open letter to Nate Silver written by pollster John Zogby. I’ll go through bits of Zogby’s note line by line. (Conflict of interest warning: I have collaborated with Nate and I blog on his site). Zogby writes: Here is some advice from someone [Zogby] who has been where you [Silver] are today. Sorry, John. (I can call you that, right? Since you’re calling Nate “Nate”?). Yes, you were once the hot pollster. But, no, you were never where Nate is today. Don’t kid yourself. Zogby writes: You [Nate] are hot right now – using an aggregate of other people’s work, you got 49 of 50 states right in 2008. Yes, Nate used other people’s work. That’s what’s called “making use of available data.” Or, to use a more technical term employed in statistics, it’s called “not being an idiot.” Only in the wacky world of polling are you supposed to draw inferences about the U.S.A. using only a single survey organization. I do
4 0.48566088 1808 andrew gelman stats-2013-04-17-Excel-bashing
Introduction: In response to the latest controversy , a statistics professor writes: It’s somewhat surprising to see Very Serious Researchers (apologies to Paul Krugman) using Excel. Some years ago, I was consulting on a trademark infringement case and was trying (unsuccessfully) to replicate another expert’s regression analysis. It wasn’t until I had the brainstorm to use Excel that I was able to reproduce his results – it may be better now, but at the time, Excel could propagate round-off error and catastrophically cancel like no other software! Microsoft has lots of top researchers so it’s hard for me to understand how Excel can remain so crappy. I mean, sure, I understand in some general way that they have a large user base, it’s hard to maintain backward compatibility, there’s feature creep, and, besides all that, lots of people have different preferences in data analysis than I do. But still, it’s such a joke. Word has problems too, but I can see how these problems arise from its d
5 0.48543468 1553 andrew gelman stats-2012-10-30-Real rothko, fake rothko
Introduction: Jay Livingston writes : I know that in art, quality and value are two very different things. Still, I had to stop and wonder when I read about Domenico and Eleanore De Sole, who in 2004 paid $8.3 million for a painting attributed to Mark Rothko that they now say is a worthless fake. One day a painting is worth $8.3 million; the next day, the same painting – same quality, same capacity to give aesthetic pleasure or do whatever it is that art does – is “worthless.”* Art forgery also makes me wonder about the buyer’s motive. If the buyer wanted only to have and to gaze upon something beautiful, something with artistic merit, then a fake Rothko is no different than a real Rothko. It seems more likely that what the buyer wants is to own something valuable – i.e., something that costs a lot. Displaying your brokerage account statements is just too crude and obvious. What the high-end art market offers is a kind of money laundering. Objects that are rare and therefore expensive
6 0.48396337 364 andrew gelman stats-2010-10-22-Politics is not a random walk: Momentum and mean reversion in polling
7 0.47667938 954 andrew gelman stats-2011-10-12-Benford’s Law suggests lots of financial fraud
8 0.46879697 1694 andrew gelman stats-2013-01-26-Reflections on ethicsblogging
9 0.45741203 1874 andrew gelman stats-2013-05-28-Nostalgia
10 0.45691705 1759 andrew gelman stats-2013-03-12-How tall is Jon Lee Anderson?
11 0.45656943 752 andrew gelman stats-2011-06-08-Traffic Prediction
13 0.4512406 211 andrew gelman stats-2010-08-17-Deducer update
14 0.45063838 563 andrew gelman stats-2011-02-07-Evaluating predictions of political events
15 0.44922709 2197 andrew gelman stats-2014-02-04-Peabody here.
16 0.44763482 300 andrew gelman stats-2010-09-28-A calibrated Cook gives Dems the edge in Nov, sez Sandy
17 0.44662213 1758 andrew gelman stats-2013-03-11-Yes, the decision to try (or not) to have a child can be made rationally
18 0.44307116 1804 andrew gelman stats-2013-04-15-How effective are football coaches?
19 0.44202635 424 andrew gelman stats-2010-11-21-Data cleaning tool!
topicId topicWeight
[(12, 0.069), (16, 0.048), (27, 0.062), (64, 0.426), (99, 0.149)]
simIndex simValue blogId blogTitle
same-blog 1 0.90519142 1109 andrew gelman stats-2012-01-09-Google correlate links statistics with minorities
2 0.69058615 985 andrew gelman stats-2011-11-01-Doug Schoen has 2 poll reports
Introduction: According to Chris Wilson , there are two versions of the report of the Occupy Wall Street poll from so-called hack pollster Doug Schoen. Here’s the report that Azi Paybarah says that Schoen sent to him, and here’s the final question from the poll: And here’s what’s on Schoen’s own website: Very similar, except for that last phrase, “no matter what the cost.” I have no idea which was actually asked to the survey participants, but it’s a reminder of the difficulties of public opinion research—sometimes you don’t even know what question was asked! I’m not implying anything sinister on Schoen’s part, it’s just interesting to see these two documents floating around. P.S. More here from Kaiser Fung on fundamental flaws with Schoen’s poll.
3 0.60926676 724 andrew gelman stats-2011-05-21-New search engine for data & statistics
Introduction: Jon Goldhill points us to a new search engine, Zanran, which is for finding data and statistics. Goldhill writes: It’s useful when you’re looking for a graph/table rather than a single number. For example, if you look for ‘teenage births rates in the united states’ in Zanran you’ll see a series of graphs. If you check in Google, there’s plenty of material – but you’d have to open everything up to see if it had any real numbers. (I hope you’ll appreciate Zanran’s preview capability as well – hovering over the icons gives a useful preview of the content.)
4 0.60596216 595 andrew gelman stats-2011-02-28-What Zombies see in Scatterplots
Introduction: This video caught my interest – news video clip (from this post2 ) http://www.stat.columbia.edu/~cook/movabletype/archives/2011/02/on_summarizing.html The news commentator did seem to be trying to point out what a couple of states had to say about the claimed relationship – almost on their own. Some methods have been worked out for zombies to do just this! So I grabbed the data as close as I quickly could, modified the code slightly and here’s the zombie view of it. PoliticInt.pdf North Carolina is the bolded red curve, Idaho the bolded green curve. Mississippi and New York are the bolded blue. As ugly as it is this is the Bayesian marginal picture – exactly (given MCMC error). K? p.s. you will get a very confusing picture if you forget to centre the x (i.e. see chapter 4 of Gelman and Hill book)
5 0.53761983 1521 andrew gelman stats-2012-10-04-Columbo does posterior predictive checks
Introduction: I’m already on record as saying that Ronald Reagan was a statistician so I think this is ok too . . . Here’s what Columbo does. He hears the killer’s story and he takes it very seriously (it’s murder, and Columbo never jokes about murder), examines all its implications, and finds where it doesn’t fit the data. Then Columbo carefully examines the discrepancies, tries some model expansion, and eventually concludes that he’s proved there’s a problem. OK, now you’re saying: Yeah, yeah, sure, but how does that differ from any other fictional detective? The difference, I think, is that the tradition is for the detective to find clues and use these to come up with hypotheses, or to trap the killer via internal contradictions in his or her statement. I see Columbo is different—and more in keeping with chapter 6 of Bayesian Data Analysis—in that he is taking the killer’s story seriously and exploring all its implications. That’s the essence of predictive model checking: you t
6 0.53755462 1653 andrew gelman stats-2013-01-04-Census dotmap
7 0.52661085 118 andrew gelman stats-2010-06-30-Question & Answer Communities
8 0.51095271 1058 andrew gelman stats-2011-12-14-Higgs bozos: Rosencrantz and Guildenstern are spinning in their graves
9 0.45719534 11 andrew gelman stats-2010-04-29-Auto-Gladwell, or Can fractals be used to predict human history?
11 0.40204319 304 andrew gelman stats-2010-09-29-Data visualization marathon
12 0.39420396 1637 andrew gelman stats-2012-12-24-Textbook for data visualization?
13 0.38715053 1554 andrew gelman stats-2012-10-31-It not necessary that Bayesian methods conform to the likelihood principle
14 0.37972358 2249 andrew gelman stats-2014-03-15-Recently in the sister blog
15 0.37712529 1949 andrew gelman stats-2013-07-21-Defensive political science responds defensively to an attack on social science
16 0.37452486 2221 andrew gelman stats-2014-02-23-Postdoc with Huffpost Pollster to do Bayesian poll tracking
17 0.36432844 1761 andrew gelman stats-2013-03-13-Lame Statistics Patents
18 0.35877332 1727 andrew gelman stats-2013-02-19-Beef with data
19 0.34946582 1119 andrew gelman stats-2012-01-15-Excellence in Statistical Reporting Award
20 0.34898376 1008 andrew gelman stats-2011-11-13-Student project competition