andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1109 knowledge-graph by maker-knowledge-mining

1109 andrew gelman stats-2012-01-09-Google correlate links statistics with minorities

meta info for this blog

Source: html

Introduction: John Eppley asks what I make of this: Eppley is guessing the negative spikes are searches getting swamped by holiday season shoppers.

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 John Eppley asks what I make of this: Eppley is guessing the negative spikes are searches getting swamped by holiday season shoppers. [sent-1, score-1.697]

similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('eppley', 0.709), ('shoppers', 0.323), ('searches', 0.304), ('swamped', 0.291), ('holiday', 0.254), ('season', 0.228), ('guessing', 0.173), ('asks', 0.157), ('negative', 0.14), ('john', 0.114), ('getting', 0.097), ('make', 0.053)]
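
As a quick sanity check on the weights above: the twelve printed values appear to form a vector of roughly unit length, which is what one would expect if the tf-idf vector was L2-normalized before similarities were computed. The sketch below only recomputes that norm from the printed weights. The variable names are illustrative, and nothing here is taken from the pipeline that generated this page.

```python
import math

# Top-N tf-idf weights copied from the list above.
word_tfidf = [
    ('eppley', 0.709), ('shoppers', 0.323), ('searches', 0.304),
    ('swamped', 0.291), ('holiday', 0.254), ('season', 0.228),
    ('guessing', 0.173), ('asks', 0.157), ('negative', 0.14),
    ('john', 0.114), ('getting', 0.097), ('make', 0.053),
]

# Euclidean (L2) norm of the weight vector.
l2_norm = math.sqrt(sum(weight * weight for _, weight in word_tfidf))

# The printed weights are rounded to three decimals, so expect a value
# very close to, but not exactly, 1.
print(l2_norm)
```

When two vectors both have unit length, their cosine similarity reduces to a plain dot product, which may be why a normalized representation is convenient for the similarity lists that follow.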

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1109 andrew gelman stats-2012-01-09-Google correlate links statistics with minorities

Introduction: John Eppley asks what I make of this: Eppley is guessing the negative spikes are searches getting swamped by holiday season shoppers.

2 0.14523004 752 andrew gelman stats-2011-06-08-Traffic Prediction

Introduction: I always thought traffic for a particular day and time could easily be predicted from historical data with regression. Google Maps now has this feature: It would be good to actually include season, holiday and similar information: the predictions would be better. I wonder if one can find this data easily, or if others have done this work before.

3 0.04522315 737 andrew gelman stats-2011-05-30-Memorial Day question

Introduction: When I was a kid they shifted a bunch of holidays to Monday. (Not all the holidays: they kept New Year’s, Christmas, and July 4th on fixed dates, they kept Thanksgiving on a Thursday, and for some reason the shifted Veterans Day didn’t stick. But they successfully moved Washington’s Birthday, Memorial Day, and Columbus Day.) It makes sense to give people a 3-day weekend. I have no idea why they picked Monday rather than Friday, but either one would do, I suppose. My question is: if this Monday holiday thing was such a good idea, why did it take them so long to do it?

4 0.044965945 2270 andrew gelman stats-2014-03-28-Creating a Lenin-style democracy

Introduction: Mark Palko explains why a penalty for getting the wrong answer on a test (the SAT, which is used in college admissions and which is used in the famous 8 schools example) is not a “penalty for guessing.” Then the very next day he catches this from Todd Balf in the New York Times Magazine: Students were docked one-quarter point for every multiple-choice question they got wrong, requiring a time-consuming risk analysis to determine which questions to answer and which to leave blank. Ugh! That just makes me want to . . . ok, I won’t go there. Anyway, Palko goes to the trouble to explain: While time management for a test like the SAT can be complicated, the rule for guessing is embarrassingly simple: give your best guess for questions you read; don’t waste time guessing on questions that you didn’t have time to read. The risk analysis actually becomes much more complicated when you take away the penalty for guessing. On the ACT (or the new SAT), there is a positive

5 0.042109121 1167 andrew gelman stats-2012-02-14-Extra babies on Valentine’s Day, fewer on Halloween?

Introduction: Just in time for the holiday, X pointed me to an article by Becca Levy, Pil Chung, and Martin Slade reporting that, during a recent eleven-year period, more babies were born on Valentine’s Day and fewer on Halloween compared to neighboring days: What I’d really like to see is a graph with all 366 days of the year. It would be easy enough to make. That way we could put the Valentine’s and Halloween data in the context of other possible patterns. While they’re at it, they could also graph births by day of the week and show Thanksgiving, Easter, and other holidays that don’t have fixed dates. It’s so frustrating when people only show part of the story. The data are publicly available, so maybe someone could make those graphs? If the Valentine’s/Halloween data are worth publishing, I think more comprehensive graphs should be publishable as well. I’d post them here, that’s for sure.

6 0.041067105 703 andrew gelman stats-2011-05-10-Bringing Causal Models Into the Mainstream

7 0.040360272 2054 andrew gelman stats-2013-10-07-Bing is preferred to Google by people who aren’t like me

8 0.039131016 1671 andrew gelman stats-2013-01-13-Preregistration of Studies and Mock Reports

9 0.037398741 2068 andrew gelman stats-2013-10-18-G+ hangout for Bayesian Data Analysis course now! (actually, in 5 minutes)

10 0.0366829 949 andrew gelman stats-2011-10-10-Grrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr

11 0.035739467 1098 andrew gelman stats-2012-01-04-Bayesian Page Rank?

12 0.033259716 1851 andrew gelman stats-2013-05-11-Actually, I have no problem with this graph

13 0.032827709 1918 andrew gelman stats-2013-06-29-Going negative

14 0.032433197 1933 andrew gelman stats-2013-07-10-Please send all comments to -dev-ripley

15 0.032073263 183 andrew gelman stats-2010-08-04-Bayesian models for simultaneous equation systems?

16 0.032070547 185 andrew gelman stats-2010-08-04-Why does anyone support private macroeconomic forecasts?

17 0.030803232 1174 andrew gelman stats-2012-02-18-Not as ugly as you look

18 0.030410176 2233 andrew gelman stats-2014-03-04-Literal vs. rhetorical

19 0.029277626 1101 andrew gelman stats-2012-01-05-What are the standards for reliability in experimental psychology?

20 0.028247112 1406 andrew gelman stats-2012-07-05-Xiao-Li Meng and Xianchao Xie rethink asymptotics

similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.018), (1, -0.01), (2, -0.0), (3, 0.004), (4, 0.004), (5, -0.004), (6, 0.007), (7, 0.0), (8, 0.006), (9, 0.001), (10, -0.001), (11, -0.001), (12, 0.007), (13, -0.009), (14, -0.011), (15, 0.007), (16, -0.004), (17, 0.01), (18, 0.007), (19, -0.006), (20, -0.001), (21, 0.011), (22, -0.001), (23, -0.005), (24, -0.001), (25, 0.0), (26, 0.005), (27, 0.003), (28, -0.008), (29, -0.007), (30, 0.003), (31, -0.003), (32, 0.02), (33, -0.014), (34, 0.013), (35, 0.009), (36, -0.003), (37, -0.003), (38, -0.008), (39, 0.005), (40, 0.001), (41, 0.002), (42, -0.014), (43, 0.005), (44, -0.015), (45, 0.012), (46, 0.004), (47, -0.008), (48, 0.034), (49, -0.005)]
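
With fifty small weights, it can be hard to see which LSI topics matter for this post. The sketch below ranks the printed weights by absolute value, since the sign of an LSI weight only indicates direction along a topic axis. The helper name `top_topics` is hypothetical and is not part of whatever tool produced this page.

```python
# LSI topic weights copied from the list above: (topicId, topicWeight).
lsi_weights = [
    (0, 0.018), (1, -0.01), (2, -0.0), (3, 0.004), (4, 0.004),
    (5, -0.004), (6, 0.007), (7, 0.0), (8, 0.006), (9, 0.001),
    (10, -0.001), (11, -0.001), (12, 0.007), (13, -0.009), (14, -0.011),
    (15, 0.007), (16, -0.004), (17, 0.01), (18, 0.007), (19, -0.006),
    (20, -0.001), (21, 0.011), (22, -0.001), (23, -0.005), (24, -0.001),
    (25, 0.0), (26, 0.005), (27, 0.003), (28, -0.008), (29, -0.007),
    (30, 0.003), (31, -0.003), (32, 0.02), (33, -0.014), (34, 0.013),
    (35, 0.009), (36, -0.003), (37, -0.003), (38, -0.008), (39, 0.005),
    (40, 0.001), (41, 0.002), (42, -0.014), (43, 0.005), (44, -0.015),
    (45, 0.012), (46, 0.004), (47, -0.008), (48, 0.034), (49, -0.005),
]


def top_topics(weights, k=5):
    """Return the k (topicId, weight) pairs with the largest absolute weight."""
    return sorted(weights, key=lambda pair: abs(pair[1]), reverse=True)[:k]


# Show the three topics that carry the most weight for this post.
for topic_id, weight in top_topics(lsi_weights, k=3):
    print(topic_id, weight)
```

All of the weights are small here, which is plausible for a one-sentence post: there is very little text for any topic to load on.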

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.93281668 1109 andrew gelman stats-2012-01-09-Google correlate links statistics with minorities

Introduction: John Eppley asks what I make of this: Eppley is guessing the negative spikes are searches getting swamped by holiday season shoppers.

2 0.53366959 1255 andrew gelman stats-2012-04-10-Amtrak sucks

Introduction: Couldn’t they at least let me buy my tickets from Amazon so I wouldn’t have to re-enter the credit card information each time? Yeah, yeah, I know it’s no big deal. It just seems so silly.

3 0.49086401 131 andrew gelman stats-2010-07-07-A note to John

Introduction: Jeff the Productivity Sapper points me to this insulting open letter to Nate Silver written by pollster John Zogby. I’ll go through bits of Zogby’s note line by line. (Conflict of interest warning: I have collaborated with Nate and I blog on his site). Zogby writes: Here is some advice from someone [Zogby] who has been where you [Silver] are today. Sorry, John. (I can call you that, right? Since you’re calling Nate “Nate”?). Yes, you were once the hot pollster. But, no, you were never where Nate is today. Don’t kid yourself. Zogby writes: You [Nate] are hot right now – using an aggregate of other people’s work, you got 49 of 50 states right in 2008. Yes, Nate used other people’s work. That’s what’s called “making use of available data.” Or, to use a more technical term employed in statistics, it’s called “not being an idiot.” Only in the wacky world of polling are you supposed to draw inferences about the U.S.A. using only a single survey organization. I do

4 0.48566088 1808 andrew gelman stats-2013-04-17-Excel-bashing

Introduction: In response to the latest controversy, a statistics professor writes: It’s somewhat surprising to see Very Serious Researchers (apologies to Paul Krugman) using Excel. Some years ago, I was consulting on a trademark infringement case and was trying (unsuccessfully) to replicate another expert’s regression analysis. It wasn’t until I had the brainstorm to use Excel that I was able to reproduce his results – it may be better now, but at the time, Excel could propagate round-off error and catastrophically cancel like no other software! Microsoft has lots of top researchers so it’s hard for me to understand how Excel can remain so crappy. I mean, sure, I understand in some general way that they have a large user base, it’s hard to maintain backward compatibility, there’s feature creep, and, besides all that, lots of people have different preferences in data analysis than I do. But still, it’s such a joke. Word has problems too, but I can see how these problems arise from its d

5 0.48543468 1553 andrew gelman stats-2012-10-30-Real rothko, fake rothko

Introduction: Jay Livingston writes: I know that in art, quality and value are two very different things. Still, I had to stop and wonder when I read about Domenico and Eleanore De Sole, who in 2004 paid $8.3 million for a painting attributed to Mark Rothko that they now say is a worthless fake. One day a painting is worth $8.3 million; the next day, the same painting – same quality, same capacity to give aesthetic pleasure or do whatever it is that art does – is “worthless.”* Art forgery also makes me wonder about the buyer’s motive. If the buyer wanted only to have and to gaze upon something beautiful, something with artistic merit, then a fake Rothko is no different than a real Rothko. It seems more likely that what the buyer wants is to own something valuable – i.e., something that costs a lot. Displaying your brokerage account statements is just too crude and obvious. What the high-end art market offers is a kind of money laundering. Objects that are rare and therefore expensive

6 0.48396337 364 andrew gelman stats-2010-10-22-Politics is not a random walk: Momentum and mean reversion in polling

7 0.47667938 954 andrew gelman stats-2011-10-12-Benford’s Law suggests lots of financial fraud

8 0.46879697 1694 andrew gelman stats-2013-01-26-Reflections on ethicsblogging

9 0.45741203 1874 andrew gelman stats-2013-05-28-Nostalgia

10 0.45691705 1759 andrew gelman stats-2013-03-12-How tall is Jon Lee Anderson?

11 0.45656943 752 andrew gelman stats-2011-06-08-Traffic Prediction

12 0.45256984 1390 andrew gelman stats-2012-06-23-Traditionalist claims that modern art could just as well be replaced by a “paint-throwing chimp”

13 0.4512406 211 andrew gelman stats-2010-08-17-Deducer update

14 0.45063838 563 andrew gelman stats-2011-02-07-Evaluating predictions of political events

15 0.44922709 2197 andrew gelman stats-2014-02-04-Peabody here.

16 0.44763482 300 andrew gelman stats-2010-09-28-A calibrated Cook gives Dems the edge in Nov, sez Sandy

17 0.44662213 1758 andrew gelman stats-2013-03-11-Yes, the decision to try (or not) to have a child can be made rationally

18 0.44307116 1804 andrew gelman stats-2013-04-15-How effective are football coaches?

19 0.44202635 424 andrew gelman stats-2010-11-21-Data cleaning tool!

20 0.44057536 1213 andrew gelman stats-2012-03-15-Economics now = Freudian psychology in the 1950s: More on the incoherence of “economics exceptionalism”

similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(12, 0.069), (16, 0.048), (27, 0.062), (64, 0.426), (99, 0.149)]
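
The page does not say which measure produces the simValue column below. A common choice for comparing topic vectors is cosine similarity, so the sketch here assumes that measure and applies it to sparse vectors stored as dictionaries. The function name `cosine` and the second post's weights are invented for illustration. Only `this_post` comes from the list above.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as {topicId: weight}."""
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(weight * v[topic] for topic, weight in u.items() if topic in v)
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # Convention: an empty vector is similar to nothing.
    return dot / (norm_u * norm_v)


# LDA topic weights for this post, copied from the list above.
this_post = dict([(12, 0.069), (16, 0.048), (27, 0.062), (64, 0.426), (99, 0.149)])

# Hypothetical second post, invented purely for illustration.
other_post = {5: 0.20, 64: 0.30, 99: 0.10}

print(cosine(this_post, other_post))
```

Under exact cosine similarity a post compared with itself would score 1.0. The same-blog row below shows a lower value, so the original tool may use an approximate index or a different measure; the sketch should be read as one plausible reading, not a reconstruction.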

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.90519142 1109 andrew gelman stats-2012-01-09-Google correlate links statistics with minorities

Introduction: John Eppley asks what I make of this: Eppley is guessing the negative spikes are searches getting swamped by holiday season shoppers.

2 0.69058615 985 andrew gelman stats-2011-11-01-Doug Schoen has 2 poll reports

Introduction: According to Chris Wilson, there are two versions of the report of the Occupy Wall Street poll from so-called hack pollster Doug Schoen. Here’s the report that Azi Paybarah says that Schoen sent to him, and here’s the final question from the poll: And here’s what’s on Schoen’s own website: Very similar, except for that last phrase, “no matter what the cost.” I have no idea which was actually asked to the survey participants, but it’s a reminder of the difficulties of public opinion research: sometimes you don’t even know what question was asked! I’m not implying anything sinister on Schoen’s part, it’s just interesting to see these two documents floating around. P.S. More here from Kaiser Fung on fundamental flaws with Schoen’s poll.

3 0.60926676 724 andrew gelman stats-2011-05-21-New search engine for data & statistics

Introduction: Jon Goldhill points us to a new search engine, Zanran, which is for finding data and statistics. Goldhill writes: It’s useful when you’re looking for a graph/table rather than a single number. For example, if you look for ‘teenage births rates in the united states’ in Zanran you’ll see a series of graphs. If you check in Google, there’s plenty of material – but you’d have to open everything up to see if it had any real numbers. (I hope you’ll appreciate Zanran’s preview capability as well – hovering over the icons gives a useful preview of the content.)

4 0.60596216 595 andrew gelman stats-2011-02-28-What Zombies see in Scatterplots

Introduction: This video caught my interest – news video clip (from this post2) http://www.stat.columbia.edu/~cook/movabletype/archives/2011/02/on_summarizing.html The news commentator did seem to be trying to point out what a couple of states had to say about the claimed relationship – almost on their own. Some methods have been worked out for zombies to do just this! So I grabbed the data as close as I quickly could, modified the code slightly and here’s the zombie view of it. PoliticInt.pdf North Carolina is the bolded red curve, Idaho the bolded green curve. Mississippi and New York are the bolded blue. As ugly as it is this is the Bayesian marginal picture – exactly (given MCMC error). K? p.s. you will get a very confusing picture if you forget to centre the x (i.e. see chapter 4 of Gelman and Hill book)

5 0.53761983 1521 andrew gelman stats-2012-10-04-Columbo does posterior predictive checks

Introduction: I’m already on record as saying that Ronald Reagan was a statistician so I think this is ok too . . . Here’s what Columbo does. He hears the killer’s story and he takes it very seriously (it’s murder, and Columbo never jokes about murder), examines all its implications, and finds where it doesn’t fit the data. Then Columbo carefully examines the discrepancies, tries some model expansion, and eventually concludes that he’s proved there’s a problem. OK, now you’re saying: Yeah, yeah, sure, but how does that differ from any other fictional detective? The difference, I think, is that the tradition is for the detective to find clues and use these to come up with hypotheses, or to trap the killer via internal contradictions in his or her statement. I see Columbo is different—and more in keeping with chapter 6 of Bayesian Data Analysis—in that he is taking the killer’s story seriously and exploring all its implications. That’s the essence of predictive model checking: you t

6 0.53755462 1653 andrew gelman stats-2013-01-04-Census dotmap

7 0.52661085 118 andrew gelman stats-2010-06-30-Question & Answer Communities

8 0.51095271 1058 andrew gelman stats-2011-12-14-Higgs bozos: Rosencrantz and Guildenstern are spinning in their graves

9 0.45719534 11 andrew gelman stats-2010-04-29-Auto-Gladwell, or Can fractals be used to predict human history?

10 0.41071528 977 andrew gelman stats-2011-10-27-Hack pollster Doug Schoen illustrates a general point: The #1 way to lie with statistics is . . . to just lie!

11 0.40204319 304 andrew gelman stats-2010-09-29-Data visualization marathon

12 0.39420396 1637 andrew gelman stats-2012-12-24-Textbook for data visualization?

13 0.38715053 1554 andrew gelman stats-2012-10-31-It not necessary that Bayesian methods conform to the likelihood principle

14 0.37972358 2249 andrew gelman stats-2014-03-15-Recently in the sister blog

15 0.37712529 1949 andrew gelman stats-2013-07-21-Defensive political science responds defensively to an attack on social science

16 0.37452486 2221 andrew gelman stats-2014-02-23-Postdoc with Huffpost Pollster to do Bayesian poll tracking

17 0.36432844 1761 andrew gelman stats-2013-03-13-Lame Statistics Patents

18 0.35877332 1727 andrew gelman stats-2013-02-19-Beef with data

19 0.34946582 1119 andrew gelman stats-2012-01-15-Excellence in Statistical Reporting Award

20 0.34898376 1008 andrew gelman stats-2011-11-13-Student project competition