
981 andrew gelman stats-2011-10-30-rms2

meta info for this blog

Source: html

Introduction: In case you just can’t get enough, check out this amusing interview. The interview is from the year 2000 (I think) but it reads like it could’ve been done yesterday.

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 In case you just can’t get enough, check out this amusing interview. [sent-1, score-0.994]

2 The interview is from the year 2000 (I think) but it reads like it could’ve been done yesterday. [sent-2, score-1.405]

similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('reads', 0.466), ('amusing', 0.439), ('interview', 0.408), ('yesterday', 0.358), ('check', 0.293), ('year', 0.228), ('done', 0.223), ('enough', 0.182), ('case', 0.158), ('ve', 0.125), ('could', 0.107), ('get', 0.104), ('like', 0.08), ('think', 0.08)]
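
The page does not state how these word weights were produced. As a rough illustration only, the sketch below assumes a conventional smoothed tf-idf with L2 normalization over a toy corpus; the function name `tfidf_top_words`, the whitespace tokenizer, and the sample documents are hypothetical and are not taken from the tool that generated this page.

```python
import math
from collections import Counter


def tfidf_top_words(docs, doc_index, top_n=5):
    """Return the top_n (word, weight) pairs for docs[doc_index].

    Assumption: smoothed idf = ln((1 + N) / (1 + df)) + 1, raw term counts,
    and L2 normalization of the document vector. The real pipeline may differ.
    """
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each word.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    # Raw tf-idf weights for the chosen document.
    tf = Counter(tokenized[doc_index])
    weights = {
        word: count * (math.log((1 + n_docs) / (1 + df[word])) + 1)
        for word, count in tf.items()
    }

    # Scale the vector to unit length so weights are comparable across documents.
    norm = math.sqrt(sum(value * value for value in weights.values()))
    ranked = sorted(
        ((word, round(value / norm, 3)) for word, value in weights.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )
    return ranked[:top_n]


# Toy corpus (hypothetical), loosely echoing words from the listing above.
docs = [
    "amusing interview reads like yesterday",
    "interview questions about programming",
    "amusing authorship analysis",
]
top_words = tfidf_top_words(docs, 0, top_n=3)
```

Words that appear in fewer documents get a larger idf factor, so under these assumptions the rarer words in a post rise to the top of its list.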

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 981 andrew gelman stats-2011-10-30-rms2


2 0.2615599 409 andrew gelman stats-2010-11-11-“Tiny,” “Large,” “Very,” “Nice,” “Dumbest”

Introduction: Amusing authorship analysis.

3 0.15869561 2369 andrew gelman stats-2014-06-11-“I can’t drive home now. Not just yet. First I need to go to Utrecht.”

Introduction: EJ points me to this new techno-thriller. Based on the sentence quoted above, I don’t see it selling lots of copies. It reads like a really boring Raymond Chandler. I still think these two movie ideas would be a better sell.

4 0.15154192 505 andrew gelman stats-2011-01-05-Wacky interview questions: An exploration into the nature of evidence on the internet

Introduction: Gayle Laackmann reports (link from Felix Salmon) that Microsoft, Google, etc. don’t actually ask brain-teasers in their job interviews. They actually ask a lot of questions about programming. (I looked here and was relieved to see that the questions aren’t very hard. I could probably get a job as an entry-level programmer if I needed to.) Laackmann writes: Let’s look at the very widely circulated “15 Google Interview Questions that will make you feel stupid” list [here's the original list, I think, from Lewis Lin] . . . these questions are fake. Fake fake fake. How can you tell that they’re fake? Because one of them is “Why are manhole covers round?” This is an infamous Microsoft interview question that has since been so very, very banned at both companies. I find it very hard to believe that a Google interviewer asked such a question. We’ll get back to the manhole question in a bit. Laackmann reports that she never saw any IQ tests in three years of interviewi

5 0.14484821 1851 andrew gelman stats-2013-05-11-Actually, I have no problem with this graph

Introduction: Tom Salvesen asks, is this the worst info-graphic of the year? I say, no. Nobody really cares about these numbers. It’s an amusing feature. The alternative would not be a better display of these data, the alternative would be some photo or cartoon. They’re just having fun. I wouldn’t give it any design awards but it’s fine, it is what it is.

6 0.12393048 2283 andrew gelman stats-2014-04-06-An old discussion of food deserts

7 0.11383871 1514 andrew gelman stats-2012-09-28-AdviseStat 47% Campaign Ad

8 0.10426684 1488 andrew gelman stats-2012-09-08-Annals of spam

9 0.10331193 594 andrew gelman stats-2011-02-28-Behavioral economics doesn’t seem to have much to say about marriage

10 0.10060763 233 andrew gelman stats-2010-08-25-Lauryn Hill update

11 0.10056229 1640 andrew gelman stats-2012-12-26-What do people do wrong? WSJ columnist is looking for examples!

12 0.093328357 826 andrew gelman stats-2011-07-27-The Statistics Forum!

13 0.088337503 434 andrew gelman stats-2010-11-28-When Small Numbers Lead to Big Errors

14 0.081728794 699 andrew gelman stats-2011-05-06-Another stereotype demolished

15 0.078496173 2116 andrew gelman stats-2013-11-28-“Statistics is what people think math is”

16 0.078387029 554 andrew gelman stats-2011-02-04-An addition to the model-makers’ oath

17 0.076609254 1624 andrew gelman stats-2012-12-15-New prize on causality in statstistics education

18 0.076270938 444 andrew gelman stats-2010-12-02-Rational addiction

19 0.075607069 658 andrew gelman stats-2011-04-11-Statistics in high schools: Towards more accessible conceptions of statistical inference

20 0.073847376 2287 andrew gelman stats-2014-04-09-Advice: positive-sum, zero-sum, or negative-sum

similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.088), (1, -0.033), (2, 0.003), (3, 0.008), (4, 0.03), (5, 0.001), (6, 0.045), (7, 0.009), (8, 0.04), (9, -0.042), (10, 0.051), (11, -0.025), (12, -0.0), (13, 0.006), (14, -0.055), (15, -0.0), (16, 0.017), (17, -0.009), (18, 0.041), (19, 0.013), (20, 0.008), (21, -0.009), (22, 0.024), (23, -0.023), (24, -0.015), (25, 0.012), (26, -0.036), (27, -0.005), (28, -0.06), (29, -0.029), (30, 0.014), (31, 0.01), (32, -0.002), (33, -0.027), (34, -0.032), (35, -0.025), (36, -0.041), (37, 0.027), (38, -0.033), (39, 0.027), (40, 0.006), (41, 0.006), (42, -0.038), (43, 0.041), (44, -0.03), (45, -0.011), (46, -0.03), (47, -0.032), (48, -0.014), (49, 0.015)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.92077148 981 andrew gelman stats-2011-10-30-rms2


2 0.66405022 693 andrew gelman stats-2011-05-04-Don’t any statisticians work for the IRS?

Introduction: A friend asks the above question and writes: This article left me thinking – how could the IRS not notice that this guy didn’t file taxes for several years? Don’t they run checks and notice if you miss a year? If I write a check out of order, there’s an asterisk next to the check number in my next bank statement showing that there was a gap in the sequence. If you ran the IRS, wouldn’t you do this: SSNs are issued sequentially. Once a SSN reaches 18, expect it to file a return. If it doesn’t, mail out a postage paid letter asking why not with check boxes such as Student, Unemployed, etc. Follow up at reasonable intervals. Eventually every SSN should be filing a return, or have an international address. Yes this is intrusive, but my goal is only to maximize tax revenue. Surely people who do this for a living could come up with something more elegant. My response: I dunno, maybe some confidentiality rules? The other thing is that I’m guessing that IRS gets lots of pushback w

3 0.64007717 129 andrew gelman stats-2010-07-05-Unrelated to all else

Introduction: Another stereotype is affirmed when I go on the U.K. rail system webpage and it repeatedly times out on me. At one point I have a browser open with the itinerary I’m interested in, and then awhile later I reopen the window (not clicking on anything on the page, just bringing the window up on the screen) but it’s timed out again. P.S. Yes, yes, I know that Amtrak is worse. Still, it’s amusing to see a confirmation that, at least in one respect, the British trains are as bad as they say.

4 0.6329959 1882 andrew gelman stats-2013-06-03-The statistical properties of smart chains (and referral chains more generally)

Introduction: Louis Mittel writes: The premise of the column this guy is starting is interesting: Noah Davis interviews a smart person and then interviews the smartest person that smart person knows and so on. It reminded me of you mentioning survey design strategy of asking people about other people, like “How many people do you know named Stuart?” or “How many people do you know that have had an abortion?” Ignoring the interview aspect of what this guy is doing, I think there’s some cool questions about the distribution/path behavior of smartest-person-I-know chains (say, seeded at random). Do they loop? If so, how long do they run before looping, how large are the loops? What parts of the population do they explore? Do you know of anything that’s been done on something like this? My reply: Interesting question. It could be asked of any referral chain, for example asking a sequence of people, “Who’s the tallest person you know?” or “Who’s the best piano player you know” or “Who’

5 0.63271016 505 andrew gelman stats-2011-01-05-Wacky interview questions: An exploration into the nature of evidence on the internet

6 0.62921447 594 andrew gelman stats-2011-02-28-Behavioral economics doesn’t seem to have much to say about marriage

7 0.61964792 1563 andrew gelman stats-2012-11-05-Someone is wrong on the internet, part 2

8 0.6189642 1597 andrew gelman stats-2012-11-29-What is expected of a consultant

9 0.60840845 1995 andrew gelman stats-2013-08-23-“I mean, what exact buttons do I have to hit?”

10 0.6029911 1277 andrew gelman stats-2012-04-23-Infographic of the year

11 0.59928906 2300 andrew gelman stats-2014-04-21-Ticket to Baaaath

12 0.59672606 1640 andrew gelman stats-2012-12-26-What do people do wrong? WSJ columnist is looking for examples!

13 0.59301752 835 andrew gelman stats-2011-08-02-“The sky is the limit” isn’t such a good thing

14 0.58892232 995 andrew gelman stats-2011-11-06-Statistical models and actual models

15 0.58892196 1831 andrew gelman stats-2013-04-29-The Great Race

16 0.58702725 2347 andrew gelman stats-2014-05-25-Why I decided not to be a physicist

17 0.58559144 532 andrew gelman stats-2011-01-23-My Wall Street Journal story

18 0.58461881 626 andrew gelman stats-2011-03-23-Physics is hard

19 0.5835368 297 andrew gelman stats-2010-09-27-An interesting education and statistics blog

20 0.58242249 135 andrew gelman stats-2010-07-09-Rasmussen sez: “108% of Respondents Say . . .”

similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(15, 0.208), (24, 0.169), (99, 0.395)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.99715376 1794 andrew gelman stats-2013-04-09-My talks in DC and Baltimore this week

Introduction: U.S. Treasury, Office of Financial Research, Tues 9 Apr afternoon (I don’t actually know exactly when or in what room): Parameterization and Bayesian Modeling — Johns Hopkins University, Department of Biostatistics, 4pm Wed 10 Apr, room W2030 School of Public Health: Little data: How traditional statistical ideas remain relevant in a big-data world At the end of the day, after all the processing, big data are being used to answer little-data questions such as, Does an observed pattern generalize to the larger population?, or Could it be explained by alternative processes (sometimes called “chance”)? We discuss some recent ideas in the world of “little data” that remain of big importance.

2 0.99269772 133 andrew gelman stats-2010-07-08-Gratuitous use of “Bayesian Statistics,” a branding issue?

Introduction: I’m on an island in Maine for a few weeks (big shout out for North Haven!) This morning I picked up a copy of “Working Waterfront,” a newspaper that focuses on issues of coastal fishing communities. I came across an article about modeling “fish” populations — actually lobsters, I guess they’re considered “fish” for regulatory purposes. When I read it, I thought “wow, this article is really well-written, not dumbed down like articles in most newspapers.” I think it’s great that a small coastal newspaper carries reporting like this. (The online version has a few things that I don’t recall in the print version, too, so it’s even better). But in addition to being struck by finding such a good article in a small newspaper, I was struck by this: According to [University of Maine scientist Yong] Chen, there are four main areas where his model improved on the prior version. “We included the inshore trawl data from Maine and other state surveys, in addition to federal survey data; we h

3 0.98711145 1541 andrew gelman stats-2012-10-19-Statistical discrimination again

Introduction: Mark Johnstone writes: I’ve recently been investigating a new European Court of Justice ruling on insurance calculations (on behalf of MoneySuperMarket) and I found something related to statistics that caught my attention. . . . The ruling (which comes into effect in December 2012) states that insurers in Europe can no longer provide different premiums based on gender. Despite the fact that women are statistically safer drivers, unless it’s biologically proven there is a causal relationship between being female and being a safer driver, this is now seen as an act of discrimination (more on this from the Wall Street Journal). However, where do you stop with this? What about age? What about other factors? And what does this mean for the application of statistics in general? Is it inherently unjust in this context? One proposal has been to fit ‘black boxes’ into cars so more individual data can be collected, as opposed to relying heavily on aggregates. For fans of data and s

4 0.98526406 1908 andrew gelman stats-2013-06-21-Interpreting interactions in discrete-data regression

Introduction: Mike Johns writes: Are you familiar with the work of Ai and Norton on interactions in logit/probit models? I’d be curious to hear your thoughts. Ai, C.R. and Norton E.C. 2003. Interaction terms in logit and probit models. Economics Letters 80(1): 123-129. A peer ref just cited this paper in reaction to a logistic model we tested and claimed that the “only” way to test an interaction in logit/probit regression is to use the cross derivative method of Ai & Norton. I’ve never heard of this issue or method. It leaves me wondering what the interaction term actually tests (something Ai & Norton don’t discuss) and why such an important discovery is not more widely known. Is this an issue that is of particular relevance to econometric analysis because they approach interactions from the difference-in-difference perspective? Full disclosure, I’m coming from a social science/epi background. Thus, I’m not interested in the d-in-d estimator; I want to know if any variables modify the rela

5 0.98356605 945 andrew gelman stats-2011-10-06-W’man < W’pedia, again

Introduction: Blogger Deep Climate looks at another paper by the 2002 recipient of the American Statistical Association’s Founders award. This time it’s not funny, it’s just sad. Here’s Wikipedia on simulated annealing: By analogy with this physical process, each step of the SA algorithm replaces the current solution by a random “nearby” solution, chosen with a probability that depends on the difference between the corresponding function values and on a global parameter T (called the temperature), that is gradually decreased during the process. The dependency is such that the current solution changes almost randomly when T is large, but increasingly “downhill” as T goes to zero. The allowance for “uphill” moves saves the method from becoming stuck at local minima—which are the bane of greedier methods. And here’s Wegman: During each step of the algorithm, the variable that will eventually represent the minimum is replaced by a random solution that is chosen according to a temperature

6 0.98154783 1081 andrew gelman stats-2011-12-24-Statistical ethics violation

7 0.98087394 329 andrew gelman stats-2010-10-08-More on those dudes who will pay your professor $8000 to assign a book to your class, and related stories about small-time sleazoids

8 0.98034501 1624 andrew gelman stats-2012-12-15-New prize on causality in statstistics education

9 0.97746956 1833 andrew gelman stats-2013-04-30-“Tragedy of the science-communication commons”

10 0.97249079 1393 andrew gelman stats-2012-06-26-The reverse-journal-submission system

same-blog 11 0.97216946 981 andrew gelman stats-2011-10-30-rms2

12 0.97171378 274 andrew gelman stats-2010-09-14-Battle of the Americans: Writer at the American Enterprise Institute disparages the American Political Science Association

13 0.97112137 1779 andrew gelman stats-2013-03-27-“Two Dogmas of Strong Objective Bayesianism”

14 0.96976185 834 andrew gelman stats-2011-08-01-I owe it all to the haters

15 0.96908516 576 andrew gelman stats-2011-02-15-With a bit of precognition, you’d have known I was going to post again on this topic, and with a lot of precognition, you’d have known I was going to post today

16 0.96687877 1998 andrew gelman stats-2013-08-25-A new Bem theory

17 0.96563917 803 andrew gelman stats-2011-07-14-Subtleties with measurement-error models for the evaluation of wacky claims

18 0.9561547 1683 andrew gelman stats-2013-01-19-“Confirmation, on the other hand, is not sexy”

19 0.95576125 1395 andrew gelman stats-2012-06-27-Cross-validation (What is it good for?)

20 0.952259 1888 andrew gelman stats-2013-06-08-New Judea Pearl journal of causal inference