andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-1882 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Louis Mittel writes: The premise of the column this guy is starting is interesting: Noah Davis interviews a smart person and then interviews the smartest person that smart person knows and so on. It reminded me of you mentioning the survey design strategy of asking people about other people, like “How many people do you know named Stuart?” or “How many people do you know that have had an abortion?” Ignoring the interview aspect of what this guy is doing, I think there are some cool questions about the distribution/path behavior of smartest-person-I-know chains (say, seeded at random). Do they loop? If so, how long do they run before looping, how large are the loops? What parts of the population do they explore? Do you know of anything that’s been done on something like this? My reply: Interesting question. It could be asked of any referral chain, for example asking a sequence of people, “Who’s the tallest person you know?” or “Who’s the best piano player you know” or “Who’
sentIndex sentText sentNum sentScore
1 Louis Mittel writes: The premise of the column this guy is starting is interesting: Noah Davis interviews a smart person and then interviews the smartest person that smart person knows and so on. [sent-1, score-1.994]
2 It reminded me of you mentioning survey design strategy of asking people about other people, like “How many people do you know named Stuart? [sent-2, score-0.455]
3 ” or “How many people do you know that have had an abortion? [sent-3, score-0.19]
4 ” Ignoring the interview aspect of what this guy is doing, I think there’s some cool questions about the distribution/path behavior of smartest-person-I-know chains (say, seeded at random). [sent-4, score-0.473]
5 Do you know of anything that’s been done on something like this? [sent-8, score-0.102]
6 It could be asked of any referral chain, for example asking a sequence of people, “Who’s the tallest person you know? [sent-10, score-0.67]
7 ” or “Who’s the best piano player you know” or “Who’s the weirdest person you know” or whatever. [sent-11, score-0.425]
8 But let’s stick with the “who’s the smartest” chain. [sent-12, score-0.067]
9 In answer to Louis’s first question: yes, such a chain would have to loop, as there’s only a finite number of people. [sent-13, score-0.19]
10 For example, if you ask Stephen Hawking for the smartest person he knows, and then ask that next person, you’ll probably loop back to . [sent-15, score-0.954]
11 The distribution of lengths of the loops, that I have no idea. [sent-19, score-0.158]
12 I’m trying to think how one could measure the distribution of this sort of referral network. [sent-20, score-0.308]
13 He tried to convince me to invest $10,000 to start an ISP in Cambridge. [sent-24, score-0.232]
14 If I had listened to him, I would have been like Zuckerberg or something. [sent-30, score-0.088]
15 The guy is rich, successful, can do anything he wants. [sent-33, score-0.215]
16 One thing that came up in comments is, can people refer to themselves? [sent-38, score-0.088]
17 I assume not, otherwise all chains would eventually dead-end at Stephen Hawking, Scott Adams, and that albedo guy. [sent-39, score-0.218]
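The looping argument in sentences 9 to 12 can be explored by simulation. What follows is a minimal sketch under assumptions that are not in the original post: each person gets a hidden "smartness" score, knows `k` other people chosen uniformly at random, and names the highest-scoring acquaintance (never themselves). The names `build_referrals` and `follow_chain` are hypothetical. Because every person names exactly one other person and the population is finite, a chain must eventually revisit someone; the code records how many steps it takes to reach the loop and how long the loop is.

```python
import random
from collections import Counter


def build_referrals(n, k, rng):
    """Toy model (an assumption, not from the post): person i knows k random
    others and names the one with the highest hidden score. Requires n > k >= 1."""
    score = [rng.random() for _ in range(n)]
    referral = []
    for i in range(n):
        # Draw k + 1 distinct people, drop i if present, keep k acquaintances.
        acquaintances = [j for j in rng.sample(range(n), k + 1) if j != i][:k]
        referral.append(max(acquaintances, key=score.__getitem__))
    return referral


def follow_chain(referral, start):
    """Follow "who is the smartest person you know?" from `start`.

    Returns (tail_length, loop_length): the number of steps before the chain
    enters its loop, and the number of people in that loop.
    """
    first_seen = {}
    person, step = start, 0
    while person not in first_seen:
        first_seen[person] = step
        person = referral[person]
        step += 1
    tail = first_seen[person]
    return tail, step - tail


rng = random.Random(2013)
referral = build_referrals(n=1000, k=20, rng=rng)

# Seed chains at random and tabulate loop lengths and tail lengths.
results = [follow_chain(referral, rng.randrange(1000)) for _ in range(500)]
loop_lengths = Counter(loop for _, loop in results)
mean_tail = sum(tail for tail, _ in results) / len(results)
print("loop length counts:", dict(sorted(loop_lengths.items())))
print("mean steps before looping:", round(mean_tail, 2))
```

Since self-referrals are excluded, every loop has length at least two. Allowing `referral[i] == i` would turn those loops into dead ends, which is the situation sentence 17 describes. A real referral network would have clustered, non-random acquaintance sets and disagreement about who is smartest, so the counts from this toy model are only a starting point for thinking about what to measure.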
wordName wordTfidf (topN-words)
[('smartest', 0.279), ('loops', 0.263), ('person', 0.261), ('loop', 0.252), ('referral', 0.234), ('guy', 0.215), ('hawking', 0.201), ('stephen', 0.186), ('internet', 0.174), ('louis', 0.159), ('interviews', 0.142), ('chains', 0.14), ('missed', 0.127), ('chain', 0.125), ('smart', 0.125), ('interview', 0.118), ('looping', 0.107), ('isp', 0.107), ('mittel', 0.107), ('know', 0.102), ('asking', 0.101), ('provider', 0.101), ('boat', 0.101), ('zuckerberg', 0.101), ('knows', 0.097), ('piano', 0.096), ('listened', 0.088), ('people', 0.088), ('invest', 0.086), ('premise', 0.086), ('lengths', 0.084), ('abortion', 0.082), ('davis', 0.082), ('ask', 0.081), ('stuart', 0.08), ('noah', 0.08), ('albedo', 0.078), ('adams', 0.078), ('mentioning', 0.076), ('convince', 0.075), ('sequence', 0.074), ('distribution', 0.074), ('start', 0.071), ('onto', 0.071), ('player', 0.068), ('stick', 0.067), ('ignoring', 0.067), ('struck', 0.067), ('growing', 0.065), ('finite', 0.065)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 1882 andrew gelman stats-2013-06-03-The statistical properties of smart chains (and referral chains more generally)
2 0.13731059 1207 andrew gelman stats-2012-03-10-A quick suggestion
Introduction: Next time Stephen Wolfram is on the phone, maybe he could call the head of Human Resources at his company and get this guy fired?
3 0.11297634 1727 andrew gelman stats-2013-02-19-Beef with data
Introduction: Louis Mittel writes: Do you know why David Brooks has such a beef with data? My reply: I have no idea, but I’m happy that we’re now considered the establishment that he has to rebel against!
4 0.1062493 1044 andrew gelman stats-2011-12-06-The K Foundation burns Cosma’s turkey
Introduction: Shalizi delivers a slow, drawn-out illustration of the point that economic efficiency is all about who’s got the $, which isn’t always related to what we would usually call “efficiency” in other settings. (His point is related to my argument that the phrase “willingness to pay” should generally be replaced by “ability to pay.”) The basic story is simple: Good guy needs a turkey, bad guy wants a turkey. Bad guy is willing and able to pay more for the turkey than good guy can afford, hence good guy starves to death. The counterargument is that a market in turkeys will motivate producers to breed more turkeys, ultimately saturating the bad guys’ desires and leaving surplus turkeys for the good guys at a reasonable price. I’m sure there’s a counter-counterargument too, but I don’t want to go there. But what really amused me about Cosma’s essay was how he scrambled the usual cultural/political associations. (I assume he did this on purpose.) In the standard version of t
Introduction: Gayle Laackmann reports (link from Felix Salmon) that Microsoft, Google, etc. don’t actually ask brain-teasers in their job interviews. They actually ask a lot of questions about programming. (I looked here and was relieved to see that the questions aren’t very hard. I could probably get a job as an entry-level programmer if I needed to.) Laackmann writes: Let’s look at the very widely circulated “15 Google Interview Questions that will make you feel stupid” list [here's the original list, I think, from Lewis Lin] . . . these questions are fake. Fake fake fake. How can you tell that they’re fake? Because one of them is “Why are manhole covers round?” This is an infamous Microsoft interview question that has since been so very, very banned at both companies. I find it very hard to believe that a Google interviewer asked such a question. We’ll get back to the manhole question in a bit. Laackmann reports that she never saw any IQ tests in three years of interviewi
6 0.1036315 1007 andrew gelman stats-2011-11-13-At last, treated with the disrespect that I deserve
7 0.099255569 430 andrew gelman stats-2010-11-25-The von Neumann paradox
8 0.093711205 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”
9 0.091955222 2233 andrew gelman stats-2014-03-04-Literal vs. rhetorical
10 0.090092905 2255 andrew gelman stats-2014-03-19-How Americans vote
11 0.086018264 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers
13 0.082541555 1309 andrew gelman stats-2012-05-09-The first version of my “inference from iterative simulation using parallel sequences” paper!
14 0.08128909 108 andrew gelman stats-2010-06-24-Sometimes the raw numbers are better than a percentage
15 0.07892929 664 andrew gelman stats-2011-04-16-Dilbert update: cartooning can give you the strength to open jars with your bare hands
16 0.078816071 719 andrew gelman stats-2011-05-19-Everything is Obvious (once you know the answer)
17 0.077786356 716 andrew gelman stats-2011-05-17-Is the internet causing half the rapes in Norway? I wanna see the scatterplot.
18 0.076263517 831 andrew gelman stats-2011-07-30-A Wikipedia riddle!
19 0.07554099 763 andrew gelman stats-2011-06-13-Inventor of Connect Four dies at 91
topicId topicWeight
[(0, 0.145), (1, -0.056), (2, 0.01), (3, 0.023), (4, 0.021), (5, 0.023), (6, 0.076), (7, 0.0), (8, 0.031), (9, -0.032), (10, -0.018), (11, -0.061), (12, 0.016), (13, 0.011), (14, -0.025), (15, 0.003), (16, 0.012), (17, -0.014), (18, 0.054), (19, 0.026), (20, -0.032), (21, -0.049), (22, 0.031), (23, 0.002), (24, -0.008), (25, 0.016), (26, -0.017), (27, 0.026), (28, -0.007), (29, 0.008), (30, 0.063), (31, 0.011), (32, 0.005), (33, 0.048), (34, -0.04), (35, -0.024), (36, 0.018), (37, 0.031), (38, -0.064), (39, 0.02), (40, 0.005), (41, -0.007), (42, 0.021), (43, 0.027), (44, -0.034), (45, -0.022), (46, 0.019), (47, -0.025), (48, -0.011), (49, -0.0)]
simIndex simValue blogId blogTitle
same-blog 1 0.96687961 1882 andrew gelman stats-2013-06-03-The statistical properties of smart chains (and referral chains more generally)
2 0.80878657 763 andrew gelman stats-2011-06-13-Inventor of Connect Four dies at 91
Introduction: Obit here. I think I have a cousin with the same last name as this guy, so maybe we’re related by marriage in some way. (By that standard we’re also related to Marge Simpson and, I seem to recall, the guy who wrote the scripts for Dark Shadows.)
3 0.80391687 430 andrew gelman stats-2010-11-25-The von Neumann paradox
Introduction: Like Steve Hsu, I too would love to read a definitive biography of John von Neumann (or, as we’d say in the U.S., “John Neumann”). I’ve read little things about him in various places such as Stanislaw Ulam’s classic autobiography, and two things I’ve repeatedly noticed are: 1. Neumann comes off as an obnoxious, self-satisfied jerk. He just seems like the kind of guy I wouldn’t like in real life. 2. All these great men seem to really have loved the guy. It’s hard for me to reconcile the two impressions above. Of course, lots of people have a good side and a bad side, but what’s striking here is that my impressions of Neumann’s bad side come from the very stories that his friends use to demonstrate how lovable he was! So, yes, I’d like to see the biography–but only if it could resolve this paradox. Also, I don’t know how relevant this is, but Neumann shares one thing with the more-lovable Ulam and the less-lovable Mandelbrot: all had Jewish backgrounds but didn’t seem to
4 0.79642338 1935 andrew gelman stats-2013-07-12-“A tangle of unexamined emotional impulses and illogical responses”
Introduction: Tyler Cowen posts the following note from a taxi driver: I learned very early on to never drive someone to their destination if it was a route they drove themselves, say to their home from the airport . . . Everyone prides themselves on driving the shortest route but they rarely do. . . . When I first started driving a cab, I drove the shortest route—always, I’m ethical—but people would accuse me of taking the long way because it wasn’t the way they drove . . . In the end, experts they consider themselves to be, people are a tangle of unexamined emotional impulses and illogical responses. I take a lot of rides to and from the airport, and I can assure you that a lot of taxi drivers don’t know the good routes. Once I had to start screaming from the back seat to stop the guy from getting on the BQE. I don’t “pride myself” on knowing a good route home from the airport, but I prefer the good route. I’m guessing that the taxi driver quoted above is subject to the same illusions
Introduction: Gayle Laackmann reports (link from Felix Salmon) that Microsoft, Google, etc. don’t actually ask brain-teasers in their job interviews. They actually ask a lot of questions about programming. (I looked here and was relieved to see that the questions aren’t very hard. I could probably get a job as an entry-level programmer if I needed to.) Laackmann writes: Let’s look at the very widely circulated “15 Google Interview Questions that will make you feel stupid” list [here's the original list, I think, from Lewis Lin] . . . these questions are fake. Fake fake fake. How can you tell that they’re fake? Because one of them is “Why are manhole covers round?” This is an infamous Microsoft interview question that has since been so very, very banned at both companies. I find it very hard to believe that a Google interviewer asked such a question. We’ll get back to the manhole question in a bit. Laackmann reports that she never saw any IQ tests in three years of interviewi
6 0.781165 1231 andrew gelman stats-2012-03-27-Attention pollution
7 0.76877528 1007 andrew gelman stats-2011-11-13-At last, treated with the disrespect that I deserve
9 0.76742524 693 andrew gelman stats-2011-05-04-Don’t any statisticians work for the IRS?
10 0.76728433 1316 andrew gelman stats-2012-05-12-black and Black, white and White
11 0.76185954 594 andrew gelman stats-2011-02-28-Behavioral economics doesn’t seem to have much to say about marriage
12 0.75737941 1003 andrew gelman stats-2011-11-11-$
13 0.74890929 2300 andrew gelman stats-2014-04-21-Ticket to Baaaath
14 0.74880171 1597 andrew gelman stats-2012-11-29-What is expected of a consultant
15 0.7480725 892 andrew gelman stats-2011-09-06-Info on patent trolls
16 0.74582314 1639 andrew gelman stats-2012-12-26-Impersonators
17 0.74440092 995 andrew gelman stats-2011-11-06-Statistical models and actual models
19 0.74201912 2080 andrew gelman stats-2013-10-28-Writing for free
20 0.73958576 1044 andrew gelman stats-2011-12-06-The K Foundation burns Cosma’s turkey
topicId topicWeight
[(1, 0.019), (5, 0.02), (15, 0.021), (16, 0.072), (21, 0.054), (24, 0.129), (27, 0.053), (31, 0.011), (36, 0.032), (43, 0.132), (44, 0.027), (63, 0.024), (86, 0.012), (99, 0.245)]
simIndex simValue blogId blogTitle
1 0.96426767 314 andrew gelman stats-2010-10-03-Disconnect between drug and medical device approval
Introduction: Sanjay Kaul writes: By statute (“the least burdensome” pathway), the approval standard for devices by the US FDA is lower than for drugs. Before a new drug can be marketed, the sponsor must show “substantial evidence of effectiveness” as based on two or more well-controlled clinical studies (which literally means 2 trials, each with a p value of <0.05, or 1 large trial with a robust p value <0.00125). In contrast, the sponsor of a new device, especially one designated as a high-risk (Class III) device, need only demonstrate "substantial equivalence" to an FDA-approved device via the 510(k) exemption or a "reasonable assurance of safety and effectiveness", evaluated through a pre-market approval and typically based on a single study. What does “reasonable assurance” or “substantial equivalence” imply to you as a Bayesian? These are obviously qualitative constructs, but if one were to quantify them, how would you go about addressing it? The regulatory definitions for
Introduction: Matt Taibbi writes: Glenn Hubbard, Leading Academic and Mitt Romney Advisor, Took $1200 an Hour to Be Countrywide’s Expert Witness . . . Hidden among the reams of material recently filed in connection with the lawsuit of monoline insurer MBIA against Bank of America and Countrywide is a deposition of none other than Columbia University’s Glenn Hubbard. . . . Hubbard testified on behalf of Countrywide in the MBIA suit. He conducted an “analysis” that essentially concluded that Countrywide’s loans weren’t any worse than the loans produced by other mortgage originators, and that therefore the monstrous losses that investors in those loans suffered were due to other factors related to the economic crisis – and not caused by the serial misrepresentations and fraud in Countrywide’s underwriting. That’s interesting, because I worked on the other side of this case! I was hired by MBIA’s lawyers. It wouldn’t be polite of me to reveal my consulting rate, and I never actually got depose
same-blog 3 0.93874234 1882 andrew gelman stats-2013-06-03-The statistical properties of smart chains (and referral chains more generally)
4 0.93394434 1347 andrew gelman stats-2012-05-27-Macromuddle
Introduction: More and more I feel like economics reporting is based on crude principles of adding up “good news” and “bad news.” Sometimes this makes sense: by almost any measure, an unemployment rate of 10% is bad news compared to an unemployment rate of 5%. Other times, though, the good/bad news framework seems so tangled. For example: house prices up is considered good news but inflation is considered bad news. A strong dollar is considered good news but it’s also an unfavorable exchange rate, which is bad news. When facebook shares go down, that’s bad news, but if they automatically go up, that means they were underpriced which doesn’t seem so good either. Pundits are torn between rooting for the euro to fail (which means our team (the U.S.) is better than Europe (their team)) and rooting for it to survive (because a collapse in Europe is bad news for the U.S. economy). China’s economy doing well is bad news—but if their economy slips, that’s bad news too. I think you get the picture
5 0.93112957 538 andrew gelman stats-2011-01-25-Postdoc Position #2: Hierarchical Modeling and Statistical Graphics
Introduction: Andrew Gelman (Columbia University) and Eric Johnson (Columbia University) seek to hire a post-doctoral fellow to work on the application of the latest methods of multilevel data analysis, visualization and regression modeling to an important commercial problem: forecasting retail sales at the individual item level. These forecasts are used to make ordering, pricing and promotions decisions which can have significant economic impact to the retail chain such that even modest improvements in the accuracy of predictions, across a large retailer’s product line, can yield substantial margin improvements. Activities focus on the development of iterative imputation algorithms and diagnostics for missing-data imputation. Activities would include model-development, programming, and data analysis. This project is to be undertaken with, and largely funded by, a firm which provides forecasting technology and services to large retail chains, and which will provide access to a unique and rich
7 0.92118394 2330 andrew gelman stats-2014-05-12-Historical Arc of Universities
8 0.92029762 1253 andrew gelman stats-2012-04-08-Technology speedup graph
10 0.91646618 481 andrew gelman stats-2010-12-22-The Jumpstart financial literacy survey and the different purposes of tests
11 0.91542351 857 andrew gelman stats-2011-08-17-Bayes pays
12 0.91210544 1920 andrew gelman stats-2013-06-30-“Non-statistical” statistics tools
13 0.90944612 1860 andrew gelman stats-2013-05-17-How can statisticians help psychologists do their research better?
14 0.90580845 75 andrew gelman stats-2010-06-08-“Is the cyber mob a threat to freedom?”
15 0.90086353 1754 andrew gelman stats-2013-03-08-Cool GSS training video! And cumulative file 1972-2012!
16 0.90002298 770 andrew gelman stats-2011-06-15-Still more Mr. P in public health
18 0.89459443 70 andrew gelman stats-2010-06-07-Mister P goes on a date
20 0.89237261 2281 andrew gelman stats-2014-04-04-The Notorious N.H.S.T. presents: Mo P-values Mo Problems