andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-530 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: I received the following email: Did you know that it looks like Microsoft is entering the modeling game? I mean, outside of Excel. I recently received an email at work from a MS research contractor looking for ppl that program in R, SAS, Matlab, Excel, and Mathematica. . . . So far I [the person who sent me this email] haven’t seen anything about applying any actual models. Only stuff about assigning variables, deleting rows, merging tables, etc. I don’t know how common knowledge this all is within the statistical community. I did a quick google search for the name of the programming language and didn’t come up with anything. That sounds cool. Working with anything from Microsoft sounds pretty horrible, but it would be useful to have another modeling language out there, just for checking our answers if nothing else.
sentIndex sentText sentNum sentScore
1 I received the following email: Did you know that it looks like Microsoft is entering the modeling game? [sent-1, score-0.695]
2 I recently received an email at work from a MS research contractor looking for ppl that program in R, SAS, Matlab, Excel, and Mathematica. [sent-3, score-0.726]
3 So far I [the person who sent me this email] haven’t seen anything about applying any actual models. [sent-7, score-0.68]
4 Only stuff about assigning variables, deleting rows, merging tables, etc. [sent-8, score-0.707]
5 I don’t know how common knowledge this all is within the statistical community. [sent-9, score-0.342]
6 I did a quick google search for the name of the programming language and didn’t come up with anything. [sent-10, score-0.808]
7 Working with anything from Microsoft sounds pretty horrible, but it would be useful to have another modeling language out there, just for checking our answers if nothing else. [sent-12, score-1.092]
wordName wordTfidf (topN-words)
[('microsoft', 0.299), ('email', 0.284), ('merging', 0.231), ('ms', 0.231), ('language', 0.219), ('deleting', 0.217), ('sounds', 0.206), ('received', 0.195), ('matlab', 0.19), ('entering', 0.182), ('sas', 0.175), ('rows', 0.172), ('assigning', 0.17), ('modeling', 0.16), ('excel', 0.157), ('anything', 0.131), ('applying', 0.131), ('programming', 0.128), ('tables', 0.127), ('horrible', 0.123), ('answers', 0.122), ('game', 0.115), ('outside', 0.108), ('checking', 0.107), ('google', 0.104), ('search', 0.103), ('program', 0.1), ('knowledge', 0.097), ('quick', 0.094), ('name', 0.093), ('actual', 0.092), ('haven', 0.092), ('common', 0.09), ('sent', 0.09), ('stuff', 0.089), ('looks', 0.084), ('variables', 0.084), ('else', 0.084), ('seen', 0.081), ('within', 0.081), ('person', 0.081), ('far', 0.074), ('looking', 0.074), ('know', 0.074), ('nothing', 0.074), ('useful', 0.073), ('recently', 0.073), ('working', 0.069), ('come', 0.067), ('mean', 0.066)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999988 530 andrew gelman stats-2011-01-22-MS-Bayes?
2 0.21425429 503 andrew gelman stats-2011-01-04-Clarity on my email policy
Introduction: I never read email before 4. That doesn’t mean I never send email before 4.
3 0.1960113 2054 andrew gelman stats-2013-10-07-Bing is preferred to Google by people who aren’t like me
Introduction: This one is fun because I have a double conflict of interest: I’ve been paid (at different times) both by Google and by Microsoft. Here’s the story: Microsoft, September 2012: An independent research company, Answers Research based in San Diego, CA, conducted a study using a representative online sample of nearly 1000 people, ages 18 and older from across the US. The participants were chosen from a random survey panel and were required to have used a major search engine in the past month. Participants were not aware that Microsoft was involved. In the test, participants were shown the main web search results pane of both Bing and Google for 10 search queries of their choice. Bing and Google search results were shown side-by-side on one page for easy comparison – with all branding removed from both search engines. The test did not include ads or content in other parts of the page such as Bing’s Snapshot and Social Search panes and Google’s Knowledge Graph. For each search,
4 0.15390791 27 andrew gelman stats-2010-05-11-Update on the spam email study
Introduction: A few days ago I reported on the spam email that I received from two business school professors (one at Columbia)! As noted on the blog, I sent an email directly to the study’s authors at the time of reading the email, but they have yet to respond. This surprises me a bit. Certainly if 6300 faculty each have time to respond to one email on this study, the two faculty have time to respond to 6300 email replies, no? I was actually polite enough to respond to both of their emails! If I do hear back, I’ll let youall know! P.S. Paul Basken interviewed me briefly for a story in the Chronicle of Higher Education on the now-notorious spam email study. Basken’s article is reasonable–he points out that (a) the study irritated a lot of people, but (b) is ultimately no big deal. One interesting thing about the article is that, although some people felt that the spam email study was ethical, nobody came forth with an argument that the study was actually worth doing. P.P.S. In
5 0.14428128 18 andrew gelman stats-2010-05-06-$63,000 worth of abusive research . . . or just a really stupid waste of time?
Introduction: As someone who relies strongly on survey research, it’s good for me to be reminded that some surveys are useful, some are useless, but one thing they almost all have in common is . . . they waste the respondents’ time. I thought of this after receiving the following email, which I shall reproduce here. My own comments appear after. Recently, you received an email from a student asking for 10 minutes of your time to discuss your Ph.D. program (the body of the email appears below). We are emailing you today to debrief you on the actual purpose of that email, as it was part of a research study. We sincerely hope our study did not cause you any disruption and we apologize if you were at all inconvenienced. Our hope is that this letter will provide a sufficient explanation of the purpose and design of our study to alleviate any concerns you may have about your involvement. We want to thank you for your time and for reading further if you are interested in understanding why you rece
6 0.1321367 1661 andrew gelman stats-2013-01-08-Software is as software does
7 0.12863277 1808 andrew gelman stats-2013-04-17-Excel-bashing
8 0.12561643 259 andrew gelman stats-2010-09-06-Inbox zero. Really.
9 0.12268947 1219 andrew gelman stats-2012-03-18-Tips on “great design” from . . . Microsoft!
10 0.10948779 605 andrew gelman stats-2011-03-09-Does it feel like cheating when I do this? Variation in ethical standards and expectations
11 0.1050507 2111 andrew gelman stats-2013-11-23-Tables > figures yet again
13 0.1023755 520 andrew gelman stats-2011-01-17-R Advertised
14 0.098755203 240 andrew gelman stats-2010-08-29-ARM solutions
15 0.098294377 1885 andrew gelman stats-2013-06-06-Leahy Versus Albedoman and the Moneygoround, Part One
16 0.095124811 1841 andrew gelman stats-2013-05-04-The Folk Theorem of Statistical Computing
17 0.093208589 1807 andrew gelman stats-2013-04-17-Data problems, coding errors…what can be done?
18 0.092195764 873 andrew gelman stats-2011-08-26-Luck or knowledge?
20 0.089787722 2079 andrew gelman stats-2013-10-27-Uncompressing the concept of compressed sensing
topicId topicWeight
[(0, 0.129), (1, -0.025), (2, -0.06), (3, 0.013), (4, 0.068), (5, 0.052), (6, 0.006), (7, -0.053), (8, 0.006), (9, 0.016), (10, 0.002), (11, -0.063), (12, 0.069), (13, -0.0), (14, -0.048), (15, 0.065), (16, 0.026), (17, -0.072), (18, 0.017), (19, 0.018), (20, 0.018), (21, -0.009), (22, 0.062), (23, -0.039), (24, 0.001), (25, -0.026), (26, 0.046), (27, 0.008), (28, -0.053), (29, 0.008), (30, 0.051), (31, -0.032), (32, -0.002), (33, 0.025), (34, -0.097), (35, 0.003), (36, 0.04), (37, -0.031), (38, -0.031), (39, -0.011), (40, 0.064), (41, 0.021), (42, 0.001), (43, 0.023), (44, 0.072), (45, -0.016), (46, 0.035), (47, -0.05), (48, 0.048), (49, -0.112)]
simIndex simValue blogId blogTitle
same-blog 1 0.98577446 530 andrew gelman stats-2011-01-22-MS-Bayes?
2 0.69742715 2079 andrew gelman stats-2013-10-27-Uncompressing the concept of compressed sensing
Introduction: I received the following email: These compressed sensing people link to Shannon’s advice. It’s refreshing when leaders of a field state that their stuff may not be a panacea. I replied: Scarily enough, I don’t know anything about this research area at all! My correspondent followed up: Meh. They proved L1 approximates L0 when design matrix is basically full rank. Now all sparsity stuff is sometimes called ‘compressed sensing’. Most of it seems to be linear interpolation, rebranded. I wrote back: But rebranding/reframing can be useful! Often reframing is a step in the direction of improvement, of better understanding one’s assumptions and goals.
Introduction: This is hilarious (link from a completely deadpan Tyler Cowen). I’d call it “unintentionally hilarious” but I’m pretty sure that rms knew this was funny when he was writing it. It’s sort of like when you write a top 10 list: it’s hard to resist getting silly and going over the top. It’s only near the end that we get to the bit about the parrots. All joking aside, the most interesting part of the email was this: I [rms] have to spend 6 to 8 hours *every day* doing my usual work, which is responding to email about the GNU Project and the Free Software Movement. I’d wondered for awhile what is it that Richard Stallman actually does, that is how does he spend his time (aside from giving lectures to promote his ideas and pay the bills). Emailing –> Blogging I too spend a lot of time on email, but a few years ago I consciously tried to shift a bunch of my email exchanges to the blog. I found that I was sending out a lot of information to an audience of one, information
4 0.66936719 1589 andrew gelman stats-2012-11-25-Life as a blogger: the emails just get weirder and weirder
Introduction: In the email the other day, subject line “Casting blogger, writer, journalist to host cable series”: Hi there Andrew, I’m casting a male journalist, writer, blogger, documentary filmmaker or comedian with a certain type personality for a television pilot along with production company, Pipeline39. See below: A certain type of character – no cockiness, no ego, a person who is smart, savvy, dry humor, but someone who isn’t imposing, who can infiltrate these organizations. This person will be hosting his own show and covering alternative lifestyles and secret societies around the world. If you’re interested in hearing more or would like to be considered for this project, please email me a photo and a bio of yourself, along with contact information. I’ll respond to you ASAP. I’m looking forward to hearing from you. *** Casting Producer (646) ***.**** ***@gmail.com I was with them until I got to the “no ego” part. . . . Also, I don’t think I could infiltrate any org
5 0.65923387 27 andrew gelman stats-2010-05-11-Update on the spam email study
Introduction: A few days ago I reported on the spam email that I received from two business school professors (one at Columbia)! As noted on the blog, I sent an email directly to the study’s authors at the time of reading the email, but they have yet to respond. This surprises me a bit. Certainly if 6300 faculty each have time to respond to one email on this study, the two faculty have time to respond to 6300 email replies, no? I was actually polite enough to respond to both of their emails! If I do hear back, I’ll let youall know! P.S. Paul Basken interviewed me briefly for a story in the Chronicle of Higher Education on the now-notorious spam email study. Basken’s article is reasonable–he points out that (a) the study irritated a lot of people, but (b) is ultimately no big deal. One interesting thing about the article is that, although some people felt that the spam email study was ethical, nobody came forth with an argument that the study was actually worth doing. P.P.S. In
6 0.65625256 880 andrew gelman stats-2011-08-30-Annals of spam
7 0.65421343 1618 andrew gelman stats-2012-12-11-The consulting biz
8 0.65240419 2148 andrew gelman stats-2013-12-25-Spam!
9 0.64257932 332 andrew gelman stats-2010-10-10-Proposed new section of the American Statistical Association on Imaging Sciences
10 0.64249933 2118 andrew gelman stats-2013-11-30-???
11 0.64134413 1434 andrew gelman stats-2012-07-29-FindTheData.org
12 0.63981295 1698 andrew gelman stats-2013-01-30-The spam just gets weirder and weirder
13 0.63775313 1922 andrew gelman stats-2013-07-02-They want me to send them free material and pay for the privilege
14 0.63581145 503 andrew gelman stats-2011-01-04-Clarity on my email policy
15 0.63549602 1421 andrew gelman stats-2012-07-19-Alexa, Maricel, and Marty: Three cellular automata who got on my nerves
16 0.63346231 605 andrew gelman stats-2011-03-09-Does it feel like cheating when I do this? Variation in ethical standards and expectations
17 0.62986964 545 andrew gelman stats-2011-01-30-New innovations in spam
18 0.62663651 343 andrew gelman stats-2010-10-15-?
19 0.62023181 2111 andrew gelman stats-2013-11-23-Tables > figures yet again
20 0.61913967 18 andrew gelman stats-2010-05-06-$63,000 worth of abusive research . . . or just a really stupid waste of time?
topicId topicWeight
[(16, 0.045), (21, 0.024), (24, 0.152), (27, 0.025), (75, 0.022), (76, 0.023), (77, 0.057), (86, 0.124), (90, 0.128), (99, 0.285)]
simIndex simValue blogId blogTitle
same-blog 1 0.95218432 530 andrew gelman stats-2011-01-22-MS-Bayes?
Introduction: Solomon Hsiang writes: I [Hsiang] have posted about high temperature inducing individuals to exhibit more violent behavior when driving, playing baseball and prowling bars. These cases are neat anecdotes that let us see the “pure aggression” response in lab-like conditions. But they don’t affect most of us too much. But violent crime in the real world affects everyone. Earlier, I posted a paper by Jacob et al. that looked at assault in the USA for about a decade – they found that higher temperatures lead to more assault and that the rise in violent crimes rose more quickly than the analogous rise in non-violent property-crime, an indicator that there is a “pure aggression” component to the rise in violent crime. A new working paper “Crime, Weather, and Climate Change” by recent Harvard grad Matthew Ranson puts together an impressive data set of all types of crime in USA counties for 50 years. The results tell the aggression story using street-level data very clearly [click to
3 0.93042666 1417 andrew gelman stats-2012-07-15-Some decision analysis problems are pretty easy, no?
Introduction: Cassie Murdoch reports: A 47-year-old woman in Uxbridge, Massachusetts, got behind the wheel of her car after having a bit too much to drink, but instead of wreaking havoc on the road, she ended up lodged in a sand trap at a local golf course. Why? Because her GPS made her do it, obviously! She said the GPS told her to turn left, and she did, right into a cornfield. That didn’t faze her, and she just kept on going until she ended up on the golf course and got stuck in the sand. There were people on the course at the time, but thankfully nobody was injured. Police found a cup full of alcohol in her car and arrested her for driving drunk. Here’s the punchline: This is the fourth time she’s been arrested for a DUI. Assuming this story is accurate, I guess they don’t have one of those “three strikes” laws in Massachusetts? Personally, I’m a lot more afraid of a dangerous driver than of some drug dealer. I’d think a simple cost-benefit calculation would recommend taking away
4 0.9238466 1411 andrew gelman stats-2012-07-10-Defining ourselves arbitrarily
Introduction: Robin Hanson writes that he doesn’t use slang: I [Hanson] am not into slang. I want to talk to the widest possible audience, and to focus on timeless issues and insights, as opposed to the latest fashionable topics. I can see why people want to signal loyalty to their groups, especially in the military, but I have little confidence that this is good for the world as a whole. I don’t know anything about the military (I don’t think this really counts) so I can’t comment on that part, and I don’t see the opposition between slang and “timeless issues and insights, as opposed to the latest fashionable topics” (after all, Mark Twain used slang and he had some timeless insights), but I’d like to pick up on a slightly different angle here, which is the set of quasi-arbitrary choices we make in order to define ourselves. Robin Hanson happens not to use much slang and he uses this trait to define himself, not quite to stand out in the crowd but to put himself on one end of a scale. I
5 0.91961026 1655 andrew gelman stats-2013-01-05-The statistics software signal
Introduction: Tyler Cowen links to a post by Sean Taylor, who writes the following about users of R: You are willing to invest in learning something difficult. You do not care about aesthetics, only availability of packages and getting results quickly. To me, R is easy and Sas is difficult. I once worked with some students who were running Sas and the output was unreadable! Pages and pages of numbers that made no sense. When it comes to ease or difficulty of use, I think it depends on what you’re used to! And I really don’t understand the bit about aesthetics. What about this? One reason I use R is to make pretty graphs. That said, if I’d never learned R, I’d just be making pretty graphs in Fortran or whatever. My guess is, the way I program, R is actually hindering rather than helping my ability to make attractive graphs. Half the time I’m scrambling around, writing custom code to get around R’s defaults.
7 0.91798198 1971 andrew gelman stats-2013-08-07-I doubt they cheated
9 0.91686243 15 andrew gelman stats-2010-05-03-Public Opinion on Health Care Reform
10 0.91641212 305 andrew gelman stats-2010-09-29-Decision science vs. social psychology
11 0.91640508 475 andrew gelman stats-2010-12-19-All politics are local — not
12 0.9158442 1983 andrew gelman stats-2013-08-15-More on AIC, WAIC, etc
13 0.91569471 866 andrew gelman stats-2011-08-23-Participate in a research project on combining information for prediction
14 0.91552782 276 andrew gelman stats-2010-09-14-Don’t look at just one poll number–unless you really know what you’re doing!
15 0.91495872 759 andrew gelman stats-2011-06-11-“2 level logit with 2 REs & large sample. computational nightmare – please help”
16 0.91440237 1278 andrew gelman stats-2012-04-23-“Any old map will do” meets “God is in every leaf of every tree”
17 0.91144198 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis
18 0.91071784 2082 andrew gelman stats-2013-10-30-Berri Gladwell Loken football update
19 0.90984964 1947 andrew gelman stats-2013-07-20-We are what we are studying
20 0.90669751 2058 andrew gelman stats-2013-10-11-Gladwell and Chabris, David and Goliath, and science writing as stone soup