andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-530 knowledge-graph by maker-knowledge-mining

530 andrew gelman stats-2011-01-22-MS-Bayes?


meta information for this blog post

Source: html

Introduction: I received the following email: Did you know that it looks like Microsoft is entering the modeling game? I mean, outside of Excel. I recently received an email at work from a MS research contractor looking for ppl that program in R, SAS, Matlab, Excel, and Mathematica. . . . So far I [the person who sent me this email] haven’t seen anything about applying any actual models. Only stuff about assigning variables, deleting rows, merging tables, etc. I don’t know how common knowledge this all is within the statistical community. I did a quick google search for the name of the programming language and didn’t come up with anything. That sounds cool. Working with anything from Microsoft sounds pretty horrible, but it would be useful to have another modeling language out there, just for checking our answers if nothing else.


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 I received the following email: Did you know that it looks like Microsoft is entering the modeling game? [sent-1, score-0.695]

2 I recently received an email at work from a MS research contractor looking for ppl that program in R, SAS, Matlab, Excel, and Mathematica. [sent-3, score-0.726]

3 So far I [the person who sent me this email] haven’t seen anything about applying any actual models. [sent-7, score-0.68]

4 Only stuff about assigning variables, deleting rows, merging tables, etc. [sent-8, score-0.707]

5 I don’t know how common knowledge this all is within the statistical community. [sent-9, score-0.342]

6 I did a quick google search for the name of the programming language and didn’t come up with anything. [sent-10, score-0.808]

7 Working with anything from Microsoft sounds pretty horrible, but it would be useful to have another modeling language out there, just for checking our answers if nothing else. [sent-12, score-1.092]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('microsoft', 0.299), ('email', 0.284), ('merging', 0.231), ('ms', 0.231), ('language', 0.219), ('deleting', 0.217), ('sounds', 0.206), ('received', 0.195), ('matlab', 0.19), ('entering', 0.182), ('sas', 0.175), ('rows', 0.172), ('assigning', 0.17), ('modeling', 0.16), ('excel', 0.157), ('anything', 0.131), ('applying', 0.131), ('programming', 0.128), ('tables', 0.127), ('horrible', 0.123), ('answers', 0.122), ('game', 0.115), ('outside', 0.108), ('checking', 0.107), ('google', 0.104), ('search', 0.103), ('program', 0.1), ('knowledge', 0.097), ('quick', 0.094), ('name', 0.093), ('actual', 0.092), ('haven', 0.092), ('common', 0.09), ('sent', 0.09), ('stuff', 0.089), ('looks', 0.084), ('variables', 0.084), ('else', 0.084), ('seen', 0.081), ('within', 0.081), ('person', 0.081), ('far', 0.074), ('looking', 0.074), ('know', 0.074), ('nothing', 0.074), ('useful', 0.073), ('recently', 0.073), ('working', 0.069), ('come', 0.067), ('mean', 0.066)]
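
The page does not say how the simValue column in the lists below is computed. A common choice for comparing tfidf vectors is cosine similarity, so the following is only a minimal sketch under that assumption. The function name `cosine_similarity` and the weights in `other_post` are hypothetical and chosen for illustration; only the four weights in `this_post` are copied from the list above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {word: weight} dicts."""
    # Only words present in both vectors contribute to the dot product.
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# A few weights copied from the tfidf list above for this post.
this_post = {"microsoft": 0.299, "email": 0.284, "language": 0.219, "excel": 0.157}

# Hypothetical weights for another post, invented for illustration only.
other_post = {"email": 0.5, "policy": 0.4, "read": 0.3}

sim_self = cosine_similarity(this_post, this_post)
sim_other = cosine_similarity(this_post, other_post)
```

Under this assumption a post compared with itself scores 1 up to rounding, which would be consistent with the same-blog value of 0.99999988 in the tfidf list if the scores were stored in single precision. That reading is a guess, not something the page confirms.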

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 530 andrew gelman stats-2011-01-22-MS-Bayes?

2 0.21425429 503 andrew gelman stats-2011-01-04-Clarity on my email policy

Introduction: I never read email before 4. That doesn’t mean I never send email before 4.

3 0.1960113 2054 andrew gelman stats-2013-10-07-Bing is preferred to Google by people who aren’t like me

Introduction: This one is fun because I have a double conflict of interest: I’ve been paid (at different times) both by Google and by Microsoft. Here’s the story: Microsoft, September 2012 : An independent research company, Answers Research based in San Diego, CA, conducted a study using a representative online sample of nearly 1000 people, ages 18 and older from across the US. The participants were chosen from a random survey panel and were required to have used a major search engine in the past month. Participants were not aware that Microsoft was involved. In the test, participants were shown the main web search results pane of both Bing and Google for 10 search queries of their choice. Bing and Google search results were shown side-by-side on one page for easy comparison – with all branding removed from both search engines. The test did not include ads or content in other parts of the page such as Bing’s Snapshot and Social Search panes and Google’s Knowledge Graph. For each search,

4 0.15390791 27 andrew gelman stats-2010-05-11-Update on the spam email study

Introduction: A few days ago I reported on the spam email that I received from two business school professors (one at Columbia)! As noted on the blog, I sent an email directly to the study’s authors at the time of reading the email, but they have yet to respond. This surprises me a bit. Certainly if 6300 faculty each have time to respond to one email on this study, the two faculty have time to respond to 6300 email replies, no? I was actually polite enough to respond to both of their emails! If I do hear back, I’ll let youall know! P.S. Paul Basken interviewed me briefly for a story in the Chronicle of Higher Education on the now-notorious spam email study. Basken’s article is reasonable–he points out that (a) the study irritated a lot of people, but (b) is ultimately no big deal. One interesting thing about the article is that, although some people felt that the spam email study was ethical, nobody came forth with an argument that the study was actually worth doing. P.P.S. In

5 0.14428128 18 andrew gelman stats-2010-05-06-$63,000 worth of abusive research . . . or just a really stupid waste of time?

Introduction: As someone who relies strongly on survey research, it’s good for me to be reminded that some surveys are useful, some are useless, but one thing they almost all have in common is . . . they waste the respondents’ time. I thought of this after receiving the following email, which I shall reproduce here. My own comments appear after. Recently, you received an email from a student asking for 10 minutes of your time to discuss your Ph.D. program (the body of the email appears below). We are emailing you today to debrief you on the actual purpose of that email, as it was part of a research study. We sincerely hope our study did not cause you any disruption and we apologize if you were at all inconvenienced. Our hope is that this letter will provide a sufficient explanation of the purpose and design of our study to alleviate any concerns you may have about your involvement. We want to thank you for your time and for reading further if you are interested in understanding why you rece

6 0.1321367 1661 andrew gelman stats-2013-01-08-Software is as software does

7 0.12863277 1808 andrew gelman stats-2013-04-17-Excel-bashing

8 0.12561643 259 andrew gelman stats-2010-09-06-Inbox zero. Really.

9 0.12268947 1219 andrew gelman stats-2012-03-18-Tips on “great design” from . . . Microsoft!

10 0.10948779 605 andrew gelman stats-2011-03-09-Does it feel like cheating when I do this? Variation in ethical standards and expectations

11 0.1050507 2111 andrew gelman stats-2013-11-23-Tables > figures yet again

12 0.10306149 1481 andrew gelman stats-2012-09-04-Cool one-day miniconference at Columbia Fri 12 Oct on computational and online social science

13 0.1023755 520 andrew gelman stats-2011-01-17-R Advertised

14 0.098755203 240 andrew gelman stats-2010-08-29-ARM solutions

15 0.098294377 1885 andrew gelman stats-2013-06-06-Leahy Versus Albedoman and the Moneygoround, Part One

16 0.095124811 1841 andrew gelman stats-2013-05-04-The Folk Theorem of Statistical Computing

17 0.093208589 1807 andrew gelman stats-2013-04-17-Data problems, coding errors…what can be done?

18 0.092195764 873 andrew gelman stats-2011-08-26-Luck or knowledge?

19 0.09089075 980 andrew gelman stats-2011-10-29-When people meet this guy, can they resist the temptation to ask him what he’s doing for breakfast??

20 0.089787722 2079 andrew gelman stats-2013-10-27-Uncompressing the concept of compressed sensing


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.129), (1, -0.025), (2, -0.06), (3, 0.013), (4, 0.068), (5, 0.052), (6, 0.006), (7, -0.053), (8, 0.006), (9, 0.016), (10, 0.002), (11, -0.063), (12, 0.069), (13, -0.0), (14, -0.048), (15, 0.065), (16, 0.026), (17, -0.072), (18, 0.017), (19, 0.018), (20, 0.018), (21, -0.009), (22, 0.062), (23, -0.039), (24, 0.001), (25, -0.026), (26, 0.046), (27, 0.008), (28, -0.053), (29, 0.008), (30, 0.051), (31, -0.032), (32, -0.002), (33, 0.025), (34, -0.097), (35, 0.003), (36, 0.04), (37, -0.031), (38, -0.031), (39, -0.011), (40, 0.064), (41, 0.021), (42, 0.001), (43, 0.023), (44, 0.072), (45, -0.016), (46, 0.035), (47, -0.05), (48, 0.048), (49, -0.112)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98577446 530 andrew gelman stats-2011-01-22-MS-Bayes?

2 0.69742715 2079 andrew gelman stats-2013-10-27-Uncompressing the concept of compressed sensing

Introduction: I received the following email: These compressed sensing people link to Shannon’s advice . It’s refreshing when leaders of a field state that their stuff may not be a panacea. I replied: Scarily enough, I don’t know anything about this research area at all! My correspondent followed up: Meh. They proved L1 approximates L0 when design matrix is basically full rank. Now all sparsity stuff is sometimes called ‘compressed sensing’. Most of it seems to be linear interpolation, rebranded. I wrote back: But rebranding/reframing can be useful! Often reframing is a step in the direction of improvement, of better understanding one’s assumptions and goals.

3 0.67992759 980 andrew gelman stats-2011-10-29-When people meet this guy, can they resist the temptation to ask him what he’s doing for breakfast??

Introduction: This is hilarious ( link from a completely deadpan Tyler Cowen). I’d call it “unintentionally hilarious” but I’m pretty sure that rms knew this was funny when he was writing it. It’s sort of like when you write a top 10 list—it’s hard to resist getting silly and going over the top. It’s only near the end that we get to the bit about the parrots. All joking aside, the most interesting part of the email was this: I [rms] have to spend 6 to 8 hours *every day* doing my usual work, which is responding to email about the GNU Project and the Free Software Movement. I’d wondered for awhile what is it that Richard Stallman actually does, that is how does he spend his time (aside from giving lectures to promote his ideas and pay the bills). Emailing –> Blogging I too spend a lot of time on email, but a few years ago I consciously tried to shift a bunch of my email exchanges to the blog. I found that I was sending out a lot of information to an audience of one, information

4 0.66936719 1589 andrew gelman stats-2012-11-25-Life as a blogger: the emails just get weirder and weirder

Introduction: In the email the other day, subject line “Casting blogger, writer, journalist to host cable series”: Hi there Andrew, I’m casting a male journalist, writer, blogger, documentary filmmaker or comedian with a certain type personality for a television pilot along with production company, Pipeline39. See below: A certain type of character – no cockiness, no ego, a person who is smart, savvy, dry humor, but someone who isn’t imposing, who can infiltrate these organizations. This person will be hosting his own show and covering alternative lifestyles and secret societies around the world. If you’re interested in hearing more or would like to be considered for this project, please email me a photo and a bio of yourself, along with contact information. I’ll respond to you ASAP. I’m looking forward to hearing from you. *** Casting Producer (646) ***.**** ***@gmail.com I was with them until I got to the “no ego” part. . . . Also, I don’t think I could infiltrate any org

5 0.65923387 27 andrew gelman stats-2010-05-11-Update on the spam email study

Introduction: A few days ago I reported on the spam email that I received from two business school professors (one at Columbia)! As noted on the blog, I sent an email directly to the study’s authors at the time of reading the email, but they have yet to respond. This surprises me a bit. Certainly if 6300 faculty each have time to respond to one email on this study, the two faculty have time to respond to 6300 email replies, no? I was actually polite enough to respond to both of their emails! If I do hear back, I’ll let youall know! P.S. Paul Basken interviewed me briefly for a story in the Chronicle of Higher Education on the now-notorious spam email study. Basken’s article is reasonable–he points out that (a) the study irritated a lot of people, but (b) is ultimately no big deal. One interesting thing about the article is that, although some people felt that the spam email study was ethical, nobody came forth with an argument that the study was actually worth doing. P.P.S. In

6 0.65625256 880 andrew gelman stats-2011-08-30-Annals of spam

7 0.65421343 1618 andrew gelman stats-2012-12-11-The consulting biz

8 0.65240419 2148 andrew gelman stats-2013-12-25-Spam!

9 0.64257932 332 andrew gelman stats-2010-10-10-Proposed new section of the American Statistical Association on Imaging Sciences

10 0.64249933 2118 andrew gelman stats-2013-11-30-???

11 0.64134413 1434 andrew gelman stats-2012-07-29-FindTheData.org

12 0.63981295 1698 andrew gelman stats-2013-01-30-The spam just gets weirder and weirder

13 0.63775313 1922 andrew gelman stats-2013-07-02-They want me to send them free material and pay for the privilege

14 0.63581145 503 andrew gelman stats-2011-01-04-Clarity on my email policy

15 0.63549602 1421 andrew gelman stats-2012-07-19-Alexa, Maricel, and Marty: Three cellular automata who got on my nerves

16 0.63346231 605 andrew gelman stats-2011-03-09-Does it feel like cheating when I do this? Variation in ethical standards and expectations

17 0.62986964 545 andrew gelman stats-2011-01-30-New innovations in spam

18 0.62663651 343 andrew gelman stats-2010-10-15-?

19 0.62023181 2111 andrew gelman stats-2013-11-23-Tables > figures yet again

20 0.61913967 18 andrew gelman stats-2010-05-06-$63,000 worth of abusive research . . . or just a really stupid waste of time?


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(16, 0.045), (21, 0.024), (24, 0.152), (27, 0.025), (75, 0.022), (76, 0.023), (77, 0.057), (86, 0.124), (90, 0.128), (99, 0.285)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95218432 530 andrew gelman stats-2011-01-22-MS-Bayes?

2 0.93496084 1522 andrew gelman stats-2012-10-05-High temperatures cause violent crime and implications for climate change, also some suggestions about how to better summarize these claims

Introduction: Solomon Hsiang writes : I [Hsiang] have posted about high temperature inducing individuals to exhibit more violent behavior when driving, playing baseball and prowling bars. These cases are neat anecdotes that let us see the “pure aggression” response in lab-like conditions. But they don’t affect most of us too much. But violent crime in the real world affects everyone. Earlier, I posted a paper by Jacob et al. that looked at assault in the USA for about a decade – they found that higher temperatures lead to more assault and that the rise in violent crimes rose more quickly than the analogous rise in non-violent property-crime, an indicator that there is a “pure aggression” component to the rise in violent crime. A new working paper “Crime, Weather, and Climate Change” by recent Harvard grad Matthew Ranson puts together an impressive data set of all types of crime in USA counties for 50 years. The results tell the aggression story using street-level data very clearly [click to

3 0.93042666 1417 andrew gelman stats-2012-07-15-Some decision analysis problems are pretty easy, no?

Introduction: Cassie Murdoch reports : A 47-year-old woman in Uxbridge, Massachusetts, got behind the wheel of her car after having a bit too much to drink, but instead of wreaking havoc on the road, she ended up lodged in a sand trap at a local golf course. Why? Because her GPS made her do it—obviously! She said the GPS told her to turn left, and she did, right into a cornfield. That didn’t faze her, and she just kept on going until she ended up on the golf course and got stuck in the sand. There were people on the course at the time, but thankfully nobody was injured. Police found a cup full of alcohol in her car and arrested her for driving drunk. Here’s the punchline: This is the fourth time she’s been arrested for a DUI. Assuming this story is accurate, I guess they don’t have one of those “three strikes” laws in Massachusetts? Personally, I’m a lot more afraid of a dangerous driver than of some drug dealer. I’d think a simple cost-benefit calculation would recommend taking away

4 0.9238466 1411 andrew gelman stats-2012-07-10-Defining ourselves arbitrarily

Introduction: Robin Hanson writes that he doesn’t use slang: I [Hanson] am not into slang. I want to talk to the widest possible audience, and to focus on timeless issues and insights, as opposed to the latest fashionable topics. I can see why people want to signal loyalty to their groups, especially in the military, but I have little confidence that this is good for the world as a whole. I don’t know anything about the military (I don’t think this really counts) so I can’t comment on that part, and I don’t see the opposition between slang and “timeless issues and insights, as opposed to the latest fashionable topics” (after all, Mark Twain used slang and he had some timeless insights), but I’d like to pick up on a slightly different angle here, which is the set of quasi-arbitrary choices we make in order to define ourselves. Robin Hanson happens not to use much slang and he uses this trait to define himself, not quite to stand out in the crowd but to put himself on one end of a scale. I

5 0.91961026 1655 andrew gelman stats-2013-01-05-The statistics software signal

Introduction: Tyler Cowen links to a post by Sean Taylor, who writes the following about users of R: You are willing to invest in learning something difficult. You do not care about aesthetics, only availability of packages and getting results quickly. To me, R is easy and Sas is difficult. I once worked with some students who were running Sas and the output was unreadable! Pages and pages of numbers that made no sense. When it comes to ease or difficulty of use, I think it depends on what you’re used to! And I really don’t understand the bit about aesthetics. What about this ? One reason I use R is to make pretty graphs. That said, if I’d never learned R, I’d just be making pretty graphs in Fortran or whatever. My guess is, the way I program, R is actually hindering rather than helping my ability to make attractive graphs. Half the time I’m scrambling around, writing custom code to get around R’s defaults.

6 0.91950494 1327 andrew gelman stats-2012-05-18-Comments on “A Bayesian approach to complex clinical diagnoses: a case-study in child abuse”

7 0.91798198 1971 andrew gelman stats-2013-08-07-I doubt they cheated

8 0.91770434 1552 andrew gelman stats-2012-10-29-“Communication is a central task of statistics, and ideally a state-of-the-art data analysis can have state-of-the-art displays to match”

9 0.91686243 15 andrew gelman stats-2010-05-03-Public Opinion on Health Care Reform

10 0.91641212 305 andrew gelman stats-2010-09-29-Decision science vs. social psychology

11 0.91640508 475 andrew gelman stats-2010-12-19-All politics are local — not

12 0.9158442 1983 andrew gelman stats-2013-08-15-More on AIC, WAIC, etc

13 0.91569471 866 andrew gelman stats-2011-08-23-Participate in a research project on combining information for prediction

14 0.91552782 276 andrew gelman stats-2010-09-14-Don’t look at just one poll number–unless you really know what you’re doing!

15 0.91495872 759 andrew gelman stats-2011-06-11-“2 level logit with 2 REs & large sample. computational nightmare – please help”

16 0.91440237 1278 andrew gelman stats-2012-04-23-“Any old map will do” meets “God is in every leaf of every tree”

17 0.91144198 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis

18 0.91071784 2082 andrew gelman stats-2013-10-30-Berri Gladwell Loken football update

19 0.90984964 1947 andrew gelman stats-2013-07-20-We are what we are studying

20 0.90669751 2058 andrew gelman stats-2013-10-11-Gladwell and Chabris, David and Goliath, and science writing as stone soup