andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-2118 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: I received the following unsolicited email, subject line Technology and Engineering Research: Dear Editor We have done research in some of the cutting edge technology and engineering field and would like to if you will be able to write about it in your news section. Our Primarily research focus on building high performance systems that are helping in social networks, web, finding disease, cancer and sports using BIG DATA . Hope to hear from you some time soon. Thanks, ***, PhD Chartered Scientist IBM Corportation ***@us.ibm.com 916 *** **** I thought IBM was a professional operation—don’t they have their own public relations department?
sentIndex sentText sentNum sentScore
1 I received the following unsolicited email, subject line Technology and Engineering Research: Dear Editor We have done research in some of the cutting edge technology and engineering field and would like to if you will be able to write about it in your news section. [sent-1, score-2.098]
2 Our Primarily research focus on building high performance systems that are helping in social networks, web, finding disease, cancer and sports using BIG DATA . [sent-2, score-1.352]
3 com 916 *** **** I thought IBM was a professional operation—don’t they have their own public relations department? [sent-6, score-0.423]
wordName wordTfidf (topN-words)
[('ibm', 0.417), ('engineering', 0.296), ('technology', 0.296), ('operation', 0.187), ('unsolicited', 0.181), ('edge', 0.176), ('primarily', 0.174), ('cutting', 0.172), ('dear', 0.168), ('relations', 0.157), ('helping', 0.152), ('phd', 0.152), ('disease', 0.149), ('research', 0.146), ('networks', 0.143), ('cancer', 0.141), ('sports', 0.14), ('systems', 0.137), ('thanks', 0.135), ('editor', 0.131), ('professional', 0.126), ('building', 0.121), ('web', 0.118), ('department', 0.116), ('hear', 0.112), ('performance', 0.112), ('finding', 0.11), ('scientist', 0.106), ('subject', 0.105), ('received', 0.104), ('email', 0.101), ('focus', 0.098), ('field', 0.097), ('hope', 0.096), ('line', 0.089), ('news', 0.088), ('able', 0.087), ('public', 0.079), ('done', 0.075), ('high', 0.073), ('social', 0.071), ('write', 0.067), ('big', 0.066), ('following', 0.065), ('thought', 0.061), ('using', 0.051), ('time', 0.04), ('data', 0.033), ('would', 0.027), ('like', 0.027)]
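The word list above is printed as a Python-style literal of (word, weight) tuples. Below is a minimal sketch, assuming that literal format holds for the whole list, of how one might load it and pull out the highest-weighted terms. The names `raw` and `top_words` are hypothetical and only for illustration; they are not part of whatever tooling produced this page, and only the first five pairs are copied here.

```python
import ast

# Excerpt of the (word, weight) pairs, copied as they appear in the list above.
raw = (
    "[('ibm', 0.417), ('engineering', 0.296), ('technology', 0.296), "
    "('operation', 0.187), ('unsolicited', 0.181)]"
)


def top_words(raw_list, k=3):
    """Parse the printed list and return the k pairs with the largest weight."""
    # literal_eval only evaluates Python literals, so it is safe on scraped text.
    pairs = ast.literal_eval(raw_list)
    # sorted() is stable, so tied weights keep their original order.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    for word, weight in top_words(raw):
        print(f"{word}\t{weight:.3f}")
```

Because the list is already sorted by weight, the sort is only a safeguard in case the pairs arrive in a different order.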
simIndex simValue blogId blogTitle
same-blog 1 1.0 2118 andrew gelman stats-2013-11-30-???
2 0.15785992 1088 andrew gelman stats-2011-12-28-Argument in favor of Ddulites
Introduction: Mark Palko defines a Ddulite as follows: A preference for higher tech solutions even in cases where lower tech alternatives have greater and more appropriate functionality; a person of ddulite tendencies. Though Ddulites are the opposite of Luddites with respect to attitudes toward technology, they occupy more or less the same point with respect to functionality. As a sometime Luddite myself (no cell phone, tv, microwave oven, etc.), I should in fairness point out the logic in favor of being a Ddulite. Old technology is typically pretty stable; new technology is improving. It can make sense to switch early (before the new technology actually performs better than the old) to get the benefits of being familiar with the new technology once it does take off.
3 0.1241212 1297 andrew gelman stats-2012-05-03-New New York data research organizations
Introduction: In a single day, New York City obtained two data analysis/statistics/machine learning organizations: Microsoft Research New York City with John Langford (machine learning), Duncan Watts (networks), and Dave Pennock (algorithmic economics). eBay technology center focusing on data – led by Chris Dixon , the co-founder of the recommendation engine company Hunch, which has recently been acquired by eBay. New York already has Facebook’s engineering unit , Twitter’s East Coast headquarters , and Google’s second-largest engineering office. The data community here is on an upswing, and it might be one of the best places to be if you’re into applied statistics, machine learning or data analysis. Post by Aleks Jakulin . P.S. (from Andrew): The formerly-Yahoo-now-Microsoft researchers have a more-or-less formal connection to Columbia, through the Applied Statistics Center, where some of them will be organizing occasional mini-conferences and workshops!
4 0.12159292 764 andrew gelman stats-2011-06-14-Examining US Legislative process with “Many Bills”
Introduction: This is Many Bills , a visualization of US bills by IBM: I learned about it a few days ago from Irene Ros at Foo Camp . It definitely looks better than my own analysis of US Senate bills .
5 0.11592077 343 andrew gelman stats-2010-10-15-?
Introduction: How am I supposed to handle this sort of thing? (See below.) I just stuck it one of my email folders without responding, but then I wondered . . . what’s it all about? Is there some sort of Glengarry Glen Ross-like parallel world where down-on-their-luck Jack Lemmons of public relations world send out electronic cold calls? More than anything else, this sort of thing makes me glad I have a steady job. Here’s the (unsolicited) email, which came with the subject line “Please help a reporter do his job”: Dear Andrew, As an Editor for the Bulldog Reporter (www.bulldogreporter.com/dailydog), a media relations trade publication, my job is to help ensure that my readers have accurate info about you and send you the best quality pitches. By taking five minutes or less to answer my questions (pasted below), you’ll receive targeted PR pitches from our client base that will match your beat and interests. Any help or direction is appreciated. Here are my questions. We have you listed
7 0.10417673 1766 andrew gelman stats-2013-03-16-“Nightshifts Linked to Increased Risk for Ovarian Cancer”
8 0.10353181 21 andrew gelman stats-2010-05-07-Environmentally induced cancer “grossly underestimated”? Doubtful.
9 0.10353165 635 andrew gelman stats-2011-03-29-Bayesian spam!
10 0.10265408 18 andrew gelman stats-2010-05-06-$63,000 worth of abusive research . . . or just a really stupid waste of time?
11 0.1018928 1173 andrew gelman stats-2012-02-17-Sports examples in class
12 0.096750833 978 andrew gelman stats-2011-10-28-Cool job opening with brilliant researchers at Yahoo
14 0.089533687 1618 andrew gelman stats-2012-12-11-The consulting biz
15 0.084491208 359 andrew gelman stats-2010-10-21-Applied Statistics Center miniconference: Statistical sampling in developing countries
17 0.082782008 1630 andrew gelman stats-2012-12-18-Postdoc positions at Microsoft Research – NYC
18 0.079488486 1904 andrew gelman stats-2013-06-18-Job opening! Come work with us!
19 0.079350501 880 andrew gelman stats-2011-08-30-Annals of spam
20 0.07917054 722 andrew gelman stats-2011-05-20-Why no Wegmania?
topicId topicWeight
[(0, 0.103), (1, -0.05), (2, -0.048), (3, -0.032), (4, 0.019), (5, 0.066), (6, -0.034), (7, -0.029), (8, -0.067), (9, 0.047), (10, -0.025), (11, -0.044), (12, 0.046), (13, -0.008), (14, -0.083), (15, 0.05), (16, 0.055), (17, -0.036), (18, 0.003), (19, 0.018), (20, 0.014), (21, -0.001), (22, 0.022), (23, -0.038), (24, -0.001), (25, -0.019), (26, 0.011), (27, -0.031), (28, -0.003), (29, 0.002), (30, -0.043), (31, -0.007), (32, -0.052), (33, 0.033), (34, 0.007), (35, -0.001), (36, 0.008), (37, -0.024), (38, -0.002), (39, 0.027), (40, 0.006), (41, 0.031), (42, 0.009), (43, -0.027), (44, 0.03), (45, -0.002), (46, 0.04), (47, 0.041), (48, 0.001), (49, -0.065)]
simIndex simValue blogId blogTitle
same-blog 1 0.96830547 2118 andrew gelman stats-2013-11-30-???
2 0.81416732 866 andrew gelman stats-2011-08-23-Participate in a research project on combining information for prediction
Introduction: Thomas Wallsten writes: To viewers of Dr. Andrew Gelman’s blog, I [Wallsten] am pleased to invite you to participate in an important research project to develop improved methods for predicting future events and outcomes. More specifically, our goal is to develop methods for aggregating many individual judgments in a manner that yields more accurate predictions than any one person or small group alone could provide. Our research is funded by the Intelligence Advanced Research Project Activity (IARPA, iarpa.gov ), but its application will extend far beyond intelligence to such areas as business forecasting or medical diagnosis. Our team consists of researchers at ARA, a private company; as well as researchers at the University of Maryland-College Park, University of Michigan, Ohio State University, Fordham University, University of California-Irvine, Wake Forest University, and the University of Missouri. Details can be found at forecastingace.com/ . We are seeking to recruit ind
3 0.81133974 1618 andrew gelman stats-2012-12-11-The consulting biz
Introduction: I received the following (unsolicited) email: Hello, *** LLC, a ***-based market research company, has a financial client who is interested in speaking with a statistician who has done research in the field of Alzheimer’s Disease and preferably familiar with the SOLA and BAPI trials. We offer an honorarium of $200 for a 30 minute telephone interview. Please advise us if you have an employment or consulting agreement with any organization or operate professionally pursuant to an organization’s code of conduct or employee manual that may control activities by you outside of your regular present and former employment, such as participating in this consulting project for MedPanel. If there are such contracts or other documents that do apply to you, please forward MedPanel a copy of each such document asap as we are obligated to review such documents to determine if you are permitted to participate as a consultant for MedPanel on a project with this particular client. If you are
4 0.75889981 18 andrew gelman stats-2010-05-06-$63,000 worth of abusive research . . . or just a really stupid waste of time?
Introduction: As someone who relies strongly on survey research, it’s good for me to be reminded that some surveys are useful, some are useless, but one thing they almost all have in common is . . . they waste the respondents’ time. I thought of this after receiving the following email, which I shall reproduce here. My own comments appear after. Recently, you received an email from a student asking for 10 minutes of your time to discuss your Ph.D. program (the body of the email appears below). We are emailing you today to debrief you on the actual purpose of that email, as it was part of a research study. We sincerely hope our study did not cause you any disruption and we apologize if you were at all inconvenienced. Our hope is that this letter will provide a sufficient explanation of the purpose and design of our study to alleviate any concerns you may have about your involvement. We want to thank you for your time and for reading further if you are interested in understanding why you rece
Introduction: One thing we do here at the Applied Statistics Center is hold mini-conferences. The next one looks really cool. It’s organized by Sharad Goel and Jake Hofman (Microsoft Research, formerly at Yahoo Research), David Park (Columbia University), and Sergei Vassilvitskii (Google). As with our other conferences, one of our goals is to mix the academic and nonacademic research communities. Here’s the website for the workshop, and here’s the announcement from the organizers: With an explosion of data on every aspect of our everyday existence — from what we buy, to where we travel, to who we know — we are able to observe human behavior with granularity largely thought impossible just a decade ago. The growth of such online activity has further facilitated the design of web-based experiments, enhancing both the scale and efficiency of traditional methods. Together these advances have created an unprecedented opportunity to address longstanding questions in the social sciences, rang
6 0.70903045 882 andrew gelman stats-2011-08-31-Meanwhile, on the sister blog . . .
7 0.68023551 1343 andrew gelman stats-2012-05-25-And now, here’s something we hope you’ll really like
9 0.66549402 978 andrew gelman stats-2011-10-28-Cool job opening with brilliant researchers at Yahoo
10 0.6490413 1904 andrew gelman stats-2013-06-18-Job opening! Come work with us!
11 0.6426574 1279 andrew gelman stats-2012-04-24-ESPN is looking to hire a research analyst
12 0.64244092 343 andrew gelman stats-2010-10-15-?
14 0.63473552 2148 andrew gelman stats-2013-12-25-Spam!
15 0.63130951 1217 andrew gelman stats-2012-03-17-NSF program “to support analytic and methodological research in support of its surveys”
16 0.62996608 412 andrew gelman stats-2010-11-13-Time to apply for the hackNY summer fellows program
17 0.62742603 27 andrew gelman stats-2010-05-11-Update on the spam email study
18 0.626463 1434 andrew gelman stats-2012-07-29-FindTheData.org
19 0.62575823 530 andrew gelman stats-2011-01-22-MS-Bayes?
20 0.6214636 332 andrew gelman stats-2010-10-10-Proposed new section of the American Statistical Association on Imaging Sciences
topicId topicWeight
[(5, 0.044), (15, 0.017), (16, 0.096), (18, 0.025), (24, 0.145), (27, 0.023), (29, 0.06), (48, 0.043), (66, 0.026), (68, 0.026), (86, 0.076), (97, 0.068), (99, 0.231)]
simIndex simValue blogId blogTitle
same-blog 1 0.97076166 2118 andrew gelman stats-2013-11-30-???
2 0.91499716 639 andrew gelman stats-2011-03-31-Bayes: radical, liberal, or conservative?
Introduction: Radford writes : The word “conservative” gets used many ways, for various political purposes, but I would take it’s basic meaning to be someone who thinks there’s a lot of wisdom in traditional ways of doing things, even if we don’t understand exactly why those ways are good, so we should be reluctant to change unless we have a strong argument that some other way is better. This sounds very Bayesian, with a prior reducing the impact of new data. I agree completely, and I think Radford will very much enjoy my article with Aleks Jakulin , “Bayes: radical, liberal, or conservative?” Radford’s comment also fits with my increasing inclination to use informative prior distributions.
3 0.91478562 1940 andrew gelman stats-2013-07-16-A poll that throws away data???
Introduction: Mark Blumenthal writes: What do you think about the “random rejection” method used by PPP that was attacked at some length today by a Republican pollster. Our just published post on the debate includes all the details as I know them. The Storify of Martino’s tweets has some additional data tables linked to toward the end. Also, more specifically, setting aside Martino’s suggestion of manipulation (which is also quite possible with post-stratification weights), would the PPP method introduce more potential random error than weighting? From Blumenthal’s blog: B.J. Martino, a senior vice president at the Republican polling firm The Tarrance Group, went on an 30-minute Twitter rant on Tuesday questioning the unorthodox method used by PPP [Public Policy Polling] to select samples and weight data: “Looking at @ppppolls new VA SW. Wondering how many interviews they discarded to get down to 601 completes? Because @ppppolls discards a LOT of interviews. Of 64,811 conducted
4 0.91448104 1518 andrew gelman stats-2012-10-02-Fighting a losing battle
Introduction: Following a recent email exchange regarding path sampling and thermodynamic integration (sadly, I’ve gotten rusty and haven’t thought seriously about these challenges for many years), a correspondent referred to the marginal distribution of the data under a model as “the evidence.” I hate that expression! As we discuss in chapter 6 of BDA, for continuous-parametered models, this quantity can be completely sensitive to aspects of the prior that have essentially no impact on the posterior. In the examples I’ve seen, this marginal probability is not “evidence” in any useful sense of the term. When I told this to my correspondent, he replied, I actually don’t find “the evidence” too bothersome. I don’t have BDA at home where I’m working from at the moment, so I’ll read up on chapter 6 later, but I assume you refer to the problem of the marginal likelihood being strongly sensitive to the prior in a way that the posterior typically isn’t, thereby diminishing the value of the margi
5 0.91297042 2057 andrew gelman stats-2013-10-10-Chris Chabris is irritated by Malcolm Gladwell
Introduction: Christopher Chabris reviewed the new book by Malcolm Gladwell: One thing “David and Goliath” shows is that Mr. Gladwell has not changed his own strategy, despite serious criticism of his prior work. What he presents are mostly just intriguing possibilities and musings about human behavior, but what his publisher sells them as, and what his readers may incorrectly take them for, are lawful, causal rules that explain how the world really works. Mr. Gladwell should acknowledge when he is speculating or working with thin evidentiary soup. Yet far from abandoning his hand or even standing pat, Mr. Gladwell has doubled down. This will surely bring more success to a Goliath of nonfiction writing, but not to his readers. Afterward he blogged some further thoughts about the popular popular science writer. Good stuff . Chabris has a thoughtful explanation of why the “Gladwell is just an entertainer” alibi doesn’t work for him (Chabris). Some of his discussion reminds me of my articl
6 0.91126662 1637 andrew gelman stats-2012-12-24-Textbook for data visualization?
8 0.91023183 1019 andrew gelman stats-2011-11-19-Validation of Software for Bayesian Models Using Posterior Quantiles
9 0.90998864 2121 andrew gelman stats-2013-12-02-Should personal genetic testing be regulated? Battle of the blogroll
10 0.9092387 2201 andrew gelman stats-2014-02-06-Bootstrap averaging: Examples where it works and where it doesn’t work
11 0.9078992 1980 andrew gelman stats-2013-08-13-Test scores and grades predict job performance (but maybe not at Google)
12 0.90680063 781 andrew gelman stats-2011-06-28-The holes in my philosophy of Bayesian data analysis
13 0.90666795 996 andrew gelman stats-2011-11-07-Chi-square FAIL when many cells have small expected values
16 0.9059239 160 andrew gelman stats-2010-07-23-Unhappy with improvement by a factor of 10^29
17 0.90580511 1278 andrew gelman stats-2012-04-23-“Any old map will do” meets “God is in every leaf of every tree”
18 0.90549201 1206 andrew gelman stats-2012-03-10-95% intervals that I don’t believe, because they’re from a flat prior I don’t believe
19 0.90526497 187 andrew gelman stats-2010-08-05-Update on state size and governors’ popularity
20 0.904989 1016 andrew gelman stats-2011-11-17-I got 99 comparisons but multiplicity ain’t one