Source: html
Introduction: Raphael Silberzahn writes: Brian Nosek, Eric Luis Uhlmann, Dan Martin, and I just launched a project through the Open Science Center we think you’ll find interesting. The basic idea is to “Crowdstorm a Dataset”. Multiple independent analysts are recruited to test the same hypothesis on the same data set in whatever manner they see as best. If everyone comes up with the same results, then scientists can speak with one voice. If not, the subjectivity and conditionality of results on analysis strategy is made transparent. For this first project, we are crowdstorming the question of whether soccer referees are more likely to give red cards to dark skin toned players than light skin toned players. The full project description is here. If you’re interested in being one of the crowdstormer analysts, you can register here. All analysts will receive an author credit on the final paper. We would love to have Bayesian analysts represented in the group. Also, please feel free to let
sentIndex sentText sentNum sentScore
1 Raphael Silberzahn writes: Brian Nosek, Eric Luis Uhlmann, Dan Martin, and I just launched a project through the Open Science Center we think you’ll find interesting. [sent-1, score-0.342]
2 Multiple independent analysts are recruited to test the same hypothesis on the same data set in whatever manner they see as best. [sent-3, score-0.939]
3 If everyone comes up with the same results, then scientists can speak with one voice. [sent-4, score-0.219]
4 If not, the subjectivity and conditionality of results on analysis strategy is made transparent. [sent-5, score-0.309]
5 For this first project, we are crowdstorming the question of whether soccer referees are more likely to give red cards to dark skin toned players than light skin toned players. [sent-6, score-2.225]
6 If you’re interested in being one of the crowdstormer analysts, you can register here. [sent-8, score-0.231]
7 All analysts will receive an author credit on the final paper. [sent-9, score-0.824]
8 We would love to have Bayesian analysts represented in the group. [sent-10, score-0.65]
9 Also, please feel free to let others know about the opportunity, anyone interested is welcome to take part. [sent-11, score-0.344]
10 I have no idea how this will work out but it seems to be worth a try. [sent-12, score-0.138]
wordName wordTfidf (topN-words)
[('analysts', 0.473), ('toned', 0.372), ('skin', 0.267), ('project', 0.217), ('crowdstorming', 0.169), ('luis', 0.16), ('soccer', 0.153), ('recruited', 0.147), ('nosek', 0.136), ('subjectivity', 0.131), ('cards', 0.131), ('register', 0.131), ('launched', 0.125), ('dark', 0.121), ('brian', 0.115), ('referees', 0.112), ('welcome', 0.11), ('represented', 0.108), ('martin', 0.108), ('manner', 0.106), ('receive', 0.101), ('interested', 0.1), ('players', 0.098), ('eric', 0.096), ('results', 0.094), ('final', 0.092), ('dan', 0.09), ('dataset', 0.088), ('speak', 0.087), ('light', 0.087), ('credit', 0.086), ('strategy', 0.084), ('description', 0.082), ('independent', 0.081), ('center', 0.08), ('red', 0.076), ('idea', 0.076), ('opportunity', 0.076), ('author', 0.072), ('please', 0.069), ('love', 0.069), ('hypothesis', 0.069), ('basic', 0.068), ('everyone', 0.068), ('open', 0.067), ('anyone', 0.065), ('multiple', 0.064), ('scientists', 0.064), ('test', 0.063), ('worth', 0.062)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 2309 andrew gelman stats-2014-04-28-Crowdstorming a dataset
2 0.21576636 1519 andrew gelman stats-2012-10-02-Job!
Introduction: Faten Sabry writes: We are looking to hire full time analysts at the undergraduate and graduate levels. The work involves extensive econometric analysis and handling of large databases. The analysts will be part of a team working to address various empirical microeconomic issues. I worked with Faten and her colleagues on a consulting project once, and they seemed like reasonable people to me.
Introduction: As I’ve written here many times, my experiences in social science and public health research have left me skeptical of statistical methods that hypothesize or try to detect zero relationships between observational data (see, for example, the discussion starting at the bottom of page 960 in my review of causal inference in the American Journal of Sociology). In short, I have a taste for continuous rather than discrete models. As discussed in the above-linked article (with respect to the writings of cognitive scientist Steven Sloman), I think that common-sense thinking about causal inference can often mislead. In many cases, I have found that the theoretical frameworks of instrumental variables and potential outcomes (for a review see, for example, chapters 9 and 10 of my book with Jennifer) help clarify my thinking. Here is an example that came up in a recent blog discussion. Computer science student Elias Bareinboim gave the following example: “suppose we know nothing a
4 0.12334432 362 andrew gelman stats-2010-10-22-A redrawing of the Red-Blue map in November 2010?
Introduction: Here are my answers to the following questions asked by Pauline Peretz: 1. Many analysts have emphasized that there was a redrawing of the electoral map in 2008. To what extent will the November midterm elections affect this red-blue map? How long will the newly blue states remain blue? 2. Do you think the predictable loss of the Democrats in November definitely disqualifies the hypothesis that Obama’s election was the beginning of a realignment in American politics, that is a period of dominance for the Democratic party due to favourable demographics? 3. Some analysts consider that voting patterns are best explained by economic factors, others by values. How do you position yourself in the debate on culture wars vs. economic wars? 4. In your book Red State, Blue State, Rich State, Poor State, you renew the ongoing debate on the correlation between income and vote, showing it is much stronger in poor states. In light of this correlation, would you say that there currently is
5 0.10208385 951 andrew gelman stats-2011-10-11-Data mining efforts for Obama’s campaign
Introduction: From CNN : In July, KDNuggets.com, an online newsite focused on data mining and analytics software, ran an unusual listing in its jobs section: “We are looking for Predictive Modeling/Data Mining Scientists and Analysts, at both the senior and junior level, to join our department through November 2012 at our Chicago Headquarters,” read the ad. “We are a multi-disciplinary team of statisticians, predictive modelers, data mining experts, mathematicians, software developers, general analysts and organizers – all striving for a single goal: re-electing President Obama.” Users of the Obama 2012 – Are You In? app are not only giving the campaign personal data like their name, gender, birthday, current city, religion and political views, they are sharing their list of friends and information those friends share, like their birthday, current city, religion and political views. As Facebook is now offering the geo-targeting of ads down to ZIP code, this kind of fine-grained informa
6 0.098513342 423 andrew gelman stats-2010-11-20-How to schedule projects in an introductory statistics course?
7 0.090620995 1329 andrew gelman stats-2012-05-18-Those mean psychologists, making fun of dodgy research!
8 0.084603891 1532 andrew gelman stats-2012-10-13-A real-life dollar auction game!
9 0.082096919 481 andrew gelman stats-2010-12-22-The Jumpstart financial literacy survey and the different purposes of tests
10 0.072925091 662 andrew gelman stats-2011-04-15-Bayesian statistical pragmatism
11 0.072833784 537 andrew gelman stats-2011-01-25-Postdoc Position #1: Missing-Data Imputation, Diagnostics, and Applications
12 0.072725743 332 andrew gelman stats-2010-10-10-Proposed new section of the American Statistical Association on Imaging Sciences
13 0.072506674 1014 andrew gelman stats-2011-11-16-Visualizations of NYPD stop-and-frisk data
14 0.071190588 1111 andrew gelman stats-2012-01-10-The blog of the Cultural Cognition Project
15 0.068498716 1289 andrew gelman stats-2012-04-29-We go to war with the data we have, not the data we want
16 0.068161063 1959 andrew gelman stats-2013-07-28-50 shades of gray: A research story
17 0.067318231 407 andrew gelman stats-2010-11-11-Data Visualization vs. Statistical Graphics
18 0.064876437 1902 andrew gelman stats-2013-06-17-Job opening at new “big data” consulting firm!
19 0.06452997 1734 andrew gelman stats-2013-02-23-Life in the C-suite: A graph that is both ugly and bad, and an unrelated story
20 0.064370617 1917 andrew gelman stats-2013-06-28-Econ coauthorship update
topicId topicWeight
[(0, 0.122), (1, -0.013), (2, -0.036), (3, -0.015), (4, 0.001), (5, 0.026), (6, -0.026), (7, -0.013), (8, -0.02), (9, -0.01), (10, -0.007), (11, -0.011), (12, 0.045), (13, -0.031), (14, 0.012), (15, 0.03), (16, 0.011), (17, -0.03), (18, 0.005), (19, -0.003), (20, 0.008), (21, 0.018), (22, 0.027), (23, -0.023), (24, -0.021), (25, -0.021), (26, 0.025), (27, -0.05), (28, 0.008), (29, -0.005), (30, 0.043), (31, -0.064), (32, 0.023), (33, -0.008), (34, -0.017), (35, -0.003), (36, 0.021), (37, -0.01), (38, 0.023), (39, 0.027), (40, 0.038), (41, -0.012), (42, 0.006), (43, -0.013), (44, -0.027), (45, 0.008), (46, -0.004), (47, -0.033), (48, -0.042), (49, -0.011)]
simIndex simValue blogId blogTitle
same-blog 1 0.96132171 2309 andrew gelman stats-2014-04-28-Crowdstorming a dataset
2 0.73508686 1434 andrew gelman stats-2012-07-29-FindTheData.org
Introduction: I received the following (unsolicited) email: Hi Andrew, I work on the business development team of FindTheData.org, an unbiased comparison engine founded by Kevin O’Connor (founder and former CEO of DoubleClick) and backed by Kleiner Perkins with ~10M unique visitors per month. We are working with large online publishers including Golf Digest, Huffington Post, Under30CEO, and offer a variety of options to integrate our highly engaging content with your site. I believe our un-biased and reliable data resources would be of interest to you and your readers. I’d like to set up a quick call to discuss similar partnership ideas with you and would greatly appreciate 10 minutes of your time. Please suggest a couple times that work best for you or let me know if you would like me to send some more information before you make time for a call. Looking forward to hearing from you, Jonny – JONNY KINTZELE Business Development, FindThe Data mobile: 619-307-097
3 0.70527798 802 andrew gelman stats-2011-07-13-Super Sam Fuld Needs Your Help (with Foul Ball stats)
Introduction: I was pleasantly surprised to have my recreational reading about baseball in the New Yorker interrupted by a digression on statistics. Sam Fuld of the Tampa Bay Rays was the subject of a Ben McGrath profile in the 4 July 2011 issue of the New Yorker, in an article titled Super Sam. After quoting a minor-league trainer who described Fuld as “a bit of a geek” (who isn’t these days?), McGrath gets into that lovely New Yorker detail: One could have pointed out the more persuasive and telling examples, such as the fact that in 2005, after his first pro season, with the Class-A Peoria Chiefs, Fuld applied for a fall internship with Stats, Inc., the research firm that supplies broadcasters with much of the data and analysis that you hear in sports telecasts. After a description of what they had him doing, reviewing footage of games and cataloguing, he said “I thought, They have a stat for everything, but they don’t have any stats regarding foul balls.” Fuld’s
4 0.70249981 544 andrew gelman stats-2011-01-29-Splitting the data
Introduction: Antonio Rangel writes: I’m a neuroscientist at Caltech . . . I’m using the debate on the ESP paper, as I’m sure other labs around the world are, as an opportunity to discuss some basic statistical issues/ideas w/ my lab. Request: Is there any chance you would be willing to share your thoughts about the difference between exploratory “data mining” studies and confirmatory studies? What I have in mind is that one could use a dataset to explore/discover novel hypotheses and then conduct another experiment to test those hypotheses rigorously. It seems that a good combination of both approaches could be the best of both worlds, since the first would lead to novel hypothesis discovery, and the latter to careful testing. . . it is a fundamental issue for neuroscience and psychology. My reply: I know that people talk about this sort of thing . . . but in any real setting, I think I’d want all my data right now to answer any questions I have. I like cross-validation and have used
5 0.69175118 880 andrew gelman stats-2011-08-30-Annals of spam
Introduction: I received the following (unsolicited) email: Howdy Andrew, Hope you’re keeping well! I was wondering if you’re open to guest posts at Statistical Modeling, Causal Inference, and Social Science – if you are interested, I can offer an original 500-1000 word, very high quality article in fitting with the site. All research and writing will be carried out by a professional writer (namely, me) and once approved it will be entirely yours to place on the site as you see fit. I can choose a title for the article or you can suggest one and I’ll work around that. Normally I write for property and travel sites (particularly cruise and ski related), but I’m game for anything if you’re happy to entertain me. I can also include some copyright-free and high quality pictures related to the blog post. You’re probably wondering what’s in it for me, which is a fair question – in return, all I’d ask is a subtle link back in return. Other than that, the material itself would be non-commercial
6 0.6890651 2304 andrew gelman stats-2014-04-24-An open site for researchers to post and share papers
7 0.6885528 423 andrew gelman stats-2010-11-20-How to schedule projects in an introductory statistics course?
9 0.68356937 1922 andrew gelman stats-2013-07-02-They want me to send them free material and pay for the privilege
10 0.68136352 223 andrew gelman stats-2010-08-21-Statoverflow
11 0.67216289 1519 andrew gelman stats-2012-10-02-Job!
12 0.66210204 1238 andrew gelman stats-2012-03-31-Dispute about ethics of data sharing
13 0.65820765 2148 andrew gelman stats-2013-12-25-Spam!
14 0.65249205 714 andrew gelman stats-2011-05-16-NYT Labs releases Openpaths, a utility for saving your iphone data
15 0.65167499 1835 andrew gelman stats-2013-05-02-7 ways to separate errors from statistics
16 0.6512621 2183 andrew gelman stats-2014-01-23-Discussion on preregistration of research studies
18 0.64622509 1447 andrew gelman stats-2012-08-07-Reproducible science FAIL (so far): What’s stoppin people from sharin data and code?
19 0.6396178 2263 andrew gelman stats-2014-03-24-Empirical implications of Empirical Implications of Theoretical Models
20 0.63946158 118 andrew gelman stats-2010-06-30-Question & Answer Communities
topicId topicWeight
[(4, 0.017), (13, 0.127), (15, 0.015), (16, 0.076), (24, 0.1), (25, 0.018), (36, 0.028), (53, 0.02), (70, 0.033), (78, 0.034), (84, 0.072), (86, 0.023), (89, 0.027), (98, 0.032), (99, 0.273)]
simIndex simValue blogId blogTitle
same-blog 1 0.94618833 2309 andrew gelman stats-2014-04-28-Crowdstorming a dataset
2 0.94405603 234 andrew gelman stats-2010-08-25-Modeling constrained parameters
Introduction: Mike McLaughlin writes: In general, is there any way to do MCMC with a fixed constraint? E.g., suppose I measure the three internal angles of a triangle with errors ~dnorm(0, tau) where tau might be different for the three measurements. This would be an easy BUGS/WinBUGS/JAGS exercise but suppose, in addition, I wanted to include prior information to the effect that the three angles had to total 180 degrees exactly. Is this feasible? Could you point me to any BUGS model in which a constraint of this type is implemented? Note: Even in my own (non-hierarchical) code which tends to be component-wise, random-walk Metropolis with tuned Laplacian proposals, I cannot see how I could incorporate such a constraint. My reply: See page 508 of Bayesian Data Analysis (2nd edition). We have an example of such a model there (from this paper with Bois and Jiang).
3 0.9344002 172 andrew gelman stats-2010-07-30-Why don’t we have peer reviewing for oral presentations?
Introduction: Panos Ipeirotis writes in his blog post : Everyone who has attended a conference knows that the quality of the talks is very uneven. There are talks that are highly engaging, entertaining, and describe nicely the research challenges and solutions. And there are talks that are a waste of time. Either the presenter cannot present clearly, or the presented content is impossible to digest within the time frame of the presentation. We already have reviewing for the written part. The program committee examines the quality of the written paper and vouch for its technical content. However, by looking at a paper it is impossible to know how nicely it can be presented. Perhaps the seemingly solid but boring paper can be a very entertaining presentation. Or an excellent paper may be written by a horrible presenter. Why not having a second round of reviewing, where the authors of accepted papers submit their presentations (slides and a YouTube video) for presentation to the conference.
4 0.93387401 1137 andrew gelman stats-2012-01-24-Difficulties in publishing non-replications of implausible findings
Introduction: Eric Tassone points me to this news article by Christopher Shea on the challenges of debunking ESP. Shea writes : Earlier this year, a major psychology journal published a paper suggesting that there was some evidence for “pre-cognition,” a form of ESP. Stuart Ritchie, a doctoral student at the University of Edinburgh, is part of a team that tried, but failed, to replicate those results. Here, he tells the Chronicle of Higher Education’s Tom Bartlett about the difficulties he’s had getting the results published. Several journals told the team they wouldn’t publish a study that did no more than disprove a previous study. . . . An editor at another journal said he’d “only accept our paper if we ran a fourth experiment where we got a believer [in ESP] to run all the participants, to control for . . . experimenter effects.” My reaction is, this isn’t as easy a question as it might seem. At first, one’s reaction might share Ritchie’s frustration that a shoddy paper by Bem got p
5 0.93310976 597 andrew gelman stats-2011-03-02-RStudio – new cross-platform IDE for R
Introduction: The new R environment RStudio looks really great, especially for users new to R. In teaching, these are often people new to programming anything, much less statistical models. The R GUIs were different on each platform, with (sometimes modal) windows appearing and disappearing and no unified design. RStudio fixes that and has already found a happy home on my desktop. Initial impressions I’ve been using it for the past couple of days. For me, it replaces the niche that R.app held: looking at help, quickly doing something I don’t want to pollute a project workspace with; sometimes data munging, merging, and transforming; and prototyping plots. RStudio is better than R.app at all of these things. For actual development and papers, though, I remain wedded to emacs+ess (good old C-x M-c M-Butterfly ). Favorite features in no particular order plots seamlessly made in new graphics devices. This is huge— instead of one active plot window named something like quartz(1) t
6 0.93125427 1509 andrew gelman stats-2012-09-24-Analyzing photon counts
7 0.93065155 1789 andrew gelman stats-2013-04-05-Elites have alcohol problems too!
8 0.92769039 1942 andrew gelman stats-2013-07-17-“Stop and frisk” statistics
9 0.92639267 437 andrew gelman stats-2010-11-29-The mystery of the U-shaped relationship between happiness and age
11 0.92468178 1916 andrew gelman stats-2013-06-27-The weirdest thing about the AJPH story
13 0.92008507 971 andrew gelman stats-2011-10-25-Apply now for Earth Institute postdoctoral fellowships at Columbia University
14 0.90744263 1672 andrew gelman stats-2013-01-14-How do you think about the values in a confidence interval?
15 0.90486842 1559 andrew gelman stats-2012-11-02-The blog is back
16 0.90031749 1852 andrew gelman stats-2013-05-12-Crime novels for economists
17 0.89438164 186 andrew gelman stats-2010-08-04-“To find out what happens when you change something, it is necessary to change it.”
18 0.89341021 2004 andrew gelman stats-2013-09-01-Post-publication peer review: How it (sometimes) really works
19 0.89320916 1732 andrew gelman stats-2013-02-22-Evaluating the impacts of welfare reform?
20 0.89135766 1061 andrew gelman stats-2011-12-16-CrossValidated: A place to post your statistics questions