305 andrew gelman stats-2010-09-29-Decision science vs. social psychology


meta info for this blog

Source: html

Introduction: Dan Goldstein sends along this bit of research, distinguishing terms used in two different subfields of psychology. Dan writes: Intuitive calls included not listing words that don’t occur 3 or more times in both programs. I [Dan] did this because when I looked at the results, those cases tended to be proper names or arbitrary things like header or footer text. It also narrowed down the space of words to inspect, which means I could actually get the thing done in my copious free time. I think the bar graphs are kinda ugly, maybe there’s a better way to do it based on classifying the words according to content? Also the whole exercise would gain a new dimension by comparing several areas instead of just two. Maybe that’s coming next.
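The filtering rule Dan describes can be sketched in a few lines of Python. This is only an illustration, not Dan's actual code: it assumes the rule means "keep a word only if it appears at least 3 times in each of the two programs", and the tokenizer, the function names (`word_counts`, `shared_frequent_words`), and the sample strings are all hypothetical.

```python
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word tokens in a text (hypothetical, very simple tokenizer)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def shared_frequent_words(text_a, text_b, min_count=3):
    """Keep only words that occur at least `min_count` times in BOTH texts.

    Assumption: this is one reading of "words that occur 3 or more times
    in both programs"; the original analysis may have differed in detail.
    """
    counts_a = word_counts(text_a)
    counts_b = word_counts(text_b)
    return {
        word: (counts_a[word], counts_b[word])
        for word in counts_a
        if counts_a[word] >= min_count and counts_b[word] >= min_count
    }


# Made-up sample texts standing in for the two programs.
program_a = "risk risk risk choice choice choice goldstein"
program_b = "risk risk risk risk choice attitude attitude attitude"
kept = shared_frequent_words(program_a, program_b)
print(kept)
```

With these made-up inputs only "risk" survives: "choice" is frequent in one text but not the other, and one-off tokens such as a proper name drop out, which is the effect Dan says he was after.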


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Dan Goldstein sends along this bit of research, distinguishing terms used in two different subfields of psychology. [sent-1, score-0.762]

2 Dan writes: Intuitive calls included not listing words that don’t occur 3 or more times in both programs. [sent-2, score-0.875]

3 I [Dan] did this because when I looked at the results, those cases tended to be proper names or arbitrary things like header or footer text. [sent-3, score-1.194]

4 It also narrowed down the space of words to inspect, which means I could actually get the thing done in my copious free time. [sent-4, score-0.659]

5 I think the bar graphs are kinda ugly, maybe there’s a better way to do it based on classifying the words according to content? [sent-5, score-1.091]

6 Also the whole exercise would gain a new dimension by comparing several areas instead of just two. [sent-6, score-0.843]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('dan', 0.361), ('words', 0.26), ('footer', 0.226), ('header', 0.226), ('inspect', 0.226), ('classifying', 0.191), ('subfields', 0.186), ('distinguishing', 0.186), ('goldstein', 0.175), ('tended', 0.169), ('listing', 0.166), ('kinda', 0.164), ('intuitive', 0.162), ('bar', 0.144), ('dimension', 0.142), ('arbitrary', 0.141), ('proper', 0.139), ('exercise', 0.136), ('ugly', 0.135), ('occur', 0.135), ('calls', 0.134), ('gain', 0.13), ('sends', 0.118), ('names', 0.117), ('content', 0.114), ('included', 0.108), ('comparing', 0.107), ('space', 0.106), ('areas', 0.101), ('maybe', 0.1), ('according', 0.093), ('looked', 0.093), ('terms', 0.089), ('coming', 0.084), ('graphs', 0.083), ('cases', 0.083), ('whole', 0.083), ('free', 0.083), ('means', 0.081), ('next', 0.075), ('along', 0.074), ('instead', 0.073), ('times', 0.072), ('several', 0.071), ('done', 0.069), ('results', 0.063), ('also', 0.06), ('based', 0.056), ('bit', 0.055), ('used', 0.054)]
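For readers unfamiliar with the weights above, here is a minimal sketch of how tf-idf word weights and a cosine similarity between posts can be computed. It is an assumption that this page used something along these lines; the exact weighting, normalization, and tokenization behind the listed numbers are not stated, and the function names and toy documents below are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists. Returns one unit-length {word: weight} dict per doc."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each word once per document
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        # Raw term count times log inverse document frequency (one common variant).
        vec = {w: c * math.log(n_docs / doc_freq[w]) for w, c in counts.items()}
        norm = math.sqrt(sum(v * v for v in vec.values()))
        vectors.append({w: v / norm for w, v in vec.items()} if norm else vec)
    return vectors


def top_words(vec, n=3):
    """Return the n highest-weighted (word, weight) pairs."""
    return sorted(vec.items(), key=lambda kv: kv[1], reverse=True)[:n]


def cosine(u, v):
    """Dot product of two unit-length sparse vectors, i.e. their cosine similarity."""
    return sum(weight * v.get(word, 0.0) for word, weight in u.items())


# Toy corpus of three tokenized "posts".
docs = [
    ["dan", "words", "words", "graph"],
    ["graph", "bar", "graph"],
    ["dan", "baseball"],
]
vecs = tfidf_vectors(docs)
print(top_words(vecs[0], 1))
print(cosine(vecs[0], vecs[1]))
```

Under this scheme a post compared with itself scores 1 (up to rounding), which would be consistent with the 1.0 shown for the same-blog row in the list that follows.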

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 305 andrew gelman stats-2010-09-29-Decision science vs. social psychology

2 0.22892779 190 andrew gelman stats-2010-08-07-Mister P makes the big jump from the New York Times to the Washington Post

Introduction: See paragraphs 13-15 of this article by Dan Balz.

3 0.16424505 1104 andrew gelman stats-2012-01-07-A compelling reason to go to London, Ontario??

Introduction: Dan Goldstein asks what I think of this: My reply: It’s hard for me to imagine a compelling reason for anyone to go to London, Ontario–but, hey, I guess there’s all kinds of people in this world! More seriously, I see the appeal of the graph but it’s a bit busy for my taste. Over the years I’ve moved toward small multiples rather than single busy graphs. That’s one reason why I prefer Tufte’s second book to his first book. The Napoleon-in-Russia graph is a bad model, in that it inspires people to try to cram lots of variables on a single graph. Dan wrote back: I [Dan] like it as a travel planning graph, it gives you what you want to know (how hot will the days be, how cold will the nights be, will it rain) but is a bit easier on the brain than a table of highs and lows. Also makes it easy to see the trend. I agree the 2nd axis doesn’t help.

4 0.16216826 29 andrew gelman stats-2010-05-12-Probability of successive wins in baseball

Introduction: Dan Goldstein did an informal study asking people the following question: When two baseball teams play each other on two consecutive days, what is the probability that the winner of the first game will be the winner of the second game? You can make your own guess and then continue reading below. Dan writes: We asked two colleagues knowledgeable in baseball and the mathematics of forecasting. The answers came in between 65% and 70%. The true answer [based on Dan's analysis of a database of baseball games]: 51.3%, a little better than a coin toss. I have to say, I’m surprised his colleagues gave such extreme guesses. I was guessing something like 50%, myself, based on the following very crude reasoning: Suppose two unequal teams are playing, and the chance of team A beating team B is 55%. (This seems like a reasonable average of all matchups, which will include some more extreme disparities but also many more equal contests.) Then the chance of the same team

5 0.13936779 262 andrew gelman stats-2010-09-08-Here’s how rumors get started: Lineplots, dotplots, and nonfunctional modernist architecture

Introduction: 1. I remarked that Sharad had a good research article with some ugly graphs. 2. Dan posted Sharad’s graph and some unpleasant alternatives, inadvertently associating me with one of the unpleasant alternatives. Dan was comparing barplots with dotplots. 3. I commented on Dan’s site that, in this case, I’d much prefer a well-designed lineplot. I wrote: There’s a principle in decision analysis that the most important step is not the evaluation of the decision tree but the decision of what options to include in the tree in the first place. I think that’s what’s happening here. You’re seriously limiting yourself by considering the above options, which really are all the same graph with just slight differences in format. What you need to do is break outside the box. (Graph 2-which I think you think is the kind of thing that Gelman would like-indeed is the kind of thing that I think the R gurus like, but I don’t like it at all. It looks clean without actually being clea

6 0.11945999 126 andrew gelman stats-2010-07-03-Graphical presentation of risk ratios

7 0.11714876 509 andrew gelman stats-2011-01-09-Chartjunk, but in a good cause!

8 0.11473257 2022 andrew gelman stats-2013-09-13-You heard it here first: Intense exercise can suppress appetite

9 0.11148158 77 andrew gelman stats-2010-06-09-Sof[t]

10 0.10954157 687 andrew gelman stats-2011-04-29-Zero is zero

11 0.10719755 455 andrew gelman stats-2010-12-07-Some ideas on communicating risks to the general public

12 0.0992397 574 andrew gelman stats-2011-02-14-“The best data visualizations should stand on their own”? I don’t think so.

13 0.096501283 294 andrew gelman stats-2010-09-23-Thinking outside the (graphical) box: Instead of arguing about how best to fix a bar chart, graph it as a time series lineplot instead

14 0.09089613 2211 andrew gelman stats-2014-02-14-The popularity of certain baby names is falling off the clifffffffffffff

15 0.089258239 863 andrew gelman stats-2011-08-21-Bad graph

16 0.084635392 1919 andrew gelman stats-2013-06-29-R sucks

17 0.084385037 1364 andrew gelman stats-2012-06-04-Massive confusion about a study that purports to show that exercise may increase heart risk

18 0.08263234 207 andrew gelman stats-2010-08-14-Pourquoi Google search est devenu plus raisonnable?

19 0.081283286 1932 andrew gelman stats-2013-07-10-Don’t trust the Turk

20 0.078453213 1090 andrew gelman stats-2011-12-28-“. . . extending for dozens of pages”


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.119), (1, -0.04), (2, -0.012), (3, 0.021), (4, 0.055), (5, -0.051), (6, 0.004), (7, 0.005), (8, -0.011), (9, -0.006), (10, -0.0), (11, -0.01), (12, 0.002), (13, -0.019), (14, -0.027), (15, 0.028), (16, 0.05), (17, -0.004), (18, -0.016), (19, -0.036), (20, -0.027), (21, 0.017), (22, -0.02), (23, 0.002), (24, 0.014), (25, -0.03), (26, -0.005), (27, 0.004), (28, 0.006), (29, -0.032), (30, -0.028), (31, -0.019), (32, -0.036), (33, -0.031), (34, -0.054), (35, -0.028), (36, 0.001), (37, -0.009), (38, -0.003), (39, -0.018), (40, 0.025), (41, -0.054), (42, 0.027), (43, 0.059), (44, -0.037), (45, -0.015), (46, -0.005), (47, 0.117), (48, -0.001), (49, 0.016)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94149828 305 andrew gelman stats-2010-09-29-Decision science vs. social psychology

2 0.68615353 2132 andrew gelman stats-2013-12-13-And now, here’s something that would make Ed Tufte spin in his . . . ummm, Tufte’s still around, actually, so let’s just say I don’t think he’d like it!

Introduction: We haven’t had one of these in awhile, having mostly switched to the “chess trivia” and “bad p-values” genres of blogging . . . But I had to come back to the topic after receiving this note from Raghuveer Parthasarathy: Here’s another bad graph you might like. It might (arguably) be even worse than the “worst graphs of the year” you’ve blogged about, since rather than being a poor representation of data, it is simply the plotting of a tautology that mistakenly gives the impression of being data. (And it’s in Nature.) Parthasarathy explains: On the vertical axis we have the probability of being Type 2 Diabetic (T2D). On the horizontal axis we have the probability of being normal. There’s a clear, important trend evident, right? No! The probability of being normal is trivially one minus the probability of being T2D! The graph could not possibly be anything other than a straight line of slope -1. (For the students out there: the complete lack of scatter in the graph is

3 0.67273438 832 andrew gelman stats-2011-07-31-Even a good data display can sometimes be improved

Introduction: When I first saw this graphic, I thought “boy, that’s great, sometimes the graphic practically makes itself.” Normally it’s hard to use lots of different colors to differentiate items of interest, because there’s usually not an intuitive mapping between color and item (e.g. for countries, or states, or whatever). But the colors of crayons, what could be more perfect? So this graphic seemed awesome. But, as they discovered after some experimentation at datapointed.net there is an even BETTER possibility here. Click the link to see. Crayola Crayon colors by year

4 0.66721314 1747 andrew gelman stats-2013-03-03-More research on the role of puzzles in processing data graphics

Introduction: Ruth Rosenholtz of the department of Brain and Cognitive Science at MIT writes: We mostly do computational modeling of human vision. We try to do on the one hand the sort of basic science that fits in the human vision community, while on the other hand developing predictive models which might actually lend insight into design. Your talk resonated with me in part because of this paper [Do Predictions of Visual Perception Aid Design?, by Ruth Rosenholtz, Amal Dorai, and Rosalind Freeman]. We went into our study thinking that people would like to have a quantitative tool to help analyze designs. But what we concluded, somewhat anecdotally, was that its main use seemed to be as a conversation-starter, and a means of communicating ideas about the design. And the reason it seemed to work is that our visualizations were the right level of a “puzzle” — challenging enough to be a bit fun to work out. On another topic, check out the infographic from last weekend’s NYTimes ma

5 0.66547275 1104 andrew gelman stats-2012-01-07-A compelling reason to go to London, Ontario??

Introduction: Dan Goldstein asks what I think of this: My reply: It’s hard for me to imagine a compelling reason for anyone to go to London, Ontario–but, hey, I guess there’s all kinds of people in this world! More seriously, I see the appeal of the graph but it’s a bit busy for my taste. Over the years I’ve moved toward small multiples rather than single busy graphs. That’s one reason why I prefer Tufte’s second book to his first book. The Napoleon-in-Russia graph is a bad model, in that it inspires people to try to cram lots of variables on a single graph. Dan wrote back: I [Dan] like it as a travel planning graph, it gives you what you want to know (how hot will the days be, how cold will the nights be, will it rain) but is a bit easier on the brain than a table of highs and lows. Also makes it easy to see the trend. I agree the 2nd axis doesn’t help.

6 0.65208161 126 andrew gelman stats-2010-07-03-Graphical presentation of risk ratios

7 0.65084255 787 andrew gelman stats-2011-07-05-Different goals, different looks: Infovis and the Chris Rock effect

8 0.64997137 672 andrew gelman stats-2011-04-20-The R code for those time-use graphs

9 0.64921224 687 andrew gelman stats-2011-04-29-Zero is zero

10 0.63727075 671 andrew gelman stats-2011-04-20-One more time-use graph

11 0.62974966 1154 andrew gelman stats-2012-02-04-“Turn a Boring Bar Graph into a 3D Masterpiece”

12 0.62153679 1439 andrew gelman stats-2012-08-01-A book with a bunch of simple graphs

13 0.61641932 294 andrew gelman stats-2010-09-23-Thinking outside the (graphical) box: Instead of arguing about how best to fix a bar chart, graph it as a time series lineplot instead

14 0.61396039 2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year

15 0.61271834 262 andrew gelman stats-2010-09-08-Here’s how rumors get started: Lineplots, dotplots, and nonfunctional modernist architecture

16 0.60839921 1800 andrew gelman stats-2013-04-12-Too tired to mock

17 0.60697234 296 andrew gelman stats-2010-09-26-A simple semigraphic display

18 0.6054917 37 andrew gelman stats-2010-05-17-Is chartjunk really “more useful” than plain graphs? I don’t think so.

19 0.60411936 1609 andrew gelman stats-2012-12-06-Stephen Kosslyn’s principles of graphics and one more: There’s no need to cram everything into a single plot

20 0.60054332 1862 andrew gelman stats-2013-05-18-uuuuuuuuuuuuugly


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(16, 0.024), (24, 0.203), (42, 0.023), (53, 0.038), (63, 0.018), (86, 0.312), (99, 0.264)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97857487 436 andrew gelman stats-2010-11-29-Quality control problems at the New York Times

Introduction: I guess there’s a reason they put this stuff in the Opinion section and not in the Science section, huh? P.S. More here.

2 0.96411252 1530 andrew gelman stats-2012-10-11-Migrating your blog from Movable Type to WordPress

Introduction: Cord Blomquist, who did a great job moving us from horrible Movable Type to nice nice WordPress, writes: I [Cord] wanted to share a little news with you related to the original work we did for you last year. When ReadyMadeWeb converted your Movable Type blog to WordPress, we got a lot of other requests for the same service, so we started thinking about a bigger market for such a product. After a bit of research, we started work on automating the data conversion, writing rules, and exceptions to the rules, on how Movable Type and TypePad data could be translated to WordPress. After many months of work, we’re getting ready to announce TP2WP.com, a service that converts Movable Type and TypePad export files to WordPress import files, so anyone who wants to migrate to WordPress can do so easily and without losing permalinks, comments, images, or other files. By automating our service, we’ve been able to drop the price to just $99. I recommend it (and, no, Cord is not paying m

3 0.95641637 1427 andrew gelman stats-2012-07-24-More from the sister blog

Introduction: Anthropologist Bruce Mannheim reports that a recent well-publicized study on the genetics of native Americans, which used genetic analysis to find “at least three streams of Asian gene flow,” is in fact a confirmation of a long-known fact. Mannheim writes: This three-way distinction was known linguistically since the 1920s (for example, Sapir 1921). Basically, it’s a division among the Eskimo-Aleut languages, which straddle the Bering Straits even today, the Athabaskan languages (which were discovered to be related to a small Siberian language family only within the last few years, not by Greenberg as Wade suggested), and everything else. This is not to say that the results from genetics are unimportant, but it’s good to see how it fits with other aspects of our understanding.

4 0.95026708 253 andrew gelman stats-2010-09-03-Gladwell vs Pinker

Introduction: I just happened to notice this from last year. Eric Loken writes: Steven Pinker reviewed Malcolm Gladwell’s latest book and criticized him rather harshly for several shortcomings. Gladwell appears to have made things worse for himself in a letter to the editor of the NYT by defending a manifestly weak claim from one of his essays – the claim that NFL quarterback performance is unrelated to the order they were drafted out of college. The reason we [Loken and his colleagues] are implicated is that Pinker identified an earlier blog post of ours as one of three sources he used to challenge Gladwell (yay us!). But Gladwell either misrepresented or misunderstood our post in his response, and admonishes Pinker by saying “we should agree that our differences owe less to what can be found in the scientific literature than they do to what can be found on Google.” Well, here’s what you can find on Google. Follow this link to request the data for NFL quarterbacks drafted between 1980 and

5 0.94196272 1718 andrew gelman stats-2013-02-11-Toward a framework for automatic model building

Introduction: Patrick Caldon writes: I saw your recent blog post where you discussed in passing an iterative-chain-of models approach to AI. I essentially built such a thing for my PhD thesis – not in a Bayesian context, but in a logic programming context – and proved it had a few properties and showed how you could solve some toy problems. The important bit of my framework was that at various points you also go and get more data in the process – in a statistical context this might be seen as building a little univariate model on a subset of the data, then iteratively extending into a better model with more data and more independent variables – a generalized forward stepwise regression if you like. It wrapped a proper computational framework around E.M. Gold’s identification/learning in the limit based on a logic my advisor (Eric Martin) had invented. What’s not written up in the thesis is a few months of failed struggle trying to shoehorn some simple statistical inference into this

6 0.93646502 1552 andrew gelman stats-2012-10-29-“Communication is a central task of statistics, and ideally a state-of-the-art data analysis can have state-of-the-art displays to match”

7 0.93510699 873 andrew gelman stats-2011-08-26-Luck or knowledge?

8 0.92829353 904 andrew gelman stats-2011-09-13-My wikipedia edit

9 0.92688894 76 andrew gelman stats-2010-06-09-Both R and Stata

10 0.92389858 1327 andrew gelman stats-2012-05-18-Comments on “A Bayesian approach to complex clinical diagnoses: a case-study in child abuse”

same-blog 11 0.91984582 305 andrew gelman stats-2010-09-29-Decision science vs. social psychology

12 0.91275179 1547 andrew gelman stats-2012-10-25-College football, voting, and the law of large numbers

13 0.90339261 759 andrew gelman stats-2011-06-11-“2 level logit with 2 REs & large sample. computational nightmare – please help”

14 0.89280564 2082 andrew gelman stats-2013-10-30-Berri Gladwell Loken football update

15 0.89069116 2219 andrew gelman stats-2014-02-21-The world’s most popular languages that the Mac documentation hasn’t been translated into

16 0.88351619 2102 andrew gelman stats-2013-11-15-“Are all significant p-values created equal?”

17 0.88234568 558 andrew gelman stats-2011-02-05-Fattening of the world and good use of the alpha channel

18 0.88224137 276 andrew gelman stats-2010-09-14-Don’t look at just one poll number–unless you really know what you’re doing!

19 0.87768364 1971 andrew gelman stats-2013-08-07-I doubt they cheated

20 0.87267148 1983 andrew gelman stats-2013-08-15-More on AIC, WAIC, etc