290 andrew gelman stats-2010-09-22-Data Thief

meta info for this blog

Source: html

Introduction: John Transue sends along a link to this software for extracting data from graphs. I haven’t tried it out but it could be useful to somebody out there?

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 John Transue sends along a link to this software for extracting data from graphs. [sent-1, score-1.478]

2 I haven’t tried it out but it could be useful to somebody out there? [sent-2, score-0.778]

similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('transue', 0.53), ('extracting', 0.475), ('sends', 0.292), ('somebody', 0.276), ('software', 0.27), ('tried', 0.241), ('haven', 0.224), ('john', 0.199), ('along', 0.185), ('link', 0.18), ('useful', 0.179), ('could', 0.082), ('data', 0.076)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 290 andrew gelman stats-2010-09-22-Data Thief

Introduction: John Transue sends along a link to this software for extracting data from graphs. I haven’t tried it out but it could be useful to somebody out there?

2 0.24777123 357 andrew gelman stats-2010-10-20-Sas and R

Introduction: Xian sends along this link that might be of interest to some of you.

3 0.17927034 1514 andrew gelman stats-2012-09-28-AdviseStat 47% Campaign Ad

Introduction: Lee Wilkinson sends me this amusing ad for his new software, AdviseStat: The ad is a parody, but the software is real!

4 0.16843136 848 andrew gelman stats-2011-08-11-That xkcd cartoon on multiple comparisons that all of you were sending me a couple months ago

Introduction: John Transue sent it in with the following thoughtful comment: I’d imagine you’ve already received this, but just in case, here’s a cartoon you’d like. At first blush it seems to go against your advice (more nuanced than what I’m about to say by quoting the paper title) to not worry about multiple comparisons. However, if I understand correctly your argument about multiple comparisons in multilevel models, the situation in this comic might have been avoided if shrinkage toward the grand mean (of all colors) had prevented the greens from clearing the .05 threshold. Is that right?

5 0.12177591 1450 andrew gelman stats-2012-08-08-My upcoming talk for the data visualization meetup

Introduction: Somebody asked me to speak sometime at a data visualization meetup. I think I spoke there a year or two ago but I could do it again. Last time I spoke on Infovis vs Statistical Graphics, this time I could just go thru the choices involved in a few zillion graphs I’ve published over the years, to give a sense of the options and choices involved in graphical communication. For this talk there would be no single theme (except, perhaps, my usual “Graphs as comparisons,” “All of statistics as comparisons,” and “Exploratory data analysis as hypothesis testing”), just a bunch of open discussion about what I tried, why I tried it, what worked and what didn’t work, etc. I’ve discussed these sorts of decisions on occasion (and am now writing a paper with Yair about some of this for our voting models), but I’ve never tried to make a talk out of it before. Could be fun.

6 0.11847822 307 andrew gelman stats-2010-09-29-“Texting bans don’t reduce crashes; effects are slight crash increases”

7 0.11763693 424 andrew gelman stats-2010-11-21-Data cleaning tool!

8 0.11453504 1152 andrew gelman stats-2012-02-03-Web equation

9 0.1145028 181 andrew gelman stats-2010-08-03-MCMC in Python

10 0.10950348 869 andrew gelman stats-2011-08-24-Mister P in Stata

11 0.1040854 1197 andrew gelman stats-2012-03-04-“All Models are Right, Most are Useless”

12 0.10361481 1931 andrew gelman stats-2013-07-09-“Frontiers in Massive Data Analysis”

13 0.10232607 770 andrew gelman stats-2011-06-15-Still more Mr. P in public health

14 0.093493208 450 andrew gelman stats-2010-12-04-The Joy of Stats

15 0.092453793 1660 andrew gelman stats-2013-01-08-Bayesian, Permutable Symmetries

16 0.084235437 1026 andrew gelman stats-2011-11-25-Bayes wikipedia update

17 0.084079102 422 andrew gelman stats-2010-11-20-A Gapminder-like data visualization package

18 0.083294593 1668 andrew gelman stats-2013-01-11-My talk at the NY data visualization meetup this Monday!

19 0.080867253 624 andrew gelman stats-2011-03-22-A question about the economic benefits of universities

20 0.079572372 52 andrew gelman stats-2010-05-26-Intellectual property

similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.06), (1, -0.005), (2, -0.032), (3, 0.024), (4, 0.05), (5, -0.005), (6, -0.015), (7, -0.018), (8, 0.005), (9, -0.028), (10, 0.009), (11, -0.051), (12, 0.047), (13, 0.001), (14, -0.024), (15, 0.058), (16, -0.011), (17, -0.006), (18, -0.031), (19, -0.029), (20, -0.019), (21, 0.002), (22, 0.042), (23, -0.032), (24, -0.036), (25, -0.006), (26, 0.021), (27, -0.024), (28, 0.042), (29, -0.048), (30, -0.006), (31, 0.041), (32, 0.089), (33, -0.068), (34, 0.006), (35, 0.045), (36, 0.011), (37, -0.001), (38, 0.029), (39, 0.135), (40, 0.011), (41, 0.032), (42, 0.005), (43, 0.029), (44, -0.01), (45, 0.016), (46, -0.106), (47, 0.119), (48, 0.054), (49, 0.071)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96650016 290 andrew gelman stats-2010-09-22-Data Thief

Introduction: John Transue sends along a link to this software for extracting data from graphs. I haven’t tried it out but it could be useful to somebody out there?

2 0.87120891 357 andrew gelman stats-2010-10-20-Sas and R

Introduction: Xian sends along this link that might be of interest to some of you.

3 0.79815131 1514 andrew gelman stats-2012-09-28-AdviseStat 47% Campaign Ad

Introduction: Lee Wilkinson sends me this amusing ad for his new software, AdviseStat: The ad is a parody, but the software is real!

4 0.71705496 450 andrew gelman stats-2010-12-04-The Joy of Stats

Introduction: Hal Varian sends in this link to a series of educational videos described to be “a journey into the heart of statistics.” It seems to be focused on exploratory data analysis, which it describes as “an extraordinary new method of understanding ourselves and our Universe.”

5 0.71090192 587 andrew gelman stats-2011-02-24-5 seconds of every #1 pop single

Introduction: This is pretty amazing. Now I want to hear volume 3. Also is there a way to download this as I play it so I can listen when I’m offline? P.S. Typo in title fixed. P.P.S. I originally gave a different link but was led to the apparently more definitive link above (which allows direct download) from a commenter. Thanks!

6 0.70178747 869 andrew gelman stats-2011-08-24-Mister P in Stata

7 0.67299569 1257 andrew gelman stats-2012-04-10-Statisticians’ abbreviations are even less interesting than these!

8 0.65811366 1689 andrew gelman stats-2013-01-23-MLB Hall of Fame Voting Trajectories

9 0.65572447 1318 andrew gelman stats-2012-05-13-Stolen jokes

10 0.63759655 2066 andrew gelman stats-2013-10-17-G+ hangout for test run of BDA course

11 0.59350628 380 andrew gelman stats-2010-10-29-“Bluntly put . . .”

12 0.59318852 1433 andrew gelman stats-2012-07-28-LOL without the CATS

13 0.5877735 612 andrew gelman stats-2011-03-14-Uh-oh

14 0.58169991 1973 andrew gelman stats-2013-08-08-For chrissake, just make up an analysis already! We have a lab here to run, y’know?

15 0.57949257 734 andrew gelman stats-2011-05-28-Funniest comment ever

16 0.57240063 2028 andrew gelman stats-2013-09-17-Online conference for young statistics researchers

17 0.56604177 1667 andrew gelman stats-2013-01-10-When you SHARE poorly researched infographics…

18 0.56230551 681 andrew gelman stats-2011-04-26-Worst statistical graphic I have seen this year

19 0.55731648 1614 andrew gelman stats-2012-12-09-The pretty picture is just the beginning of the data exploration. But the pretty picture is a great way to get started. Another example of how a puzzle can make a graph appealing

20 0.55550122 347 andrew gelman stats-2010-10-17-Getting arm and lme4 running on the Mac

similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(16, 0.099), (21, 0.056), (24, 0.08), (86, 0.129), (88, 0.241), (99, 0.172)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.88848579 290 andrew gelman stats-2010-09-22-Data Thief

Introduction: John Transue sends along a link to this software for extracting data from graphs. I haven’t tried it out but it could be useful to somebody out there?

2 0.8175137 1174 andrew gelman stats-2012-02-18-Not as ugly as you look

Introduction: Kaiser asks the interesting question: How do you measure what restaurants are “overrated”? You can’t just ask people, right? There’s some sort of social element here, that “overrated” implies that someone’s out there doing the rating.

3 0.8088817 1098 andrew gelman stats-2012-01-04-Bayesian Page Rank?

Introduction: Loren Maxwell writes: I am trying to do some studies on the PageRank algorithm with applying a Bayesian technique. If you are not familiar with PageRank, it is the basis for how Google ranks their pages. It basically treats the internet as a large social network with each link conferring some value onto the page it links to. For example, if I had a webpage that had only one link to it, say from my friend’s webpage, then its PageRank would be dependent on my friend’s PageRank, presumably quite low. However, if the one link to my page was off the Google search page, then my PageRank would be quite high since there are undoubtedly millions of pages linking to Google and few pages that Google links to. The end result of the algorithm, however, is that all the PageRank values of the nodes in the network sum to one and the PageRank of a specific node is the probability that a “random surfer” will end up on that node. For example, in the attached spreadsheet, Column D shows e

4 0.7366637 1992 andrew gelman stats-2013-08-21-Workshop for Women in Machine Learning

Introduction: This might interest some of you: CALL FOR ABSTRACTS Workshop for Women in Machine Learning Co-located with NIPS 2013, Lake Tahoe, Nevada, USA December 5, 2013 http://www.wimlworkshop.org Deadline for abstract submissions: September 16, 2013 WORKSHOP DESCRIPTION The Workshop for Women in Machine Learning is a day-long event taking place on the first day of NIPS. The workshop aims to showcase the research of women in machine learning and to strengthen their community. The event brings together female faculty, graduate students, and research scientists for an opportunity to connect, exchange ideas, and learn from each other. Underrepresented minorities and undergraduates interested in pursuing machine learning research are encouraged to participate. While all presenters will be female, all genders are invited to attend. Scholarships will be provided to female students and postdoctoral attendees with accepted abstracts to partially offset travel costs. Workshop

5 0.72557563 136 andrew gelman stats-2010-07-09-Using ranks as numbers

Introduction: David Shor writes: I’m dealing with a situation where I have two datasets, one that assigns each participant a discrete score out of five for a set of particular traits (Dog behavior characteristics by breed), and another from an independent source that ranks each breed by each characteristic. It’s also possible to obtain the results of a survey, where experts were asked to rank 7 randomly picked breeds by characteristics. I’m interested in obtaining estimates for each trait, and intuitively, it seems clear that the second and third dataset provide a lot of information. But it’s unclear how to incorporate them to infer latent variables, since only sample ranks are observed. This seems like it is a common problem, do you have any suggestions? My quick answer is that you can treat ranks as numbers (a point we make somewhere in Bayesian Data Analysis, I believe) and just fit an item-response model from there. Val Johnson wrote an article on this in Jasa a few years ago, “Bayesia

6 0.71593517 1930 andrew gelman stats-2013-07-09-Symposium Magazine

7 0.71559221 569 andrew gelman stats-2011-02-12-Get the Data

8 0.71110451 1866 andrew gelman stats-2013-05-21-Recently in the sister blog

9 0.70851517 1507 andrew gelman stats-2012-09-22-Grade inflation: why weren’t the instructors all giving all A’s already??

10 0.69606704 825 andrew gelman stats-2011-07-27-Grade inflation: why weren’t the instructors all giving all A’s already??

11 0.69348335 629 andrew gelman stats-2011-03-26-Is it plausible that 1% of people pick a career based on their first name?

12 0.6931144 253 andrew gelman stats-2010-09-03-Gladwell vs Pinker

13 0.69281256 873 andrew gelman stats-2011-08-26-Luck or knowledge?

14 0.69260824 1403 andrew gelman stats-2012-07-02-Moving beyond hopeless graphics

15 0.69002616 400 andrew gelman stats-2010-11-08-Poli sci plagiarism update, and a note about the benefits of not caring

16 0.68960148 1547 andrew gelman stats-2012-10-25-College football, voting, and the law of large numbers

17 0.68661004 1633 andrew gelman stats-2012-12-21-Kahan on Pinker on politics

18 0.68552178 1718 andrew gelman stats-2013-02-11-Toward a framework for automatic model building

19 0.68312407 185 andrew gelman stats-2010-08-04-Why does anyone support private macroeconomic forecasts?

20 0.67749053 2095 andrew gelman stats-2013-11-09-Typo in Ghitza and Gelman MRP paper