
1538 andrew gelman stats-2012-10-17-Rust


meta info for this blog

Source: html

Introduction: I happened to be referring to the path sampling paper today and took a look at Appendix A.2: I’m sure I could reconstruct all of this if I had to, but I certainly can’t read this sort of thing cold anymore.


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 I happened to be referring to the path sampling paper today and took a look at Appendix A. [sent-1, score-1.57]

2 2: I’m sure I could reconstruct all of this if I had to, but I certainly can’t read this sort of thing cold anymore. [sent-2, score-1.493]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('reconstruct', 0.433), ('anymore', 0.364), ('appendix', 0.358), ('cold', 0.353), ('path', 0.282), ('referring', 0.282), ('sampling', 0.208), ('happened', 0.196), ('took', 0.19), ('today', 0.188), ('certainly', 0.181), ('read', 0.124), ('look', 0.124), ('sure', 0.114), ('sort', 0.11), ('thing', 0.108), ('paper', 0.1), ('could', 0.07)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1538 andrew gelman stats-2012-10-17-Rust

2 0.18371569 1583 andrew gelman stats-2012-11-19-I can’t read this interview with me

Introduction: From Alexandr Grigoryev: “Америка: «красная», «синяя» и «пурпурная».” Apparently my name is Эндрю Гелман. I had no idea that the Voice of America even existed anymore!

3 0.10964565 1714 andrew gelman stats-2013-02-09-Partial least squares path analysis

Introduction: Wayne Folta writes: I [Folta] was looking for R packages to address a project I’m working on and stumbled onto a package called ‘plspm’. It seems to be a nice package, but the thing I wanted to pass on is the PDF that Gaston Sanchez, its author, wrote that describes PLS Path Analysis in general and shows how to use plspm in particular. It’s like a 200-page R vignette that’s really informative and fun to read. I’d recommend it to you and your readers: even if you don’t want to delve into PLS and plspm deeply, the first seven pages and the Appendix A provide a great read about a grad student, PLS Path Analysis, and the history of the field. It’s written at a more popular level than you might like. For example, he says at one point: “A moderating effect is the fancy term that some authors use to say that there is a nosy variable M influencing the effect between an independent variable X and a dependent variable Y.” You would obviously never write anything like that [yup --- AG]

4 0.087181978 367 andrew gelman stats-2010-10-25-In today’s economy, the rich get richer

Introduction: I found a $5 bill on the street today.

5 0.083754301 2102 andrew gelman stats-2013-11-15-“Are all significant p-values created equal?”

Introduction: The answer is no, as explained in this classic article by Warren Browner and Thomas Newman from 1987. If I were to rewrite this article today, I would frame things slightly differently—referring to Type S and Type M errors rather than speaking of “the probability that the research hypothesis is true”—but overall they make good points, and I like their analogy to medical diagnostic testing.

6 0.082630798 1357 andrew gelman stats-2012-06-01-Halloween-Valentine’s update

7 0.081703551 1628 andrew gelman stats-2012-12-17-Statistics in a world where nothing is random

8 0.079681374 774 andrew gelman stats-2011-06-20-The pervasive twoishness of statistics; in particular, the “sampling distribution” and the “likelihood” are two different models, and that’s a good thing

9 0.079553731 2345 andrew gelman stats-2014-05-24-An interesting mosaic of a data programming course

10 0.073577166 1880 andrew gelman stats-2013-06-02-Flame bait

11 0.072006106 749 andrew gelman stats-2011-06-06-“Sampling: Design and Analysis”: a course for political science graduate students

12 0.071931824 1144 andrew gelman stats-2012-01-29-How many parameters are in a multilevel model?

13 0.069524765 770 andrew gelman stats-2011-06-15-Still more Mr. P in public health

14 0.067795493 1476 andrew gelman stats-2012-08-30-Stan is fast

15 0.063111201 373 andrew gelman stats-2010-10-27-It’s better than being forwarded the latest works of you-know-who

16 0.062284019 85 andrew gelman stats-2010-06-14-Prior distribution for design effects

17 0.061913636 523 andrew gelman stats-2011-01-18-Spam is out of control

18 0.060549907 107 andrew gelman stats-2010-06-24-PPS in Georgia

19 0.060228564 886 andrew gelman stats-2011-09-02-The new Helen DeWitt novel

20 0.059894748 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.067), (1, -0.017), (2, -0.012), (3, -0.01), (4, 0.01), (5, -0.004), (6, 0.018), (7, -0.019), (8, 0.021), (9, -0.025), (10, 0.029), (11, -0.021), (12, -0.007), (13, 0.006), (14, 0.015), (15, -0.031), (16, 0.006), (17, 0.017), (18, -0.003), (19, -0.014), (20, -0.019), (21, -0.018), (22, -0.005), (23, 0.005), (24, 0.013), (25, 0.014), (26, -0.028), (27, 0.032), (28, 0.034), (29, 0.028), (30, -0.006), (31, 0.015), (32, -0.015), (33, -0.0), (34, -0.039), (35, 0.01), (36, 0.021), (37, -0.006), (38, -0.041), (39, 0.032), (40, 0.028), (41, -0.004), (42, 0.002), (43, -0.025), (44, -0.024), (45, -0.027), (46, 0.009), (47, 0.004), (48, 0.032), (49, -0.008)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96301466 1538 andrew gelman stats-2012-10-17-Rust

2 0.68613452 5 andrew gelman stats-2010-04-27-Ethical and data-integrity problems in a study of mortality in Iraq

Introduction: Michael Spagat notifies me that his article criticizing the 2006 study of Burnham, Lafta, Doocy and Roberts has just been published. The Burnham et al. paper (also called, to my irritation (see the last item here), “the Lancet survey”) used a cluster sample to estimate the number of deaths in Iraq in the three years following the 2003 invasion. In his newly-published paper, Spagat writes: [The Spagat article] presents some evidence suggesting ethical violations to the survey’s respondents including endangerment, privacy breaches and violations in obtaining informed consent. Breaches of minimal disclosure standards examined include non-disclosure of the survey’s questionnaire, data-entry form, data matching anonymised interviewer identifications with households and sample design. The paper also presents some evidence relating to data fabrication and falsification, which falls into nine broad categories. This evidence suggests that this survey cannot be considered a reliable or

3 0.64279664 1628 andrew gelman stats-2012-12-17-Statistics in a world where nothing is random

Introduction: Rama Ganesan writes: I think I am having an existential crisis. I used to work with animals (rats, mice, gerbils etc.) Then I started to work in marketing research where we did have some kind of random sampling procedure. So up until a few years ago, I was sort of okay. Now I am teaching marketing research, and I feel like there is no real random sampling anymore. I take pains to get students to understand what random means, and then the whole lot of inferential statistics. Then almost anything they do – the sample is not random. They think I am contradicting myself. They use convenience samples at every turn – for their school work, and the enormous amount on online surveying that gets done. Do you have any suggestions for me? Other than say, something like this. My reply: Statistics does not require randomness. The three essential elements of statistics are measurement, comparison, and variation. Randomness is one way to supply variation, and it’s one way to model

4 0.64170474 107 andrew gelman stats-2010-06-24-PPS in Georgia

Introduction: Lucy Flynn writes: I’m working at a non-profit organization called CRRC in the Republic of Georgia. I’m having a methodological problem and I saw the syllabus for your sampling class online and thought I might be able to ask you about it? We do a lot of complex surveys nationwide; our typical sample design is as follows: - stratify by rural/urban/capital - sub-stratify the rural and urban strata into NE/NW/SE/SW geographic quadrants - select voting precincts as PSUs - select households as SSUs - select individual respondents as TSUs I’m relatively new here, and past practice has been to sample voting precincts with probability proportional to size. It’s desirable because it’s not logistically feasible for us to vary the number of interviews per precinct with precinct size, so it makes the selection probabilities for households more even across precinct sizes. However, I have a complex sampling textbook (Lohr 1999), and it explains how complex it is to calculate sel

5 0.62897199 1301 andrew gelman stats-2012-05-05-Related to z-statistics

Introduction: Pawel Sobkowicz writes: ‘How many zombies do you know?’ Using indirect survey methods to measure alien attacks and outbreaks of the undead, Arxiv preprint arXiv:1003.6087, 2010 I hope you would find interesting the following paper, recently posted on arXiv: Aliens on Earth. Are reports of close encounters correct?, arXiv:1203.6805 This is soooooo much better than getting links to bad graphs or to papers on sex ratios!

6 0.61914068 1120 andrew gelman stats-2012-01-15-Fun fight over the Grover search algorithm

7 0.61795509 1551 andrew gelman stats-2012-10-28-A convenience sample and selected treatments

8 0.60644066 984 andrew gelman stats-2011-11-01-David MacKay sez . . . 12??

9 0.60208005 405 andrew gelman stats-2010-11-10-Estimation from an out-of-date census

10 0.58769983 128 andrew gelman stats-2010-07-05-The greatest works of statistics never published

11 0.5864445 109 andrew gelman stats-2010-06-25-Classics of statistics

12 0.57622343 2233 andrew gelman stats-2014-03-04-Literal vs. rhetorical

13 0.57237184 1917 andrew gelman stats-2013-06-28-Econ coauthorship update

14 0.5709672 2191 andrew gelman stats-2014-01-29-“Questioning The Lancet, PLOS, And Other Surveys On Iraqi Deaths, An Interview With Univ. of London Professor Michael Spagat”

15 0.56684941 650 andrew gelman stats-2011-04-05-Monitor the efficiency of your Markov chain sampler using expected squared jumped distance!

16 0.56497455 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

17 0.55275685 1523 andrew gelman stats-2012-10-06-Comparing people from two surveys, one of which is a simple random sample and one of which is not

18 0.55264938 1865 andrew gelman stats-2013-05-20-What happened that the journal Psychological Science published a paper with no identifiable strengths?

19 0.54677963 2358 andrew gelman stats-2014-06-03-Did you buy laundry detergent on their most recent trip to the store? Also comments on scientific publication and yet another suggestion to do a study that allows within-person comparisons

20 0.54347795 1940 andrew gelman stats-2013-07-16-A poll that throws away data???


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(24, 0.161), (79, 0.376), (99, 0.26)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9179675 1538 andrew gelman stats-2012-10-17-Rust

2 0.89664596 469 andrew gelman stats-2010-12-16-2500 people living in a park in Chicago?

Introduction: Frank Hansen writes: Columbus Park is on Chicago’s west side, in the Austin neighborhood. The park is a big green area which includes a golf course. Here is the google satellite view. Here is the nytimes page. Go to Chicago, and zoom over to the census tract 2521, which is just north of the horizontal gray line (Eisenhower Expressway, aka I290) and just east of Oak Park. The park is labeled on the nytimes map. The census data have around 50 dots (they say 50 people per dot) in the park which has no residential buildings. Congressional district is Danny Davis, IL7. Here’s a map of the district. So, how do we explain the map showing ~50 dots worth of people living in the park. What’s up with the algorithm to place the dots? I dunno. I leave this one to you, the readers.

3 0.86767936 939 andrew gelman stats-2011-10-03-DBQQ rounding for labeling charts and communicating tolerances

Introduction: This is a mini research note, not deserving of a paper, but perhaps useful to others. It reinvents what has already appeared on this blog. Let’s say we have a line chart with numbers between 152.134 and 210.823, with the mean of 183.463. How should we label the chart with about 3 tics? Perhaps 152.132, 181.4785 and 210.823? Don’t do it! Objective is to fit about 3-7 tics at the optimal level of rounding. I use the following sequence: decimal rounding: fitting integer power and single-digit decimal i, rounding to i * 10^power (example: 100 200 300); binary: having power, fitting single-digit decimal i and binary b, rounding to 2*i/(1+b) * 10^power (150 200 250); (optional) quaternary: having power, fitting single-digit decimal i and quaternary q (0,1,2,3), round to 4*i/(1+q) * 10^power (150 175 200); quinary: having power, fitting single-digit decimal i and quinary f (0,1,2,3,4), round to 5*i/(1+f) * 10^power (160 180 200)

4 0.8550384 845 andrew gelman stats-2011-08-08-How adoption speed affects the abandonment of cultural tastes

Introduction: Interesting article by Jonah Berger and Gael Le Mens: Products, styles, and social movements often catch on and become popular, but little is known about why such identity-relevant cultural tastes and practices die out. We demonstrate that the velocity of adoption may affect abandonment: Analysis of over 100 years of data on first-name adoption in both France and the United States illustrates that cultural tastes that have been adopted quickly die faster (i.e., are less likely to persist). Mirroring this aggregate pattern, at the individual level, expecting parents are more hesitant to adopt names that recently experienced sharper increases in adoption. Further analysis indicate that these effects are driven by concerns about symbolic value: Fads are perceived negatively, so people avoid identity-relevant items with sharply increasing popularity because they believe that they will be short lived. Ancillary analyses also indicate that, in contrast to conventional wisdom, identity-r

5 0.84833336 1515 andrew gelman stats-2012-09-29-Jost Haidt

Introduction: Research psychologist John Jost reviews the recent book, “The Righteous Mind,” by research psychologist Jonathan Haidt. Some of my thoughts on Haidt’s book are here. And here’s some of Jost’s review: Haidt’s book is creative, interesting, and provocative. . . . The book shines a new light on moral psychology and presents a bold, confrontational message. From a scientific perspective, however, I worry that his theory raises more questions than it answers. Why do some individuals feel that it is morally good (or necessary) to obey authority, favor the ingroup, and maintain purity, whereas others are skeptical? (Perhaps parenting style is relevant after all.) Why do some people think that it is morally acceptable to judge or even mistreat others such as gay or lesbian couples or, only a generation ago, interracial couples because they dislike or feel disgusted by them, whereas others do not? Why does the present generation “care about violence toward many more classes of victims

6 0.81850982 1379 andrew gelman stats-2012-06-14-Cool-ass signal processing using Gaussian processes (birthdays again)

7 0.80821127 1126 andrew gelman stats-2012-01-18-Bob on Stan

8 0.76916003 1172 andrew gelman stats-2012-02-17-Rare name analysis and wealth convergence

9 0.76154196 1048 andrew gelman stats-2011-12-09-Maze generation algorithms!

10 0.75628698 863 andrew gelman stats-2011-08-21-Bad graph

11 0.75537503 2139 andrew gelman stats-2013-12-19-Happy birthday

12 0.75232089 1825 andrew gelman stats-2013-04-25-It’s binless! A program for computing normalizing functions

13 0.74634004 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions

14 0.72411084 399 andrew gelman stats-2010-11-07-Challenges of experimental design; also another rant on the practice of mentioning the publication of an article but not naming its author

15 0.72272032 1884 andrew gelman stats-2013-06-05-A story of fake-data checking being used to shoot down a flawed analysis at the Farm Credit Agency

16 0.70743799 1384 andrew gelman stats-2012-06-19-Slick time series decomposition of the birthdays data

17 0.70650643 2145 andrew gelman stats-2013-12-24-Estimating and summarizing inference for hierarchical variance parameters when the number of groups is small

18 0.69295156 1041 andrew gelman stats-2011-12-04-David MacKay and Occam’s Razor

19 0.66064173 1229 andrew gelman stats-2012-03-25-Same old story

20 0.65952802 63 andrew gelman stats-2010-06-02-The problem of overestimation of group-level variance parameters