1412 andrew gelman stats-2012-07-10-More questions on the contagion of obesity, height, etc.


meta infos for this blog

Source: html

Introduction: AT discusses [link broken; see P.P.S. below] a new paper of his that casts doubt on the robustness of the controversial Christakis and Fowler papers. AT writes that he ran some simulations of contagion on social networks and found that (a) in a simple model assuming the contagion of the sort hypothesized by Christakis and Fowler, their procedure would indeed give the sorts of estimates they found in their papers, but (b) in another simple model assuming a different sort of contagion, the C&F estimation would give indistinguishable estimates. Thus, if you believe AT’s simulation model, C&F’s procedure cannot statistically distinguish between two sorts of contagion (directional and simultaneous). I have not looked at AT’s paper so I can’t fully comment, but I don’t fully understand his method for simulating network connections. AT uses what he calls a “rewiring” model. This makes sense: as time progresses, we make new friends and lose old ones—but I am confused by the details


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 AT discusses [link broken; see P.P.S. below] a new paper of his that casts doubt on the robustness of the controversial Christakis and Fowler papers. [sent-4, score-0.233]

2 Thus, if you believe AT’s simulation model, C&F’s procedure cannot statistically distinguish between two sorts of contagion (directional and simultaneous). [sent-6, score-0.575]

3 I have not looked at AT’s paper so I can’t fully comment, but I don’t fully understand his method for simulating network connections. [sent-7, score-0.508]

4 This makes sense: as time progresses, we make new friends and lose old ones—but I am confused by the details (“First, some number of ties are assigned to have a new receiver, but the same sender; this changes the distribution of in-degree but keeps out-degree the same. [sent-9, score-0.502]

5 Then, some number of ties are assigned to have a new sender, but the same receiver, changing the out-degree but maintaining in-degree. [sent-10, score-0.448]

6 The interesting part is that, according to AT, C&F’s model would actually work if they were analyzing continuous data. [sent-12, score-0.08]

7 It is only with binary data that the nature of the contagion becomes non-identifiable. [sent-13, score-0.479]

8 I’ll leave it to others to pursue this further. [sent-14, score-0.178]

9 As AT writes, Christakis and Fowler’s work has spurred a lot of controversy, and I think a lot of this derives from the difficult nature of their data, in which only a very small fraction of the total network connections are ever observed. [sent-15, score-0.495]

10 But you can’t just sit there and leave the case unresolved. [sent-17, score-0.103]

11 Whatever happens with their claims, and whatever one might think about how their research claims have been presented in the press, I admire C&F for taking the big step of trying to learn about network effects from these unusual and sparse datasets. [sent-18, score-0.449]

12 It’s a lot better than one more analysis of the degree distribution of the scientific collaboration network, the fractal dimension of Wikipedia, or power laws anywhere. [sent-19, score-0.148]

13 In the first sentence, AT writes, “In stuffy academic fashion, I discuss . [sent-23, score-0.169]

14 ” I think that sort of apology is usually a bad idea. [sent-26, score-0.08]

15 If the academic “stuffiness” is appropriate, there’s no need to apologize. [sent-28, score-0.081]

16 AT refers to studies published by several other people but does not refer to them by name, instead merely mentioning “a pair of economists” and “a scathing review. [sent-30, score-0.072]

17 ” Why not mention names such as Hans Noel, Brendan Nyhan, Ethan Cohen-Cole, Jason Fletcher, and Russell Lyons? [sent-31, score-0.105]

18 At the very least, they’re more likely to notice their names and read your post. [sent-32, score-0.105]

19 AT says that for copyright reasons he had to take his post down. [sent-36, score-0.08]
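
The two rewiring steps quoted in sentences 4 and 5 can be made concrete with a small sketch. This is only one reading of the quoted description, not AT's actual code: the function name `rewire`, the uniform choice of new endpoints, and the rule of skipping swaps that would create self-loops or duplicate ties are all assumptions made for illustration.

```python
import random
from collections import Counter


def out_degree(edges):
    """Count ties sent by each node."""
    return Counter(sender for sender, _ in edges)


def in_degree(edges):
    """Count ties received by each node."""
    return Counter(receiver for _, receiver in edges)


def rewire(edges, n_nodes, n_receiver_swaps, n_sender_swaps, rng):
    """Return a rewired copy of a directed edge list [(sender, receiver), ...].

    Hypothetical helper: endpoints are drawn uniformly at random, which is
    an assumption and not something the quoted text specifies.
    """
    edges = list(edges)
    existing = set(edges)

    def swap(index, new_edge):
        # Skip proposals that would create a self-loop or a duplicate tie.
        if new_edge[0] == new_edge[1] or new_edge in existing:
            return
        existing.discard(edges[index])
        existing.add(new_edge)
        edges[index] = new_edge

    # Step 1: same sender, new receiver (out-degree kept, in-degree changes).
    for i in rng.sample(range(len(edges)), n_receiver_swaps):
        sender, _ = edges[i]
        swap(i, (sender, rng.randrange(n_nodes)))

    # Step 2: same receiver, new sender (in-degree kept, out-degree changes).
    for i in rng.sample(range(len(edges)), n_sender_swaps):
        _, receiver = edges[i]
        swap(i, (rng.randrange(n_nodes), receiver))

    return edges
```

Running only the first step leaves every node's out-degree unchanged, and running only the second leaves every in-degree unchanged, which matches the degree bookkeeping in the quoted passage.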


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('contagion', 0.391), ('network', 0.231), ('christakis', 0.231), ('fowler', 0.22), ('sender', 0.203), ('receiver', 0.203), ('ties', 0.157), ('assigned', 0.131), ('names', 0.105), ('procedure', 0.105), ('leave', 0.103), ('noel', 0.101), ('fletcher', 0.101), ('progresses', 0.101), ('fully', 0.1), ('assuming', 0.097), ('spurred', 0.096), ('ethan', 0.096), ('directional', 0.091), ('hans', 0.091), ('maintaining', 0.088), ('indistinguishable', 0.088), ('casts', 0.088), ('stuffy', 0.088), ('nature', 0.088), ('lyons', 0.086), ('simultaneous', 0.083), ('academic', 0.081), ('model', 0.08), ('copyright', 0.08), ('derives', 0.08), ('apology', 0.08), ('sorts', 0.079), ('hypothesized', 0.078), ('fractal', 0.078), ('simulating', 0.077), ('jason', 0.077), ('claims', 0.075), ('fashion', 0.075), ('pursue', 0.075), ('nyhan', 0.073), ('robustness', 0.073), ('whatever', 0.073), ('new', 0.072), ('brendan', 0.072), ('mentioning', 0.072), ('russell', 0.072), ('distribution', 0.07), ('broken', 0.07), ('sparse', 0.07)]
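
The page does not say how the word weights above or the simValue scores below were computed. A common approach, assumed here, is to weight each word by tf-idf, scale each post's vector to unit length, and score pairs of posts by cosine similarity. The sketch below uses only the standard library; the names `tfidf_vectors` and `cosine` and the smoothed idf formula are illustrative choices, not the site's actual pipeline.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into unit-length {word: weight} dicts.

    Assumes a smoothed idf, log((1 + n) / (1 + df)) + 1, so that words
    appearing in every document still get a nonzero weight.
    """
    n = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = {
            word: count * (math.log((1 + n) / (1 + doc_freq[word])) + 1)
            for word, count in counts.items()
        }
        # Guard against empty documents when normalizing.
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({word: w / norm for word, w in vec.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors (a dot product)."""
    return sum(weight * v.get(word, 0.0) for word, weight in u.items())
```

Under this assumption a post compared with itself scores 1, which is consistent with the same-blog row in the tfidf list below, and posts that share no vocabulary score 0.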

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1412 andrew gelman stats-2012-07-10-More questions on the contagion of obesity, height, etc.

2 0.36257049 756 andrew gelman stats-2011-06-10-Christakis-Fowler update

Introduction: After I posted on Russ Lyons’s criticisms of Nicholas Christakis and James Fowler’s work on social networks, several people emailed in with links to related articles. (Nobody wants to comment on the blog anymore; all I get is emails.) Here they are: Political scientists Hans Noel and Brendan Nyhan wrote a paper called “The ‘Unfriending’ Problem: The Consequences of Homophily in Friendship Retention for Causal Estimates of Social Influence” in which they argue that the Christakis-Fowler results are subject to bias because of patterns in the time course of friendships. Statisticians Cosma Shalizi and AT wrote a paper called “Homophily and Contagion Are Generically Confounded in Observational Social Network Studies” arguing that analyses such as those of Christakis and Fowler cannot hope to disentangle different sorts of network effects. And Christakis and Fowler reply to Noel and Nyhan, Shalizi and Thomas, Lyons, and others in an article that begins: H

3 0.28533092 757 andrew gelman stats-2011-06-10-Controversy over the Christakis-Fowler findings on the contagion of obesity

Introduction: Nicholas Christakis and James Fowler are famous for finding that obesity is contagious. Their claims, which have been received with both respect and skepticism (perhaps we need a new word for this: “respecticism”?) are based on analysis of data from the Framingham heart study, a large longitudinal public-health study that happened to have some social network data (for the odd reason that each participant was asked to provide the name of a friend who could help the researchers locate them if they were to move away during the study period). The short story is that if your close contact became obese, you were likely to become obese also. The long story is a debate about the reliability of this finding (that is, can it be explained by measurement error and sampling variability) and its causal implications. This sort of study is in my wheelhouse, as it were, but I have never looked at the Christakis-Fowler work in detail. Thus, my previous and current comments are more along the line

4 0.25544265 1699 andrew gelman stats-2013-01-31-Fowlerpalooza!

Introduction: Russ Lyons points us to a discussion in Statistics in Medicine of the famous claims by Christakis and Fowler on the contagion of obesity etc. James O’Malley and Christakis and Fowler present the positive case. Andrew Thomas and Tyler VanderWeele present constructive criticism. Christakis and Fowler reply. Coincidentally, a couple weeks ago an epidemiologist was explaining to me the differences between the Framingham Heart Study and the Nurses Health Study and why Framingham got the postmenopausal supplement risks right while Nurses got it wrong. P.S. The journal issue also includes a comment on “A distribution-free test of constant mean in linear mixed effects models.” Wow! I had no idea people still did this sort of thing. How horrible. But I guess that’s what half-life is all about. These ideas last forever, they just become less and less relevant to people.

5 0.24751133 2042 andrew gelman stats-2013-09-28-Difficulties of using statistical significance (or lack thereof) to sift through and compare research hypotheses

Introduction: Dean Eckles writes: Thought you might be interested in an example that touches on a couple recurring topics: 1. The difference between a statistically significant finding and one that is non-significant need not be itself statistically significant (thus highlighting the problems of using NHST to declare whether an effect exists or not). 2. Continued issues with the credibility of high profile studies of “social contagion”, especially by Christakis and Fowler. A new paper in Archives of Sexual Behavior produces observational estimates of peer effects in sexual behavior and same-sex attraction. In the text, the authors (who include C&F) make repeated comparisons of the results for peer effects in sexual intercourse and those for peer effects in same-sex attraction. However, the 95% CI for the latter actually includes the point estimate for the former! This is most clear in Figure 2, as highlighted by Real Clear Science’s blog post about the study. (Now because there is som

6 0.18287201 1952 andrew gelman stats-2013-07-23-Christakis response to my comment on his comments on social science (or just skip to the P.P.P.S. at the end)

7 0.12783709 1228 andrew gelman stats-2012-03-25-Continuous variables in Bayesian networks

8 0.12571603 1949 andrew gelman stats-2013-07-21-Defensive political science responds defensively to an attack on social science

9 0.10509109 1615 andrew gelman stats-2012-12-10-A defense of Tom Wolfe based on the impossibility of the law of small numbers in network structure

10 0.10408809 1865 andrew gelman stats-2013-05-20-What happened that the journal Psychological Science published a paper with no identifiable strengths?

11 0.10208594 1191 andrew gelman stats-2012-03-01-Hoe noem je?

12 0.10024992 706 andrew gelman stats-2011-05-11-The happiness gene: My bottom line (for now)

13 0.097852968 925 andrew gelman stats-2011-09-26-Ethnicity and Population Structure in Personal Naming Networks

14 0.087714806 2245 andrew gelman stats-2014-03-12-More on publishing in journals

15 0.085051566 2211 andrew gelman stats-2014-02-14-The popularity of certain baby names is falling off the clifffffffffffff

16 0.084933743 1392 andrew gelman stats-2012-06-26-Occam

17 0.084539317 1527 andrew gelman stats-2012-10-10-Another reason why you can get good inferences from a bad model

18 0.084160343 1708 andrew gelman stats-2013-02-05-Wouldn’t it be cool if Glenn Hubbard were consulting for Herbalife and I were on the other side?

19 0.08396212 702 andrew gelman stats-2011-05-09-“Discovered: the genetic secret of a happy life”

20 0.082567275 2006 andrew gelman stats-2013-09-03-Evaluating evidence from published research


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.195), (1, 0.005), (2, -0.019), (3, -0.039), (4, 0.001), (5, -0.032), (6, 0.025), (7, -0.035), (8, 0.026), (9, 0.028), (10, 0.018), (11, 0.028), (12, -0.023), (13, -0.009), (14, -0.041), (15, 0.039), (16, 0.008), (17, -0.015), (18, -0.001), (19, -0.007), (20, 0.007), (21, -0.034), (22, -0.022), (23, -0.022), (24, -0.003), (25, 0.005), (26, 0.004), (27, 0.002), (28, 0.03), (29, -0.025), (30, -0.025), (31, 0.006), (32, -0.025), (33, -0.048), (34, 0.09), (35, 0.028), (36, -0.023), (37, 0.05), (38, -0.021), (39, -0.025), (40, 0.038), (41, -0.014), (42, 0.013), (43, 0.002), (44, -0.004), (45, -0.014), (46, -0.046), (47, 0.016), (48, -0.018), (49, 0.005)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94242322 1412 andrew gelman stats-2012-07-10-More questions on the contagion of obesity, height, etc.

2 0.82252055 757 andrew gelman stats-2011-06-10-Controversy over the Christakis-Fowler findings on the contagion of obesity

Introduction: Nicholas Christakis and James Fowler are famous for finding that obesity is contagious. Their claims, which have been received with both respect and skepticism (perhaps we need a new word for this: “respecticism”?) are based on analysis of data from the Framingham heart study, a large longitudinal public-health study that happened to have some social network data (for the odd reason that each participant was asked to provide the name of a friend who could help the researchers locate them if they were to move away during the study period). The short story is that if your close contact became obese, you were likely to become obese also. The long story is a debate about the reliability of this finding (that is, can it be explained by measurement error and sampling variability) and its causal implications. This sort of study is in my wheelhouse, as it were, but I have never looked at the Christakis-Fowler work in detail. Thus, my previous and current comments are more along the line

3 0.7753666 302 andrew gelman stats-2010-09-28-This is a link to a news article about a scientific paper

Introduction: Somebody I know sent me a link to this news article by Martin Robbins describing a potential scientific breakthrough. I express some skepticism but in a vague enough way that, in the unlikely event that the research claim turns out to be correct, there’s no paper trail showing that I was wrong. I have some comments on the graphs–the tables are horrible, no need to even discuss them!–and I’d prefer if the authors of the paper could display their data and model on a single graph. I realize that their results reached a standard level of statistical significance, but it’s hard for me to interpret their claims until I see their estimates on some sort of direct real-world scale. In any case, though, I’m sure these researchers are working hard, and I wish them the best of luck in their future efforts to replicate their findings. I’m sure they’ll have no problem replicating, whether or not their claims are actually true. That’s the way science works: Once you know what you’re looking

4 0.77047151 976 andrew gelman stats-2011-10-27-Geophysicist Discovers Modeling Error (in Economics)

Introduction: Continuing “heckle the press” month here at the blog, I (Bob) found the following “discovery” a little overplayed by David H. Freedman, who was writing for Scientific American in the following article and blog post: Blog: Why Economic Models are Always Wrong Article: A Formula for Economic Calamity The article’s paywalled, but the blog entry isn’t. Apparently, a geophysicist named Jonathan Carter (good luck finding him on the web given only that information) found that when he simulated from a complicated model, then fit the model to the simulated data, he sometimes got different results. What’s more, these differing estimates fit the data equally well but made different predictions on new data. Now we don’t know if the model was identifiable, had different local optima (i.e., multiple modes), how he fit the data, or really anything, but it doesn’t really matter. Reading the comments and article is a depressing exercise in the sociology of science, with clueles

5 0.76685119 2361 andrew gelman stats-2014-06-06-Hurricanes vs. Himmicanes

Introduction: The story’s on the sister blog and I quote liberally from Jeremy Freese, who wrote: The authors have issued a statement that argues against some criticisms of their study that others have offered. These are irrelevant to the above observations, as I [Freese] am taking everything about the measurement and model specification at their word–my starting point is the model that fully replicates the analyses that they themselves published. A qualification is that one of their comments is that they deny they are making any claims about the importance of other factors that kill people in hurricanes. But they are. If you claim that 27 out of the 42 deaths in Hurricane Eloise would have been prevented if it was named Hurricane Charley, that is indeed a claim that diminishes the potential importance of other causes of deaths in that hurricane. Freese also raises an important general issue in science communication: The authors’ university issued a press release with a dramatic prese

6 0.7638495 2236 andrew gelman stats-2014-03-07-Selection bias in the reporting of shaky research

7 0.76333404 2042 andrew gelman stats-2013-09-28-Difficulties of using statistical significance (or lack thereof) to sift through and compare research hypotheses

8 0.76030123 706 andrew gelman stats-2011-05-11-The happiness gene: My bottom line (for now)

9 0.75109375 1949 andrew gelman stats-2013-07-21-Defensive political science responds defensively to an attack on social science

10 0.75047809 2355 andrew gelman stats-2014-05-31-Jessica Tracy and Alec Beall (authors of the fertile-women-wear-pink study) comment on our Garden of Forking Paths paper, and I comment on their comments

11 0.74902052 2136 andrew gelman stats-2013-12-16-Whither the “bet on sparsity principle” in a nonsparse world?

12 0.74807441 1742 andrew gelman stats-2013-02-27-What is “explanation”?

13 0.73586589 2218 andrew gelman stats-2014-02-20-Do differences between biology and statistics explain some of our diverging attitudes regarding criticism and replication of scientific claims?

14 0.72595578 2241 andrew gelman stats-2014-03-10-Preregistration: what’s in it for you?

15 0.72543252 1952 andrew gelman stats-2013-07-23-Christakis response to my comment on his comments on social science (or just skip to the P.P.P.S. at the end)

16 0.72310686 1683 andrew gelman stats-2013-01-19-“Confirmation, on the other hand, is not sexy”

17 0.72218674 1968 andrew gelman stats-2013-08-05-Evidence on the impact of sustained use of polynomial regression on causal inference (a claim that coal heating is reducing lifespan by 5 years for half a billion people)

18 0.72096372 1690 andrew gelman stats-2013-01-23-When are complicated models helpful in psychology research and when are they overkill?

19 0.72064245 1680 andrew gelman stats-2013-01-18-“If scientists wrote horoscopes, this is what yours would say”

20 0.72021127 2018 andrew gelman stats-2013-09-12-Do you ever have that I-just-fit-a-model feeling?


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(6, 0.017), (9, 0.155), (15, 0.022), (16, 0.09), (21, 0.018), (24, 0.09), (42, 0.017), (84, 0.023), (86, 0.013), (99, 0.429)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.99189061 1110 andrew gelman stats-2012-01-10-Jobs in statistics research! In New Jersey!

Introduction: Kenny writes: The Statistics Research group in AT&T Labs invites applications for full time research positions. Applicants should have a Ph.D. in Statistics (or a related field), and be able to make major, widely-recognized contributions to statistics research: theory, methods, computing, and data analysis. Candidates must demonstrate a potential for excellence in research, a knowledge of fundamental statistical theory, a collaborative spirit, and strong communication skills. We are especially interested in statisticians who are interested in and capable of working on large scale data problems. A non-exclusive list of statistical fields we are interested in includes: data mining, statistical computing, forecasting, time series, spatial statistics, social networks, machine learning, and Bayesian analysis. We invite applications from both new and experienced Ph.Ds, and women and underrepresented minorities are especially encouraged to apply. AT&T Companies are Equal Opportunity Emp

2 0.9903717 1226 andrew gelman stats-2012-03-22-Story time meets the all-else-equal fallacy and the fallacy of measurement

Introduction: Alex Tabarrok with a good catch: In Why Don’t Women Patent?, a recent NBER paper, Jennifer Hunt et al. [Jean-Philippe Garant, Hannah Herman, and David Munroe] present a stark fact: Only 5.5% of the holders of commercialized patents are women. One might think that this is explained by the relative lack of women with science and engineering degrees but Hunt et al. find that “women with such a degree are scarcely more likely to patent than women without.” Instead, most of the difference is “accounted for by differences among those with a science or engineering degree” especially the fact that women are underrepresented in patent-intensive fields such as electrical and mechanical engineering and in development and design. Predictably, the authors do not ask why women might self-select into non-patent-intensive fields, perhaps because this would require at least a discussion of politically incorrect questions. The failure to investigate these questions leads to some dubious co

3 0.98471689 1142 andrew gelman stats-2012-01-29-Difficulties with the 1-4-power transformation

Introduction: John Hayes writes: I am a fan of the quarter root transform ever since reading about it on your blog. However, today my student and I hit a wall that I’m hoping you might have some insight on. By training, I am a psychophysicist (think SS Stevens), and people in my field often log transform data prior to analysis. However, this data frequently contains zeros, so I’ve tried using quarter root transforms to get around this. But until today, I had never tried to back transform the plot axis for readability. I assumed this would be straightforward – alas it is not. Specifically, we quarter root transformed our data, performed an ANOVA, got what we thought was a reasonable effect, and then plotted the data. So far so good. However, the LS means in question are below 1, meaning that raising them to the 4th power just makes them smaller, and uninterpretable in the original metric. Do you have any thoughts or insights you might share? My reply: I don’t see the problem with pre

4 0.98392481 1715 andrew gelman stats-2013-02-09-Thomas Hobbes would be spinning in his grave

Introduction: A few years ago I watched a bunch of Speed Racer cartoons with Phil in a movie theater in the early 90s. These were low-budget Japanese cartoons from the 60s that we loved as kids. From my adult perspective, the best parts were during the characters’ long drives, where you could see Japanese industrial scenes in the background. Similarly, sometimes the most interesting aspect of a book or article is not its overt content but rather its unexamined assumptions. I was reminded of this today when reading the Times this morning. In an interesting column reviewing recent research on happiness (marred only by his decision not to interview any psychology researchers; after all, they’re the academic experts on the topic), Adam Davidson writes: So much debate about government policy is based on economic statistics that come out of the market. But the goal of government is not just to maximize revenue. This perked me up. Not just to maximize revenue? This has to be a sign of the

5 0.98261708 29 andrew gelman stats-2010-05-12-Probability of successive wins in baseball

Introduction: Dan Goldstein did an informal study asking people the following question: When two baseball teams play each other on two consecutive days, what is the probability that the winner of the first game will be the winner of the second game? You can make your own guess and then continue reading below. Dan writes: We asked two colleagues knowledgeable in baseball and the mathematics of forecasting. The answers came in between 65% and 70%. The true answer [based on Dan's analysis of a database of baseball games]: 51.3%, a little better than a coin toss. I have to say, I’m surprised his colleagues gave such extreme guesses. I was guessing something like 50%, myself, based on the following very crude reasoning: Suppose two unequal teams are playing, and the chance of team A beating team B is 55%. (This seems like a reasonable average of all matchups, which will include some more extreme disparities but also many more equal contests.) Then the chance of the same team

6 0.98078656 640 andrew gelman stats-2011-03-31-Why Edit Wikipedia?

same-blog 7 0.97895712 1412 andrew gelman stats-2012-07-10-More questions on the contagion of obesity, height, etc.

8 0.97756755 389 andrew gelman stats-2010-11-01-Why it can be rational to vote

9 0.97756678 1565 andrew gelman stats-2012-11-06-Why it can be rational to vote

10 0.97526443 2199 andrew gelman stats-2014-02-04-Widening the goalposts in medical trials

11 0.97184759 1532 andrew gelman stats-2012-10-13-A real-life dollar auction game!

12 0.97000873 560 andrew gelman stats-2011-02-06-Education and Poverty

13 0.9692843 369 andrew gelman stats-2010-10-25-Misunderstanding of divided government

14 0.9672128 1961 andrew gelman stats-2013-07-29-Postdocs in probabilistic modeling! With David Blei! And Stan!

15 0.96712118 675 andrew gelman stats-2011-04-22-Arrow’s other theorem

16 0.96505994 571 andrew gelman stats-2011-02-13-A departmental wiki page?

17 0.96473795 1358 andrew gelman stats-2012-06-01-Question 22 of my final exam for Design and Analysis of Sample Surveys

18 0.96278638 1424 andrew gelman stats-2012-07-22-Extreme events as evidence for differences in distributions

19 0.9622857 2113 andrew gelman stats-2013-11-25-Postdoc position on psychometrics and network modeling

20 0.96183187 2226 andrew gelman stats-2014-02-26-Econometrics, political science, epidemiology, etc.: Don’t model the probability of a discrete outcome, model the underlying continuous variable