
303 andrew gelman stats-2010-09-28-“Genomics” vs. genetics


meta info for this blog

Source: html

Introduction: John Cook and Joseph Delaney point to an article by Yurii Aulchenko et al., who write: 54 loci showing strong statistical evidence for association to human height were described, providing us with potential genomic means of human height prediction. In a population-based study of 5748 people, we find that a 54-loci genomic profile explained 4-6% of the sex- and age-adjusted height variance, and had limited ability to discriminate tall/short people. . . . In a family-based study of 550 people, with both parents having height measurements, we find that the Galtonian mid-parental prediction method explained 40% of the sex- and age-adjusted height variance, and showed high discriminative accuracy. . . . The message is that the simple approach of predicting child’s height using a regression model given parents’ average height performs much better than the method they have based on combining 54 genes. They also find that, if you start with the prediction based on parents’ heigh


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 John Cook and Joseph Delaney point to an article by Yurii Aulchenko et al. [sent-1, score-0.062]

2 , who write: 54 loci showing strong statistical evidence for association to human height were described, providing us with potential genomic means of human height prediction. [sent-2, score-1.633]

3 In a population-based study of 5748 people, we find that a 54-loci genomic profile explained 4-6% of the sex- and age-adjusted height variance, and had limited ability to discriminate tall/short people. [sent-3, score-1.588]

4 In a family-based study of 550 people, with both parents having height measurements, we find that the Galtonian mid-parental prediction method explained 40% of the sex- and age-adjusted height variance, and showed high discriminative accuracy. [sent-7, score-1.853]

5 The message is that the simple approach of predicting child’s height using a regression model given parents’ average height performs much better than the method they have based on combining 54 genes. [sent-11, score-1.354]

6 They also find that, if you start with the prediction based on parents’ heights and then throw the genetic profile information into the model, you can do better–but not much better. [sent-12, score-0.888]

7 Parents’ height + genetic profile is only very slightly better, as a predictor, than parents’ height alone. [sent-13, score-1.722]

8 The most important point, I think, is that made by Delaney: The predictive power of parents’ heights on child height is, presumably, itself mostly genetic in this population. [sent-16, score-0.96]

9 Thus, the correct interpretation of the study is not that genetics doesn’t predict height, but that the particular technique described in the paper doesn’t work well. [sent-17, score-0.288]

10 Galton’s predictor also uses a combination of genes. [sent-18, score-0.169]

11 How exactly did the researchers combine those 54 genes to get their predictor? [sent-20, score-0.155]

12 Here’s what they write: The genomic profile, based on 54 recently identified loci, was computed as the sum of the number of height-increasing alleles carried by a person, similar to Weedon et al. [sent-22, score-0.669]

13 8% of the sex- and age-adjusted variation of height in the Rotterdam Study (Figure 2a). [sent-24, score-0.565]

14 We also estimated the upper explanatory limit of the 54-loci allelic profile by defining the profile as a weighted sum of height-increasing alleles, with weights proportional to the effects estimated in our own data using a multivariable model. [sent-25, score-1.271]

15 Is it possible that a savvier use of this genetic information could give a much better predictor? [sent-26, score-0.257]

16 The 5748 people in the study come from “a prospective cohort study that started in 1990 in Ommoord, a suburb of Rotterdam, among 10 994 men and women aged 55 and over. [sent-29, score-0.49]

17 But maybe things would look different if they were studying a more diverse group. [sent-32, score-0.047]
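Sentences 12 and 14 above describe two versions of the genomic profile: an unweighted count of height-increasing alleles and a weighted sum. A minimal Python sketch of both, together with the share of variance explained by a one-predictor regression, might look like the following. The function names, the 0/1/2 genotype coding and the toy numbers are assumptions made for illustration; none of them come from the paper.

```python
from typing import Sequence


def allele_count_score(genotype: Sequence[int]) -> int:
    """Unweighted profile: total number of height-increasing alleles.

    Assumes each locus is coded as 0, 1 or 2 copies of the
    height-increasing allele (a coding chosen for this sketch).
    """
    if any(g not in (0, 1, 2) for g in genotype):
        raise ValueError("each locus must be coded 0, 1 or 2")
    return sum(genotype)


def weighted_score(genotype: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted profile: allele counts multiplied by per-locus weights."""
    if len(genotype) != len(weights):
        raise ValueError("genotype and weights must have the same length")
    return sum(g * w for g, w in zip(genotype, weights))


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Share of the variance of y explained by a simple linear regression on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Toy data: four loci for one hypothetical person, with made-up weights.
person = [0, 1, 2, 1]
effect_weights = [0.5, 0.2, 0.1, 0.3]
print(allele_count_score(person))
print(weighted_score(person, effect_weights))
```

With real data, `r_squared` would be applied to a vector of profile scores (or mid-parental heights) and a vector of adjusted heights, which is the comparison the quoted 4-6% and 40% figures summarize.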


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('height', 0.565), ('profile', 0.399), ('parents', 0.237), ('genomic', 0.221), ('genetic', 0.193), ('predictor', 0.169), ('loci', 0.162), ('rotterdam', 0.162), ('alleles', 0.133), ('study', 0.123), ('discriminate', 0.116), ('heights', 0.116), ('delaney', 0.112), ('explained', 0.11), ('genes', 0.107), ('sum', 0.098), ('child', 0.086), ('discriminative', 0.074), ('homogeneous', 0.074), ('suburb', 0.074), ('prediction', 0.071), ('galton', 0.07), ('described', 0.068), ('variance', 0.067), ('estimated', 0.065), ('better', 0.064), ('et', 0.062), ('aged', 0.062), ('human', 0.06), ('based', 0.055), ('prospective', 0.054), ('cohort', 0.054), ('explanatory', 0.054), ('method', 0.054), ('find', 0.054), ('carried', 0.052), ('performs', 0.051), ('weighted', 0.05), ('genetics', 0.049), ('mon', 0.049), ('combine', 0.048), ('proportional', 0.048), ('technique', 0.048), ('computed', 0.048), ('defining', 0.047), ('diverse', 0.047), ('cook', 0.047), ('weights', 0.046), ('cite', 0.046), ('joseph', 0.045)]
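The page does not say how the similarity values in the lists below are computed from these word weights. One common choice is cosine similarity between sparse tfidf vectors, and the sketch below assumes that choice. The `other_post` weights are invented for the example.

```python
import math
from typing import Dict


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse word-weight vectors."""
    dot = sum(w * b[word] for word, w in a.items() if word in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# First three weights copied from the list above.
this_post = {"height": 0.565, "profile": 0.399, "parents": 0.237}
# Hypothetical weights for some other post, invented for the example.
other_post = {"height": 0.300, "regression": 0.500}

print(cosine_similarity(this_post, this_post))
print(cosine_similarity(this_post, other_post))
```

Under this assumption a post compared with itself scores 1 up to rounding, which would be consistent with the same-blog value just below 1 in the tfidf list.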

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999976 303 andrew gelman stats-2010-09-28-“Genomics” vs. genetics


2 0.29535139 2204 andrew gelman stats-2014-02-09-Keli Liu and Xiao-Li Meng on Simpson’s paradox

Introduction: XL sent me this paper, “A Fruitful Resolution to Simpson’s Paradox via Multi-Resolution Inference.” I told Keli and Xiao-Li that I wasn’t sure I fully understood the paper—as usual, XL is subtle and sophisticated, also I only get about half of his jokes—but I sent along these thoughts: 1. I do not think counterfactuals or potential outcomes are necessary for Simpson’s paradox. I say this because one can set up Simpson’s paradox with variables that cannot be manipulated, or for which manipulations are not directly of interest. 2. Simpson’s paradox is part of a more general issue that regression coefs change if you add more predictors, the flipping of sign is not really necessary. Here’s an example that I use in my teaching that illustrates both points: I can run a regression predicting income from sex and height. I find that the coef of sex is $10,000 (i.e., comparing a man and woman of the same height, on average the man will make $10,000 more) and the coefficient of h

3 0.16390534 810 andrew gelman stats-2011-07-20-Adding more information can make the variance go up (depending on your model)

Introduction: Andy McKenzie writes: In their March 9 “counterpoint” in nature biotech to the prospect that we should try to integrate more sources of data in clinical practice (see “point” arguing for this), Isaac Kohane and David Margulies claim that, “Finally, how much better is our new knowledge than older knowledge? When is the incremental benefit of a genomic variant(s) or gene expression profile relative to a family history or classic histopathology insufficient and when does it add rather than subtract variance?” Perhaps I am mistaken (thus this email), but it seems that this claim runs contra to the definition of conditional probability. That is, if you have a hierarchical model, and the family history / classical histopathology already suggests a parameter estimate with some variance, how could the new genomic info possibly increase the variance of that parameter estimate? Surely the question is how much variance the new genomic info reduces and whether it therefore justifies t

4 0.14319094 370 andrew gelman stats-2010-10-25-Who gets wedding announcements in the Times?

Introduction: I was flipping through the paper yesterday and noticed something which I think is a bit of innumeracy–although I don’t have all the facts at my disposal so I can’t be sure. It came in an item by Robert Woletz, society editor of the New York Times, in response to the following letter from Max Sarinsky (click here and scroll down): The heavy majority of couples typically featured in the Sunday wedding announcements either attended elite universities, hold corporate management positions or have parents with corporate management positions. It’s nice to learn about the nuptials of the privileged, but Times readers would benefit from learning about a more representative sampling of weddings in our diverse city. I [Sarinsky] am curious as to how editors select which announcements to publish, and why editors don’t make a sustained effort to include different types of couples. Woletz replied: The Weddings/Celebrations pages are truly open to everyone, and The Times persistentl

5 0.14122996 1762 andrew gelman stats-2013-03-13-“I have no idea who Catalina Garcia is, but she makes a decent ruler”: I don’t know if John Lee “little twerp” Anderson actually suffers from tall-person syndrome, but he is indeed tall

Introduction: I just want to share with you the best comment we’ve ever had in the nearly ten-year history of this blog. Also it has statistical content! Here’s the story. After seeing an amusing article by Tom Scocca relating how reporter John Lee Anderson called someone a “little twerp” on twitter: I conjectured that Anderson suffered from “tall person syndrome,” that problem that some people of above-average height have, that they think they’re more important than other people because they literally look down on them. But I had no idea of Anderson’s actual height. Commenter Gary responded with this impressive bit of investigative reporting: Based on this picture: he appears to be fairly tall. But the perspective makes it hard to judge. Based on this picture: he appears to be about 9-10 inches taller than Catalina Garcia. But how tall is Catalina Garcia? Not that tall – she’s shorter than the high-wire artist Phillipe Petit: And he doesn’t appear

6 0.13875072 706 andrew gelman stats-2011-05-11-The happiness gene: My bottom line (for now)

7 0.12866005 1015 andrew gelman stats-2011-11-17-Good examples of lurking variables?

8 0.11367476 702 andrew gelman stats-2011-05-09-“Discovered: the genetic secret of a happy life”

9 0.11204363 965 andrew gelman stats-2011-10-19-Web-friendly visualizations in R

10 0.11157394 2141 andrew gelman stats-2013-12-20-Don’t douthat, man! Please give this fallacy a name.

11 0.10955797 2121 andrew gelman stats-2013-12-02-Should personal genetic testing be regulated? Battle of the blogroll

12 0.10736363 1196 andrew gelman stats-2012-03-04-Piss-poor monocausal social science

13 0.099672593 561 andrew gelman stats-2011-02-06-Poverty, educational performance – and can be done about it

14 0.097317927 1107 andrew gelman stats-2012-01-08-More on essentialism

15 0.092392482 1759 andrew gelman stats-2013-03-12-How tall is Jon Lee Anderson?

16 0.09230753 1305 andrew gelman stats-2012-05-07-Happy news on happiness; what can we believe?

17 0.091341652 2250 andrew gelman stats-2014-03-16-“I have no idea who Catalina Garcia is, but she makes a decent ruler”

18 0.090305045 1367 andrew gelman stats-2012-06-05-Question 26 of my final exam for Design and Analysis of Sample Surveys

19 0.08539369 2258 andrew gelman stats-2014-03-21-Random matrices in the news

20 0.084682815 1955 andrew gelman stats-2013-07-25-Bayes-respecting experimental design and other things


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.126), (1, 0.017), (2, 0.058), (3, -0.062), (4, 0.025), (5, -0.016), (6, 0.008), (7, -0.003), (8, 0.005), (9, 0.021), (10, -0.026), (11, 0.031), (12, 0.006), (13, 0.006), (14, 0.014), (15, 0.028), (16, 0.056), (17, 0.005), (18, -0.024), (19, 0.0), (20, -0.04), (21, -0.016), (22, 0.009), (23, -0.021), (24, 0.052), (25, 0.016), (26, -0.005), (27, -0.009), (28, 0.013), (29, -0.022), (30, 0.015), (31, 0.04), (32, 0.026), (33, 0.006), (34, 0.055), (35, 0.042), (36, 0.04), (37, 0.021), (38, -0.043), (39, -0.051), (40, 0.039), (41, -0.017), (42, 0.021), (43, 0.016), (44, -0.017), (45, -0.028), (46, 0.003), (47, 0.037), (48, 0.076), (49, 0.036)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95373929 303 andrew gelman stats-2010-09-28-“Genomics” vs. genetics


2 0.78004885 561 andrew gelman stats-2011-02-06-Poverty, educational performance – and can be done about it

Introduction: Andrew has pointed to Jonathan Livengood’s analysis of the correlation between poverty and PISA results, whereby schools with poorer students get poorer test results. I’d have written a comment, but then I couldn’t have inserted a chart. Andrew points out that a causal analysis is needed. This reminds me of an intervention that has been done before: take a child out of poverty, and bring him up in a better-off family. What’s going to happen? There have been several studies examining correlations between adoptive and biological parents’ IQ (assuming IQ is a test analogous to the math and verbal tests, and that parent IQ is analogous to the quality of instruction – but the point is in the analysis not in the metric). This is the result (from Adoption Strategies by Robin P Corley in Encyclopedia of Life Sciences): So, while it did make a difference at an early age, with increasing age of the adopted child, the intelligence of adoptive parents might not be making any difference

3 0.73922163 1114 andrew gelman stats-2012-01-12-Controversy about average personality differences between men and women

Introduction: Blogger Echidne pointed me to a recent article, “The Distance Between Mars and Venus: Measuring Global Sex Differences in Personality,” by Marco Del Giudice, Tom Booth, and Paul Irwing, who find: Sex differences in personality are believed to be comparatively small. However, research in this area has suffered from significant methodological limitations. We advance a set of guidelines for overcoming those limitations: (a) measure personality with a higher resolution than that afforded by the Big Five; (b) estimate sex differences on latent factors; and (c) assess global sex differences with multivariate effect sizes. . . . We found a global effect size D = 2.71, corresponding to an overlap of only 10% between the male and female distributions. Even excluding the factor showing the largest univariate ES [effect size], the global effect size was D = 1.71 (24% overlap). Echidne quotes a news article in which one of the study’s authors goes overboard: “Psychologically, men a

4 0.70969599 2204 andrew gelman stats-2014-02-09-Keli Liu and Xiao-Li Meng on Simpson’s paradox

Introduction: XL sent me this paper, “A Fruitful Resolution to Simpson’s Paradox via Multi-Resolution Inference.” I told Keli and Xiao-Li that I wasn’t sure I fully understood the paper—as usual, XL is subtle and sophisticated, also I only get about half of his jokes—but I sent along these thoughts: 1. I do not think counterfactuals or potential outcomes are necessary for Simpson’s paradox. I say this because one can set up Simpson’s paradox with variables that cannot be manipulated, or for which manipulations are not directly of interest. 2. Simpson’s paradox is part of a more general issue that regression coefs change if you add more predictors, the flipping of sign is not really necessary. Here’s an example that I use in my teaching that illustrates both points: I can run a regression predicting income from sex and height. I find that the coef of sex is $10,000 (i.e., comparing a man and woman of the same height, on average the man will make $10,000 more) and the coefficient of h

5 0.70072865 1184 andrew gelman stats-2012-02-25-Facebook Profiles as Predictors of Job Performance? Maybe…but not yet.

Introduction: Eric Loken explains: Some newspapers and radio stations recently picked up a story that Facebook profiles can be revealing, and can yield information more predictive of job performance than typical self-report personality questionnaires or even an IQ test. . . . A most consistent finding from the last 50 years of organizational psychology research is that cognitive ability is the strongest predictor of job performance, sometimes followed closely by measures of conscientiousness (and recently there has been interest in perseverance or grit). So has the Facebook study upended all this established research? Not at all, and the reason lies in the enormous gap between the claims about the study’s outcomes, and the details of what was actually done. The researchers had two college population samples. In Study 1 they had job performance ratings for the part-time college jobs of about 10% of the original sample. But in study 1 they did not have any IQ or cognitive ability measure.

6 0.68991816 1305 andrew gelman stats-2012-05-07-Happy news on happiness; what can we believe?

7 0.67956883 1918 andrew gelman stats-2013-06-29-Going negative

8 0.67714953 1128 andrew gelman stats-2012-01-19-Sharon Begley: Worse than Stephen Jay Gould?

9 0.67325187 2193 andrew gelman stats-2014-01-31-Into the thicket of variation: More on the political orientations of parents of sons and daughters, and a return to the tradeoff between internal and external validity in design and interpretation of research studies

10 0.65725911 1910 andrew gelman stats-2013-06-22-Struggles over the criticism of the “cannabis users and IQ change” paper

11 0.65437734 2156 andrew gelman stats-2014-01-01-“Though They May Be Unaware, Newlyweds Implicitly Know Whether Their Marriage Will Be Satisfying”

12 0.65084374 845 andrew gelman stats-2011-08-08-How adoption speed affects the abandonment of cultural tastes

13 0.63867539 257 andrew gelman stats-2010-09-04-Question about standard range for social science correlations

14 0.63786346 2030 andrew gelman stats-2013-09-19-Is coffee a killer? I don’t think the effect is as high as was estimated from the highest number that came out of a noisy study

15 0.62871182 301 andrew gelman stats-2010-09-28-Correlation, prediction, variation, etc.

16 0.6272015 161 andrew gelman stats-2010-07-24-Differences in color perception by sex, also the Bechdel test for women in movies

17 0.62429452 706 andrew gelman stats-2011-05-11-The happiness gene: My bottom line (for now)

18 0.62298715 1086 andrew gelman stats-2011-12-27-The most dangerous jobs in America

19 0.6207726 702 andrew gelman stats-2011-05-09-“Discovered: the genetic secret of a happy life”

20 0.61992115 2336 andrew gelman stats-2014-05-16-How much can we learn about individual-level causal claims from state-level correlations?


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.075), (13, 0.016), (16, 0.062), (21, 0.01), (24, 0.102), (41, 0.291), (55, 0.032), (77, 0.012), (95, 0.011), (99, 0.251)]
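The lda vector above is stored sparsely, as (topicId, topicWeight) pairs, with only the topics that carry weight listed. A small sketch of how such a list can be ranked to find the dominant topics follows; `top_topics` is a hypothetical helper written for this example, not part of whatever tool produced the page.

```python
from typing import List, Tuple


def top_topics(pairs: List[Tuple[int, float]], k: int = 3) -> List[Tuple[int, float]]:
    """Return the k (topicId, topicWeight) pairs with the largest weights."""
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:k]


# Weights copied from the lda list above.
lda_weights = [(2, 0.075), (13, 0.016), (16, 0.062), (21, 0.01), (24, 0.102),
               (41, 0.291), (55, 0.032), (77, 0.012), (95, 0.011), (99, 0.251)]

print(top_topics(lda_weights))  # [(41, 0.291), (99, 0.251), (24, 0.102)]
```

Topics 41 and 99 together carry more than half of the listed weight, so posts that score high on the lda list are likely those that load on the same two topics.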

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.89945918 303 andrew gelman stats-2010-09-28-“Genomics” vs. genetics


2 0.86415207 1626 andrew gelman stats-2012-12-16-The lamest, grudgingest, non-retraction retraction ever

Introduction: In politics we’re familiar with the non-apology apology (well described in Wikipedia as “a statement that has the form of an apology but does not express the expected contrition”). Here’s the scientific equivalent: the non-retraction retraction. Sanjay Srivastava points to an amusing yet barfable story of a pair of researchers who (inadvertently, I assume) made a data coding error and were eventually moved to issue a correction notice, but even then refused to fully admit their error. As Srivastava puts it, the story “ended up with Lew [Goldberg] and colleagues [Kibeom Lee and Michael Ashton] publishing a comment on an erratum – the only time I’ve ever heard of that happening in a scientific journal.” From the comment on the erratum: In their “erratum and addendum,” Anderson and Ones (this issue) explained that we had brought their attention to the “potential” of a “possible” misalignment and described the results computed from re-aligned data as being based on a “post-ho

3 0.84608763 685 andrew gelman stats-2011-04-29-Data mining and allergies

Introduction: With all this data floating around, there are some interesting analyses one can do. I came across “The Association of Tree Pollen Concentration Peaks and Allergy Medication Sales in New York City: 2003-2008” by Perry Sheffield. There they correlate pollen counts with anti-allergy medicine sales – and indeed find that two days after high pollen counts, the medicine sales are the highest. Of course, it would be interesting to play with the data to see *what* tree is actually causing the sales to increase the most. Perhaps this would help the arborists decide what trees to plant. At the moment they seem to be following a rather sexist approach to tree planting: Ogren says the city could solve the problem by planting only female trees, which don’t produce pollen like male trees do. City arborists shy away from females because many produce messy – or in the case of ginkgos, smelly – fruit that litters sidewalks. In Ogren’s opinion, that’s a mistake. He says the females only pro

4 0.82399809 1214 andrew gelman stats-2012-03-15-Of forecasts and graph theory and characterizing a statistical method by the information it uses

Introduction: Wayne Folta points me to “EigenBracket 2012: Using Graph Theory to Predict NCAA March Madness Basketball” and writes, “I [Folta] have got to believe that he’s simply re-invented a statistical method in a graph-ish context, but don’t know enough to judge.” I have not looked in detail at the method being presented here—I’m not much of a college basketball fan—but I’d like to use this as an excuse to make one of my favorite general points, which is that a good way to characterize any statistical method is by what information it uses. The basketball ranking method here uses score differentials between teams in the past season. On the plus side, that is better than simply using one-loss records (which (a) discards score differentials and (b) discards information on who played whom). On the minus side, the method appears to be discretizing the scores (thus throwing away information on the exact score differential) and doesn’t use any external information such as external ratings. A

5 0.81157899 1669 andrew gelman stats-2013-01-12-The power of the puzzlegraph

Introduction: The Organisation for Economic Co-operation and Development reports that the following project from Krisztina Szucs and Mate Cziner has won their visualization challenge, “launched in September 2012 to solicit visualisations based on the OECD’s data-rich Education at a Glance report”: (The graph is interactive. Click on the above image and click again to see the full version.) From the press release: Entries from around the world focused on data related to the economic costs and return on investment in education . . . [The winning entry] takes a detailed look at public vs. private and men vs. women for selected countries . . . The judges were particularly impressed by the angled slope format of the visualisation, which encourages comparison between the upper-secondary and tertiary benefits of education. Szucs and Cziner were also lauded for their striking visual design, which draws users into exploring their piece [emphasis added]. I used boldface to highlight a p

6 0.80963969 1300 andrew gelman stats-2012-05-05-Recently in the sister blog

7 0.80807209 516 andrew gelman stats-2011-01-14-A new idea for a science core course based entirely on computer simulation

8 0.80611145 1013 andrew gelman stats-2011-11-16-My talk at Math for America on Saturday

9 0.80435377 2185 andrew gelman stats-2014-01-25-Xihong Lin on sparsity and density

10 0.80365783 454 andrew gelman stats-2010-12-07-Diabetes stops at the state line?

11 0.79182303 2204 andrew gelman stats-2014-02-09-Keli Liu and Xiao-Li Meng on Simpson’s paradox

12 0.79166484 1895 andrew gelman stats-2013-06-12-Peter Thiel is writing another book!

13 0.76871812 1297 andrew gelman stats-2012-05-03-New New York data research organizations

14 0.7639792 1816 andrew gelman stats-2013-04-21-Exponential increase in the number of stat majors

15 0.76060897 2311 andrew gelman stats-2014-04-29-Bayesian Uncertainty Quantification for Differential Equations!

16 0.75267565 2202 andrew gelman stats-2014-02-07-Outrage of the week

17 0.74587077 2262 andrew gelman stats-2014-03-23-Win probabilities during a sporting event

18 0.74179047 778 andrew gelman stats-2011-06-24-New ideas on DIC from Martyn Plummer and Sumio Watanabe

19 0.74041772 447 andrew gelman stats-2010-12-03-Reinventing the wheel, only more so.

20 0.73093206 2226 andrew gelman stats-2014-02-26-Econometrics, political science, epidemiology, etc.: Don’t model the probability of a discrete outcome, model the underlying continuous variable