andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-501 knowledge-graph by maker-knowledge-mining

501 andrew gelman stats-2011-01-04-A new R package for fitting multilevel models


meta info for this blog

Source: html

Introduction: Joscha Legewie points to this article by Lars Ronnegard, Xia Shen, and Moudud Alam, “hglm: A Package for Fitting Hierarchical Generalized Linear Models,” which just appeared in The R Journal. This new package has the advantage, compared to lmer(), of allowing non-normal distributions for the varying coefficients. On the downside, they seem to have reverted to the ugly lme-style syntax (for example, “fixed = y ~ week, random = ~ 1|ID” rather than “y ~ week + (1|ID)”). The old-style syntax has difficulties handling non-nested grouping factors. They also say they can estimate models with correlated random effects, but isn’t that just the same as varying-intercept, varying-slope models, which lmer (or Stata alternatives such as gllamm) can already do? There’s also a bunch of stuff on H-likelihood theory, which seems pretty pointless to me (although probably it won’t do much harm either). In any case, this package might be useful to some of you, hence this note.
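The syntax contrast described above can be sketched in R. This is an illustrative sketch, not a tested example: it assumes the lme4 and hglm packages are installed and a data frame `dat` with columns y, week, and ID; the hglm() call follows the argument form quoted in the post, and `site` is a hypothetical second grouping factor added to show the crossed case.

```r
## Illustrative sketch only: assumes lme4 and hglm are installed and a data
## frame `dat` with columns y, week, and ID exists. The hglm() call follows
## the argument form quoted above; consult the package documentation for
## the exact interface.
library(lme4)
library(hglm)

## Old lme-style syntax: fixed and random parts in separate formulas.
fit_hglm <- hglm(fixed = y ~ week, random = ~ 1 | ID, data = dat)

## lmer-style syntax: the grouping factor appears inside one formula.
fit_lmer <- lmer(y ~ week + (1 | ID), data = dat)

## Non-nested (crossed) grouping factors are awkward to express in the old
## style but straightforward in lmer's. `site` is a hypothetical second
## grouping factor, not part of the original example.
fit_crossed <- lmer(y ~ week + (1 | ID) + (1 | site), data = dat)
```

The practical point is the third call: with lme-style syntax there is one `random` formula per nesting structure, so fully crossed factors do not fit naturally, whereas the lmer-style formula just adds another `(1 | factor)` term.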


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Joscha Legewie points to this article by Lars Ronnegard, Xia Shen, and Moudud Alam, “hglm: A Package for Fitting Hierarchical Generalized Linear Models,” which just appeared in the R journal. [sent-1, score-0.165]

2 This new package has the advantage, compared to lmer(), of allowing non-normal distributions for the varying coefficients. [sent-2, score-0.784]

3 On the downside, they seem to have reverted to the ugly lme-style syntax (for example, “fixed = y ~ week, random = ~ 1|ID” rather than “y ~ week + (1|ID)”). [sent-3, score-0.963]

4 The old-style syntax has difficulties handling non-nested grouping factors. [sent-4, score-0.832]

5 They also say they can estimate models with correlated random effects, but isn’t that just the same as varying-intercept, varying-slope models, which lmer (or Stata alternatives such as gllamm) can already do? [sent-5, score-1.137]

6 There’s also a bunch of stuff on H-likelihood theory, which seems pretty pointless to me (although probably it won’t do much harm either). [sent-6, score-0.7]

7 In any case, this package might be useful to some of you, hence this note. [sent-7, score-0.519]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('syntax', 0.369), ('package', 0.344), ('lmer', 0.332), ('gllam', 0.219), ('grouping', 0.197), ('week', 0.196), ('id', 0.169), ('random', 0.166), ('models', 0.164), ('downside', 0.163), ('pointless', 0.158), ('handling', 0.152), ('harm', 0.147), ('alternatives', 0.141), ('stata', 0.138), ('allowing', 0.137), ('generalized', 0.134), ('ugly', 0.13), ('varying', 0.12), ('difficulties', 0.114), ('correlated', 0.112), ('advantage', 0.109), ('appeared', 0.108), ('hence', 0.105), ('fixed', 0.103), ('fitting', 0.102), ('distributions', 0.097), ('linear', 0.097), ('estimated', 0.096), ('hierarchical', 0.094), ('note', 0.091), ('bunch', 0.086), ('compared', 0.086), ('stuff', 0.084), ('won', 0.079), ('although', 0.077), ('theory', 0.074), ('isn', 0.073), ('either', 0.072), ('useful', 0.07), ('probably', 0.069), ('already', 0.069), ('effects', 0.068), ('also', 0.057), ('seem', 0.057), ('points', 0.057), ('pretty', 0.054), ('case', 0.047), ('rather', 0.045), ('seems', 0.045)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 501 andrew gelman stats-2011-01-04-A new R package for fitting multilevel models


2 0.32582426 1682 andrew gelman stats-2013-01-19-R package for Bayes factors

Introduction: Richard Morey writes: You and your blog readers may be interested to know that we’ve released a major new version of the BayesFactor package to CRAN. The package computes Bayes factors for linear mixed models and regression models. Of course, I’m aware you don’t like point-null model comparisons, but the package does more than that; it also allows sampling from posterior distributions of the compared models, in much the same way that your arm package does with lmer objects. The sampling (both for the Bayes factors and posteriors) is quite fast, since the back end is written in C. Some basic examples using the package can be found here, and the CRAN page is here. Indeed I don’t like point-null model comparisons . . . but maybe this will be useful to some of you!

3 0.19524369 1134 andrew gelman stats-2012-01-21-Lessons learned from a recent R package submission

Introduction: R has zillions of packages, and people are submitting new ones each day. The volunteers who keep R going are doing an incredibly useful service to the profession, and they’re busy. A colleague sends in some suggestions based on a recent experience with a package update: 1. Always use the R dev version to write a package. Not the current stable release. The R people use the R dev version to check your package anyway. If you don’t use the R dev version, there is a chance that your package won’t pass the check. In my own experience, every time R has a major change, it tends to have new standards and find new errors in your package with these new standards. So better to use the dev version to find out the potential errors in advance. 2. After submission, write an email to claim it. I used to submit the package to the CRAN without writing an email. This was standard operating procedure, but it has changed. Writing an email to claim about the submission is now a requir

4 0.15192847 269 andrew gelman stats-2010-09-10-R vs. Stata, or, Different ways to estimate multilevel models

Introduction: Cyrus writes: I [Cyrus] was teaching a class on multilevel modeling, and we were playing around with different methods to fit a random effects logit model with 2 random intercepts—one corresponding to “family” and another corresponding to “community” (labeled “mom” and “cluster” in the data, respectively). There are also a few regressors at the individual, family, and community level. We were replicating in part some of the results from the following paper: Improved estimation procedures for multilevel models with binary response: a case-study, by G Rodriguez, N Goldman. (I say “replicating in part” because we didn’t include all the regressors that they use, only a subset.) We were looking at the performance of estimation via glmer in R’s lme4 package, glmmPQL in R’s MASS package, and Stata’s xtmelogit. We wanted to study the performance of various estimation methods, including adaptive quadrature methods and penalized quasi-likelihood. I was shocked to discover that glmer

5 0.13347904 1267 andrew gelman stats-2012-04-17-Hierarchical-multilevel modeling with “big data”

Introduction: Dean Eckles writes: I make extensive use of random effects models in my academic and industry research, as they are very often appropriate. However, with very large data sets, I am not sure what to do. Say I have thousands of levels of a grouping factor, and the number of observations totals in the billions. Despite having lots of observations, I am often either dealing with (a) small effects or (b) trying to fit models with many predictors. So I would really like to use a random effects model to borrow strength across the levels of the grouping factor, but I am not sure how to practically do this. Are you aware of any approaches to fitting random effects models (including approximations) that work for very large data sets? For example, applying a procedure to each group, and then using the results of this to shrink each fit in some appropriate way. Just to clarify, here I am only worried about the non-crossed and in fact single-level case. I don’t see any easy route for cross

6 0.11565842 2069 andrew gelman stats-2013-10-19-R package for effect size calculations for psychology researchers

7 0.11533605 653 andrew gelman stats-2011-04-08-Multilevel regression with shrinkage for “fixed” effects

8 0.10988834 1194 andrew gelman stats-2012-03-04-Multilevel modeling even when you’re not interested in predictions for new groups

9 0.10810836 246 andrew gelman stats-2010-08-31-Somewhat Bayesian multilevel modeling

10 0.10338886 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?

11 0.10172385 1400 andrew gelman stats-2012-06-29-Decline Effect in Linguistics?

12 0.098323032 851 andrew gelman stats-2011-08-12-year + (1|year)

13 0.097943634 25 andrew gelman stats-2010-05-10-Two great tastes that taste great together

14 0.0960363 472 andrew gelman stats-2010-12-17-So-called fixed and random effects

15 0.095098801 555 andrew gelman stats-2011-02-04-Handy Matrix Cheat Sheet, with Gradients

16 0.094665639 417 andrew gelman stats-2010-11-17-Clustering and variance components

17 0.091773085 1966 andrew gelman stats-2013-08-03-Uncertainty in parameter estimates using multilevel models

18 0.089714877 1241 andrew gelman stats-2012-04-02-Fixed effects and identification

19 0.088533685 184 andrew gelman stats-2010-08-04-That half-Cauchy prior

20 0.085970446 2086 andrew gelman stats-2013-11-03-How best to compare effects measured in two different time periods?


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.119), (1, 0.073), (2, 0.019), (3, -0.004), (4, 0.061), (5, 0.019), (6, 0.009), (7, -0.061), (8, 0.042), (9, 0.017), (10, -0.0), (11, -0.019), (12, 0.013), (13, -0.02), (14, 0.036), (15, -0.003), (16, -0.034), (17, 0.028), (18, -0.021), (19, 0.013), (20, -0.009), (21, -0.024), (22, 0.013), (23, 0.031), (24, -0.023), (25, -0.049), (26, -0.073), (27, 0.11), (28, 0.017), (29, -0.015), (30, -0.034), (31, 0.026), (32, -0.01), (33, -0.063), (34, 0.025), (35, -0.033), (36, -0.063), (37, -0.015), (38, -0.045), (39, -0.015), (40, -0.029), (41, 0.017), (42, 0.048), (43, 0.02), (44, 0.018), (45, 0.022), (46, -0.044), (47, 0.001), (48, -0.005), (49, -0.105)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98187494 501 andrew gelman stats-2011-01-04-A new R package for fitting multilevel models


2 0.75700641 269 andrew gelman stats-2010-09-10-R vs. Stata, or, Different ways to estimate multilevel models


3 0.73743725 1682 andrew gelman stats-2013-01-19-R package for Bayes factors


4 0.71151024 243 andrew gelman stats-2010-08-30-Computer models of the oil spill

Introduction: Chris Wilson points me to this visualization of three physical models of the oil spill in the Gulf of Mexico. Cool (and scary) stuff. Wilson writes: One of the major advantages is that the models are 3D and show the plumes and tails beneath the surface. One of the major disadvantages is that they’re still just models.

5 0.70992744 555 andrew gelman stats-2011-02-04-Handy Matrix Cheat Sheet, with Gradients

Introduction: This post is an (unpaid) advertisement for the following extremely useful resource: Petersen, K. B. and M. S. Pedersen. 2008. The Matrix Cookbook. Technical Report, Technical University of Denmark. It contains 70+ pages of useful relations and derivations involving matrices. What grabbed my eye was the computation of gradients for matrix operations ranging from eigenvalues and determinants to multivariate normal density functions. I had no idea the multivariate normal had such a clean gradient (see section 8). We’ve been playing around with Hamiltonian (aka Hybrid) Monte Carlo for sampling from the posterior of hierarchical generalized linear models with lots of interactions. HMC speeds up Metropolis sampling by using the gradient of the log probability to drive samples in the direction of higher probability density, which is particularly useful for correlated parameters that mix slowly with standard Gibbs sampling. Matt “III” Hoffman’s already got it workin

6 0.70623916 2117 andrew gelman stats-2013-11-29-The gradual transition to replicable science

7 0.68949127 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions

8 0.65508324 419 andrew gelman stats-2010-11-18-Derivative-based MCMC as a breakthrough technique for implementing Bayesian statistics

9 0.6526255 2086 andrew gelman stats-2013-11-03-How best to compare effects measured in two different time periods?

10 0.65112841 1267 andrew gelman stats-2012-04-17-Hierarchical-multilevel modeling with “big data”

11 0.64850146 1194 andrew gelman stats-2012-03-04-Multilevel modeling even when you’re not interested in predictions for new groups

12 0.64817309 1737 andrew gelman stats-2013-02-25-Correlation of 1 . . . too good to be true?

13 0.64110821 653 andrew gelman stats-2011-04-08-Multilevel regression with shrinkage for “fixed” effects

14 0.63819975 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?

15 0.62504113 1991 andrew gelman stats-2013-08-21-BDA3 table of contents (also a new paper on visualization)

16 0.61892933 464 andrew gelman stats-2010-12-12-Finite-population standard deviation in a hierarchical model

17 0.61756009 1726 andrew gelman stats-2013-02-18-What to read to catch up on multivariate statistics?

18 0.61498863 1241 andrew gelman stats-2012-04-02-Fixed effects and identification

19 0.60009211 1739 andrew gelman stats-2013-02-26-An AI can build and try out statistical models using an open-ended generative grammar

20 0.59902728 759 andrew gelman stats-2011-06-11-“2 level logit with 2 REs & large sample. computational nightmare – please help”


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(16, 0.032), (17, 0.021), (24, 0.11), (43, 0.024), (45, 0.143), (55, 0.047), (72, 0.022), (82, 0.033), (85, 0.047), (86, 0.07), (99, 0.328)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96288025 501 andrew gelman stats-2011-01-04-A new R package for fitting multilevel models


2 0.9615941 69 andrew gelman stats-2010-06-04-A Wikipedia whitewash

Introduction: After hearing a few times about the divorce predictions of researchers John Gottman and James Murray (work that was featured in Blink with a claim that they could predict with 83 percent accuracy whether a couple would be divorced–after meeting with them for 15 minutes) and feeling some skepticism, I decided to do the Lord’s work and amend Gottman’s Wikipedia entry, which had a paragraph saying: Gottman found his methodology predicts with 90% accuracy which newlywed couples will remain married and which will divorce four to six years later. It is also 81% percent accurate in predicting which marriages will survive after seven to nine years. I added the following: Gottman’s claim of 81% or 90% accuracy is misleading, however, because the accuracy is measured only after fitting a model to his data. There is no evidence that he can predict the outcome of a marriage with high accuracy in advance. As Laurie Abraham writes, “For the 1998 study, which focused on videotapes of 57

3 0.96036386 1325 andrew gelman stats-2012-05-17-More on the difficulty of “preaching what you practice”

Introduction: A couple months ago, in discussing Charles Murray’s argument that America’s social leaders should “preach what they practice” (Murray argues that they—we!—tend to lead good lives of hard work and moderation but are all too tolerant of antisocial and unproductive behavior among the lower classes), I wrote: Murray does not consider the case of Joe Paterno, but in many ways the Penn State football coach fits his story well. Paterno was said to live an exemplary personal and professional life, combining traditional morality with football success—but, by his actions, he showed little concern about the morality of his players and coaches. At a professional level, Paterno rose higher and higher, and in his personal life he was a responsible adult. But he had an increasing disconnect with the real world, to the extent that horrible crimes were occurring nearby (in the physical and social senses) but he was completely insulated from the consequences for many years. Paterno’s story is s

4 0.95458579 206 andrew gelman stats-2010-08-13-Indiemapper makes thematic mapping easy

Introduction: Arthur Breitman writes: I had to forward this to you when I read about it… My reply: Interesting; thanks. Things like this make me feel so computer-incompetent! The younger generation is passing me by…

5 0.9545573 673 andrew gelman stats-2011-04-20-Upper-income people still don’t realize they’re upper-income

Introduction: Catherine Rampell highlights this stunning Gallup Poll result: 6 percent of Americans in households earning over $250,000 a year think their taxes are “too low.” Of that same group, 26 percent said their taxes were “about right,” and a whopping 67 percent said their taxes were “too high.” OK, fine. Most people don’t like taxes. No surprise there. But get this next part: And yet when this same group of high earners was asked whether “upper-income people” paid their fair share in taxes, 30 percent said “upper-income people” paid too little, 30 percent said it was a “fair share,” and 38 percent said it was too much. 30 percent of these upper-income people say that upper-income people pay too little, but only 6 percent say that they personally pay too little. 38% say that upper-income people pay too much, but 67% say they personally pay too much. Rampell attributes this to people’s ignorance about population statistics–these 250K+ families just don’t realize t

6 0.95378113 735 andrew gelman stats-2011-05-28-New app for learning intro statistics

7 0.95242941 999 andrew gelman stats-2011-11-09-I was at a meeting a couple months ago . . .

8 0.95078439 362 andrew gelman stats-2010-10-22-A redrawing of the Red-Blue map in November 2010?

9 0.95055759 449 andrew gelman stats-2010-12-04-Generalized Method of Moments, whatever that is

10 0.95044988 1504 andrew gelman stats-2012-09-20-Could someone please lock this guy and Niall Ferguson in a room together?

11 0.94774395 1854 andrew gelman stats-2013-05-13-A Structural Comparison of Conspicuous Consumption in China and the United States

12 0.94687462 1031 andrew gelman stats-2011-11-27-Richard Stallman and John McCarthy

13 0.94342899 192 andrew gelman stats-2010-08-08-Turning pages into data

14 0.94117296 1012 andrew gelman stats-2011-11-16-Blog bribes!

15 0.94088089 543 andrew gelman stats-2011-01-28-NYT shills for personal DNA tests

16 0.94079477 1658 andrew gelman stats-2013-01-07-Free advice from an academic writing coach!

17 0.9392485 1767 andrew gelman stats-2013-03-17-The disappearing or non-disappearing middle class

18 0.93891519 728 andrew gelman stats-2011-05-24-A (not quite) grand unified theory of plagiarism, as applied to the Wegman case

19 0.93726051 105 andrew gelman stats-2010-06-23-More on those divorce prediction statistics, including a discussion of the innumeracy of (some) mathematicians

20 0.93690026 2189 andrew gelman stats-2014-01-28-History is too important to be left to the history professors