
1134 andrew gelman stats-2012-01-21-Lessons learned from a recent R package submission

meta infos for this blog

Source: html

Introduction: R has zillions of packages, and people are submitting new ones each day. The volunteers who keep R going are doing an incredibly useful service to the profession, and they’re busy. A colleague sends in some suggestions based on a recent experience with a package update: 1. Always use the R dev version to write a package. Not the current stable release. The R people use the R dev version to check your package anyway. If you don’t use the R dev version, there is a chance that your package won’t pass the check. In my own experience, every time R has a major change, it tends to have new standards and find new errors in your package with these new standards. So better use the dev version to find out the potential errors in advance. 2. After submission, write an email to claim it. I used to submit the package to the CRAN without writing an email. This was standard operating procedure, but it has changed. Writing an email to claim the submission is now a requir

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 R has zillions of packages, and people are submitting new ones each day. [sent-1, score-0.331]

2 The volunteers who keep R going are doing an incredibly useful service to the profession, and they’re busy. [sent-2, score-0.282]

3 A colleague sends in some suggestions based on a recent experience with a package update: 1. [sent-3, score-0.762]

4 The R people use the R dev version to check your package anyway. [sent-6, score-1.327]

5 If you don’t use the R dev version, there is a chance that your package won’t pass the check. [sent-7, score-1.107]

6 In my own experience, every time R has a major change, it tends to have new standards and find new errors in your package with these new standards. [sent-8, score-0.978]

7 So better use the dev version to find out the potential errors in advance. [sent-9, score-0.806]

8 I used to submit the package to the CRAN without writing an email. [sent-12, score-0.751]

9 This was standard operating procedure, but it has changed. [sent-13, score-0.079]

10 Writing an email to claim the submission is now a requirement. [sent-14, score-0.376]

11 The R team is afraid that the package was not submitted by a legal developer. [sent-16, score-0.784]

12 Write an email to remind them that you submit a package, not a virus. [sent-18, score-0.353]

13 The number of R packages submitted to CRAN is growing exponentially. [sent-21, score-0.467]

14 We should understand their situation and try to work with them to solve the package issues, when problems come up. [sent-23, score-0.627]

15 I’ve never actually written an R package myself—my last experience with this sort of thing was several years ago, using dyn. [sent-26, score-0.718]

16 load2 in S—but I’ve used many R packages and I’ve contributed to several widely-used R packages. [sent-28, score-0.417]

17 So I really appreciate the effort put in by the central R people, and I’m posting this note as a way to make their lives easier and also help the people who are writing and updating R packages. [sent-29, score-0.442]

similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('package', 0.519), ('dev', 0.44), ('packages', 0.272), ('cran', 0.198), ('version', 0.193), ('submission', 0.164), ('email', 0.136), ('submit', 0.133), ('experience', 0.13), ('submitted', 0.128), ('loads', 0.104), ('writing', 0.099), ('people', 0.09), ('write', 0.09), ('errors', 0.088), ('zillions', 0.085), ('use', 0.085), ('lessons', 0.084), ('updating', 0.084), ('remind', 0.084), ('won', 0.08), ('volunteers', 0.08), ('new', 0.079), ('incredibly', 0.079), ('operating', 0.079), ('submitting', 0.077), ('claim', 0.076), ('contributed', 0.076), ('profession', 0.075), ('tends', 0.073), ('stable', 0.07), ('afraid', 0.07), ('several', 0.069), ('security', 0.068), ('growing', 0.067), ('legal', 0.067), ('busy', 0.066), ('pass', 0.063), ('standards', 0.061), ('update', 0.06), ('posting', 0.058), ('sends', 0.057), ('procedure', 0.057), ('lives', 0.057), ('service', 0.057), ('ve', 0.056), ('colleague', 0.056), ('situation', 0.054), ('easier', 0.054), ('solve', 0.054)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1134 andrew gelman stats-2012-01-21-Lessons learned from a recent R package submission

2 0.39729309 1682 andrew gelman stats-2013-01-19-R package for Bayes factors

Introduction: Richard Morey writes: You and your blog readers may be interested to know that we’ve released a major new version of the BayesFactor package to CRAN. The package computes Bayes factors for linear mixed models and regression models. Of course, I’m aware you don’t like point-null model comparisons, but the package does more than that; it also allows sampling from posterior distributions of the compared models, in much the same way that your arm package does with lmer objects. The sampling (both for the Bayes factors and posteriors) is quite fast, since the back end is written in C. Some basic examples using the package can be found here, and the CRAN page is here. Indeed I don’t like point-null model comparisons . . . but maybe this will be useful to some of you!

3 0.19524369 501 andrew gelman stats-2011-01-04-A new R package for fititng multilevel models

Introduction: Joscha Legewie points to this article by Lars Ronnegard, Xia Shen, and Moudud Alam, “hglm: A Package for Fitting Hierarchical Generalized Linear Models,” which just appeared in the R journal. This new package has the advantage, compared to lmer(), of allowing non-normal distributions for the varying coefficients. On the downside, they seem to have reverted to the ugly lme-style syntax (for example, “fixed = y ~ week, random = ~ 1|ID” rather than “y ~ week + (1|D)”). The old-style syntax has difficulties handling non-nested grouping factors. They also say they can estimate models with correlated random effects, but isn’t that just the same as varying-intercept, varying-slope models, which lmer (or Stata alternatives such as gllam) can already do? There’s also a bunch of stuff on H-likelihood theory, which seems pretty pointless to me (although probably it won’t do much harm either). In any case, this package might be useful to some of you, hence this note.

4 0.1741378 2069 andrew gelman stats-2013-10-19-R package for effect size calculations for psychology researchers

Introduction: Dan Gerlanc writes: I read your post the other day [now the other month, as our blog is on a bit of a delay] on helping psychologists do research and thought you might be interested in our R package, “bootES”, for robust effect size calculation and confidence interval estimation using resampling techniques. The package provides one function, ‘bootES’, that makes a variety of effect size calculations fairly straightforward for researchers with limited programming experience. The majority of the implemented are not available in R or SPSS without custom coding. Kris Kirby (Williams College) and I have published a paper in Behavioral Research Methods describing the methods and providing a tutorial on use of the package: http://bit.ly/YIM6VD. We hope that it’s useful to psychologists and other social science researchers! I haven’t tried this out but it might be of interest for some of you.

5 0.16461527 1736 andrew gelman stats-2013-02-24-Rcpp class in Sat 9 Mar in NYC

Introduction: Join Dirk Eddelbuettel for six hours of detailed and hands-on instructions and discussions around Rcpp, RInside, RcppArmadillo, RcppGSL and other packages . . . Rcpp has become the most widely-used language extension for R. Currently deployed by 103 CRAN packages and a further 10 BioConductor packages, it permits users and developers to pass “whole R objects” with ease between R and C++ . . . Morning session: “A Hands-on Introduction to R and C++” . . . Afternoon session: “Advanced R and C++ Topics” . . .

6 0.16142654 25 andrew gelman stats-2010-05-10-Two great tastes that taste great together

7 0.15257336 347 andrew gelman stats-2010-10-17-Getting arm and lme4 running on the Mac

8 0.13273621 2188 andrew gelman stats-2014-01-27-“Disappointed with your results? Boost your scientific paper”

9 0.12157593 348 andrew gelman stats-2010-10-17-Joanne Gowa scooped me by 22 years in my criticism of Axelrod’s Evolution of Cooperation

10 0.11954989 535 andrew gelman stats-2011-01-24-Bleg: Automatic Differentiation for Log Prob Gradients?

11 0.11433159 1661 andrew gelman stats-2013-01-08-Software is as software does

12 0.10848677 503 andrew gelman stats-2011-01-04-Clarity on my email policy

13 0.1026817 1216 andrew gelman stats-2012-03-17-Modeling group-level predictors in a multilevel regression

14 0.10261263 1714 andrew gelman stats-2013-02-09-Partial least squares path analysis

15 0.099712268 324 andrew gelman stats-2010-10-07-Contest for developing an R package recommendation system

16 0.0990142 1642 andrew gelman stats-2012-12-28-New book by Stef van Buuren on missing-data imputation looks really good!

17 0.079310916 833 andrew gelman stats-2011-07-31-Untunable Metropolis

18 0.078164041 2346 andrew gelman stats-2014-05-24-Buzzfeed, Porn, Kansas…That Can’t Be Good

19 0.078107037 726 andrew gelman stats-2011-05-22-Handling multiple versions of an outcome variable

20 0.077918351 27 andrew gelman stats-2010-05-11-Update on the spam email study

similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.142), (1, -0.024), (2, -0.027), (3, 0.018), (4, 0.054), (5, 0.021), (6, 0.029), (7, -0.077), (8, 0.003), (9, -0.033), (10, 0.019), (11, -0.047), (12, 0.014), (13, -0.021), (14, -0.003), (15, 0.01), (16, 0.003), (17, -0.007), (18, -0.008), (19, 0.019), (20, 0.011), (21, 0.016), (22, 0.005), (23, 0.018), (24, -0.035), (25, -0.004), (26, 0.018), (27, 0.041), (28, 0.052), (29, -0.0), (30, -0.017), (31, 0.002), (32, -0.013), (33, 0.026), (34, 0.02), (35, -0.069), (36, -0.038), (37, 0.033), (38, 0.022), (39, -0.007), (40, -0.006), (41, 0.026), (42, 0.012), (43, 0.015), (44, 0.02), (45, -0.021), (46, -0.092), (47, -0.001), (48, 0.036), (49, -0.095)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95349187 1134 andrew gelman stats-2012-01-21-Lessons learned from a recent R package submission

2 0.77349699 535 andrew gelman stats-2011-01-24-Bleg: Automatic Differentiation for Log Prob Gradients?

Introduction: We need help picking out an automatic differentiation package for Hamiltonian Monte Carlo sampling from the posterior of a generalized linear model with deep interactions. Specifically, we need to compute gradients for log probability functions with thousands of parameters that involve matrix (determinants, eigenvalues, inverses), stats (distributions), and math (log gamma) functions. Any suggestions? The Application: Hybrid Monte Carlo for Posteriors We’re getting serious about implementing posterior sampling using Hamiltonian Monte Carlo. HMC speeds up mixing by including gradient information to help guide the Metropolis proposals toward areas of high probability. In practice, the algorithm requires a handful of gradient calculations per sample, but there are many dimensions and the functions are hairy enough we don’t want to compute derivatives by hand. Auto Diff: Perhaps not What you Think It may not have been clear to readers of this blog that automatic diffe

3 0.73937953 347 andrew gelman stats-2010-10-17-Getting arm and lme4 running on the Mac

Introduction: Our “arm” package in R requires Doug Bates’s “lme4”, which fits multilevel models. lme4 is currently having some problems on the Mac. But installation on the Mac can be done; it just takes a bit of work. I have two sets of instructions below. From Yu-Sung: If you have the Mac OS DVD, you should install the Xcode developer packages from it. Otherwise, install them from here. After this, do the following in R: install.packages("lme4", type = "source") Then you will have lme4 in R and you can install arm without a problem. And, from David Ozonoff: I installed the lme4 package via the Package Installer but this didn’t work, of course. I then installed, via this link, gfortran which seemed to put the libraries in the right place (I had earlier installed via Fink the gcc42 compiler, so I’m not sure if this is required or not). I then ran, in R, this: install.packages(c("Matrix", "lme4"), repos = "http://R-Forge.R-project.org") This does not appear to work since it wi

4 0.71421045 1736 andrew gelman stats-2013-02-24-Rcpp class in Sat 9 Mar in NYC

5 0.70312315 555 andrew gelman stats-2011-02-04-Handy Matrix Cheat Sheet, with Gradients

Introduction: This post is an (unpaid) advertisement for the following extremely useful resource: Petersen, K. B. and M. S. Pedersen. 2008. The Matrix Cookbook. Technical Report, Technical University of Denmark. It contains 70+ pages of useful relations and derivations involving matrices. What grabbed my eye was the computation of gradients for matrix operations ranging from eigenvalues and determinants to multivariate normal density functions. I had no idea the multivariate normal had such a clean gradient (see section 8). We’ve been playing around with Hamiltonian (aka Hybrid) Monte Carlo for sampling from the posterior of hierarchical generalized linear models with lots of interactions. HMC speeds up Metropolis sampling by using the gradient of the log probability to drive samples in the direction of higher probability density, which is particularly useful for correlated parameters that mix slowly with standard Gibbs sampling. Matt “III” Hoffman’s already got it workin

6 0.6806336 266 andrew gelman stats-2010-09-09-The future of R

7 0.67201275 419 andrew gelman stats-2010-11-18-Derivative-based MCMC as a breakthrough technique for implementing Bayesian statistics

8 0.66563487 1338 andrew gelman stats-2012-05-23-Advice on writing research articles

9 0.66494352 2148 andrew gelman stats-2013-12-25-Spam!

10 0.6647948 1682 andrew gelman stats-2013-01-19-R package for Bayes factors

11 0.65979916 2089 andrew gelman stats-2013-11-04-Shlemiel the Software Developer and Unknown Unknowns

12 0.65473646 1520 andrew gelman stats-2012-10-03-Advice that’s so eminently sensible but so difficult to follow

13 0.64881313 166 andrew gelman stats-2010-07-27-The Three Golden Rules for Successful Scientific Research

14 0.64632261 727 andrew gelman stats-2011-05-23-My new writing strategy

15 0.64022416 2172 andrew gelman stats-2014-01-14-Advice on writing research articles

16 0.62847567 931 andrew gelman stats-2011-09-29-Hamiltonian Monte Carlo stories

17 0.62521893 2011 andrew gelman stats-2013-09-07-Here’s what happened when I finished my PhD thesis

18 0.6215629 1296 andrew gelman stats-2012-05-03-Google Translate for code, and an R help-list bot

19 0.61969012 597 andrew gelman stats-2011-03-02-RStudio – new cross-platform IDE for R

20 0.61230928 793 andrew gelman stats-2011-07-09-R on the cloud

similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(0, 0.038), (15, 0.053), (16, 0.038), (22, 0.011), (24, 0.158), (48, 0.011), (72, 0.02), (82, 0.175), (86, 0.058), (89, 0.024), (96, 0.011), (99, 0.295)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97516054 940 andrew gelman stats-2011-10-03-It depends upon what the meaning of the word “firm” is.

Introduction: David Hogg pointed me to this news article by Angela Saini: It’s not often that the quiet world of mathematics is rocked by a murder case. But last summer saw a trial that sent academics into a tailspin, and has since swollen into a fevered clash between science and the law. At its heart, this is a story about chance. And it begins with a convicted killer, “T”, who took his case to the court of appeal in 2010. Among the evidence against him was a shoeprint from a pair of Nike trainers, which seemed to match a pair found at his home. While appeals often unmask shaky evidence, this was different. This time, a mathematical formula was thrown out of court. The footwear expert made what the judge believed were poor calculations about the likelihood of the match, compounded by a bad explanation of how he reached his opinion. The conviction was quashed. . . . “The impact will be quite shattering,” says Professor Norman Fenton, a mathematician at Queen Mary, University of London.

2 0.96555912 335 andrew gelman stats-2010-10-11-How to think about Lou Dobbs

Introduction: I was unsurprised to read that Lou Dobbs, the former CNN host who crusaded against illegal immigrants, had actually hired a bunch of them himself to maintain his large house and his horse farm. (OK, I have to admit I was surprised by the part about the horse farm.) But I think most of the reactions to this story missed the point. Isabel Macdonald’s article that broke the story was entitled, “Lou Dobbs, American Hypocrite,” and most of the discussion went from there, with some commenters piling on Dobbs and others defending him by saying that Dobbs hired his laborers through contractors and may not have known they were in the country illegally. To me, though, the key issue is slightly different. And Macdonald’s story is relevant whether or not Dobbs knew he was hiring illegals. My point is not that Dobbs is a bad guy, or a hypocrite, or whatever. My point is that, in his setting, it would take an extraordinary effort to not hire illegal immigrants to take care of his house

3 0.95713973 178 andrew gelman stats-2010-08-03-(Partisan) visualization of health care legislation

Introduction: Congressman Kevin Brady from Texas distributes this visualization of reformed health care in the US (click for a bigger picture): Here’s a PDF at Brady’s page, and a local copy of it. Complexity has its costs. Beyond the cost of writing it, learning it, following it, there’s also the cost of checking it. John Walker has some funny examples of what’s hidden in the almost 8000 pages of IRS code. Text mining and applied statistics will solve all that, hopefully. Anyone interested in developing a pork detection system for the legislation? Or an analysis of how much entropy to the legal code did each congressman contribute? There are already spin detectors that help you detect whether the writer is a Democrat (“stimulus”, “health care”) or a Republican (“deficit spending”, “ObamaCare”). D+0.1: Jared Lander points to versions by Rep. Boehner and Robert Palmer.

same-blog 4 0.95233595 1134 andrew gelman stats-2012-01-21-Lessons learned from a recent R package submission

5 0.95222282 1094 andrew gelman stats-2011-12-31-Using factor analysis or principal components analysis or measurement-error models for biological measurements in archaeology?

Introduction: Greg Campbell writes: I am a Canadian archaeologist (BSc in Chemistry) researching the past human use of European Atlantic shellfish. After two decades of practice I am finally getting a MA in archaeology at Reading. I am seeing if the habitat or size of harvested mussels (Mytilus edulis) can be reconstructed from measurements of the umbo (the pointy end, and the only bit that survives well in archaeological deposits) using log-transformed measurements (or allometry; relationships between dimensions are more likely exponential than linear). Of course multivariate regressions in most statistics packages (Minitab, SPSS, SAS) assume you are trying to predict one variable from all the others (a Model I regression), and use ordinary least squares to fit the regression line. For organismal dimensions this makes little sense, since all the dimensions are (at least in theory) free to change their mutual proportions during growth. So there is no predictor and predicted, mutual variation of

6 0.95060003 1488 andrew gelman stats-2012-09-08-Annals of spam

7 0.94838935 340 andrew gelman stats-2010-10-13-Randomized experiments, non-randomized experiments, and observational studies

8 0.93926311 1749 andrew gelman stats-2013-03-04-Stan in L.A. this Wed 3:30pm

9 0.93761414 1772 andrew gelman stats-2013-03-20-Stan at Google this Thurs and at Berkeley this Fri noon

10 0.93499732 2003 andrew gelman stats-2013-08-30-Stan Project: Continuous Relaxations for Discrete MRFs

11 0.93498409 1440 andrew gelman stats-2012-08-02-“A Christmas Carol” as applied to plagiarism

12 0.93409258 326 andrew gelman stats-2010-10-07-Peer pressure, selection, and educational reform

13 0.93280524 1963 andrew gelman stats-2013-07-31-Response by Jessica Tracy and Alec Beall to my critique of the methods in their paper, “Women Are More Likely to Wear Red or Pink at Peak Fertility”

14 0.92508292 1725 andrew gelman stats-2013-02-17-“1.7%” ha ha ha

15 0.92475784 1553 andrew gelman stats-2012-10-30-Real rothko, fake rothko

16 0.92332482 67 andrew gelman stats-2010-06-03-More on that Dartmouth health care study

17 0.91923976 931 andrew gelman stats-2011-09-29-Hamiltonian Monte Carlo stories

18 0.91535717 1682 andrew gelman stats-2013-01-19-R package for Bayes factors

19 0.91302407 2355 andrew gelman stats-2014-05-31-Jessica Tracy and Alec Beall (authors of the fertile-women-wear-pink study) comment on our Garden of Forking Paths paper, and I comment on their comments

20 0.91087794 699 andrew gelman stats-2011-05-06-Another stereotype demolished