andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-1825 knowledge-graph by maker-knowledge-mining

1825 andrew gelman stats-2013-04-25-It’s binless! A program for computing normalizing functions


meta info for this blog

Source: html

Introduction: Zhiqiang Tan writes: I have created an R package to implement the full likelihood method in Kong et al. (2003). The method can be seen as a binless extension (UWHAM) of the so-called Weighted Histogram Analysis Method (WHAM) widely used in physics and chemistry. The method has also been introduced to the physics literature and called the Multistate Bennett Acceptance Ratio (MBAR) method. But a key point of my implementation is to compute the free energy estimates by minimizing a convex function, instead of solving nonlinear equations by the self-consistency or the Newton-Raphson algorithm.
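To make the last point concrete, here is a minimal numpy/scipy sketch of the convex-optimization route in standard UWHAM/MBAR-type notation. It is not Tan's package and may be parameterized differently from it; the point is only that the objective below is a sum of log-sum-exp terms (affine in the free energies) minus a linear term, hence convex, and that setting its gradient to zero recovers the usual self-consistency equations.

```python
# Minimal sketch, NOT Tan's package: estimate relative free energies
# (negative log normalizing constants) for K states from pooled samples
# by minimizing a convex UWHAM/MBAR-type objective instead of iterating
# the self-consistency equations.
import numpy as np
from scipy.special import logsumexp
from scipy.optimize import minimize

def free_energies(log_q, n_k):
    """log_q: (N, K) log unnormalized density of each pooled sample under
    each state; n_k: number of samples drawn from each state."""
    n_k = np.asarray(n_k, dtype=float)
    log_w = np.log(n_k / n_k.sum())          # log mixture proportions

    def objective(f):
        # sum of log-sum-exp terms (affine in f) minus a linear term: convex.
        # A zero gradient here is exactly the self-consistency equation.
        return logsumexp(log_q + log_w + f, axis=1).sum() - n_k @ f

    f_hat = minimize(objective, np.zeros(log_q.shape[1]), method="BFGS").x
    return f_hat - f_hat[0]                  # identified only up to a constant

# Toy check: samples from N(0,1) and N(2,1) with q_k(x) = exp(-(x-mu_k)^2/2);
# both normalizing constants equal sqrt(2*pi), so the estimate should be ~0.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(2.0, 1.0, 500)])
log_q = np.column_stack([-0.5 * (x - 0.0) ** 2, -0.5 * (x - 2.0) ** 2])
print(free_energies(log_q, [500, 500]))      # approx [0.0, 0.0]
```

The free energies are only identified up to an additive constant (shifting all of them leaves the objective unchanged), which is why the sketch reports them relative to the first state.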


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Zhiqiang Tan writes: I have created an R package to implement the full likelihood method in Kong et al. [sent-1, score-1.133]

2 The method can be seen as a binless extension (UWHAM) of the so-called Weighted Histogram Analysis Method (WHAM) widely used in physics and chemistry. [sent-3, score-1.113]

3 The method has also been introduced to the physics literature and called the Multistate Bennett Acceptance Ratio (MBAR) method. [sent-4, score-1.015]

4 But a key point of my implementation is to compute the free energy estimates by minimizing a convex function, instead of solving nonlinear equations by the self-consistency or the Newton-Raphson algorithm. [sent-5, score-1.81]


similar blogs computed by the tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('method', 0.378), ('physics', 0.257), ('kong', 0.245), ('tan', 0.234), ('convex', 0.226), ('minimizing', 0.209), ('histogram', 0.2), ('extension', 0.186), ('equations', 0.177), ('weighted', 0.175), ('solving', 0.173), ('nonlinear', 0.168), ('acceptance', 0.168), ('implement', 0.161), ('introduced', 0.157), ('ratio', 0.156), ('implementation', 0.155), ('multivariate', 0.154), ('energy', 0.145), ('compute', 0.145), ('created', 0.14), ('widely', 0.139), ('algorithm', 0.137), ('package', 0.136), ('likelihood', 0.114), ('function', 0.114), ('et', 0.11), ('key', 0.097), ('literature', 0.096), ('free', 0.095), ('full', 0.094), ('called', 0.093), ('seen', 0.091), ('estimates', 0.089), ('instead', 0.084), ('used', 0.062), ('analysis', 0.056), ('point', 0.047), ('writes', 0.039), ('also', 0.034)]
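The page does not document how these numbers were produced. As a hypothetical reconstruction, per-word tf-idf weights and blog-to-blog similarity scores of this kind are typically computed roughly as follows with scikit-learn; the blogs list and every parameter choice below are illustrative, not the dataset's actual pipeline.

```python
# Hypothetical reconstruction -- the page does not document its pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

blogs = [
    "Zhiqiang Tan writes: I have created an R package ...",   # this post
    "Tomas Iesmantas writes: I'm dealing with a high-dimensional ...",
    "David Hogg points me to this discussion ...",
]
vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(blogs)                 # rows = blogs, columns = words

# top-weighted words for this post (cf. the (wordName, wordTfidf) list above)
row = X[0].toarray().ravel()
top = row.argsort()[::-1][:10]
print([(vec.get_feature_names_out()[i], round(row[i], 3)) for i in top])

# pairwise similarities (cf. the simValue column in the lists below)
print(cosine_similarity(X)[0])
```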

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 1825 andrew gelman stats-2013-04-25-It’s binless! A program for computing normalizing functions

Introduction: Zhiqiang Tan writes: I have created an R package to implement the full likelihood method in Kong et al. (2003). The method can be seen as a binless extension (UWHAM) of the so-called Weighted Histogram Analysis Method (WHAM) widely used in physics and chemistry. The method has also been introduced to the physics literature and called the Multistate Bennett Acceptance Ratio (MBAR) method. But a key point of my implementation is to compute the free energy estimates by minimizing a convex function, instead of solving nonlinear equations by the self-consistency or the Newton-Raphson algorithm.

2 0.12834077 861 andrew gelman stats-2011-08-19-Will Stan work well with 40×40 matrices?

Introduction: Tomas Iesmantas writes: I’m dealing with a high-dimensional (40-50 parameters) hierarchical Bayesian model applied to a nonlinear Poisson regression problem. Now I’m using an adaptive version of the Metropolis-adjusted Langevin algorithm with a truncated drift (Yves F. Atchade, 2003) to obtain samples from the posterior. But this algorithm is not very efficient in my case; it needs several million iterations as a burn-in period. And the simulation takes quite a long time, since the algorithm has to work with 40×40 matrices. Maybe you know another MCMC algorithm which would need fewer burn-in samples and would be able to deal with nonlinear regression? In a non-hierarchical nonlinear regression model the adaptive Metropolis algorithm is enough, but in the hierarchical case I could use something more effective. My reply: Try fitting the model in Stan. If that doesn’t work, let me know.
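For readers unfamiliar with the sampler Iesmantas describes, here is a generic textbook sketch of one Metropolis-adjusted Langevin (MALA) step with a truncated drift. It is not Atchade's adaptive algorithm; the step size, drift cap, and toy 40-dimensional target are arbitrary choices for illustration.

```python
# Generic sketch of one MALA step with a truncated drift: the gradient drift
# is capped so an occasional huge gradient cannot launch the proposal.
import numpy as np

def mala_step(x, log_post, grad_log_post, step, drift_cap, rng):
    def drift(g):
        norm = np.linalg.norm(g)
        return g if norm <= drift_cap else g * (drift_cap / norm)

    gx = drift(grad_log_post(x))
    prop = x + 0.5 * step * gx + np.sqrt(step) * rng.standard_normal(x.size)
    gp = drift(grad_log_post(prop))

    def log_q(a, b, gb):                     # log q(a | b), up to a constant
        return -np.sum((a - b - 0.5 * step * gb) ** 2) / (2.0 * step)

    log_alpha = (log_post(prop) - log_post(x)
                 + log_q(x, prop, gp) - log_q(prop, x, gx))
    return prop if np.log(rng.uniform()) < log_alpha else x

# toy run on a 40-dimensional standard normal target (step size is arbitrary)
rng = np.random.default_rng(1)
x = np.zeros(40)
for _ in range(1000):
    x = mala_step(x, lambda z: -0.5 * z @ z, lambda z: -z, 0.2, 10.0, rng)
```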

3 0.1283406 1422 andrew gelman stats-2012-07-20-Likelihood thresholds and decisions

Introduction: David Hogg points me to this discussion: Martin Strasbourg and I [Hogg] discussed his project to detect new satellites of M31 in the PAndAS survey. He can construct a likelihood ratio (possibly even a marginalized likelihood ratio) at every position in the M31 imaging, between the best-fit satellite-plus-background model and the best nothing-plus-background model. He can make a two-dimensional map of these likelihood ratios and show the histogram of them. Looking at this histogram, which has a tail to very large ratios, he asked me, where should I put my cut? That is, at what likelihood ratio does a candidate deserve follow-up? Here’s my unsatisfying answer: To a statistician, the distribution of likelihood ratios is interesting and valuable to study. To an astronomer, it is uninteresting. You don’t want to know the distribution of likelihoods, you want to find satellites . . . I wrote that I think this makes sense and that it would actually be an interesting and useful rese

4 0.12736434 1214 andrew gelman stats-2012-03-15-Of forecasts and graph theory and characterizing a statistical method by the information it uses

Introduction: Wayne Folta points me to “EigenBracket 2012: Using Graph Theory to Predict NCAA March Madness Basketball” and writes, “I [Folta] have got to believe that he’s simply re-invented a statistical method in a graph-ish context, but don’t know enough to judge.” I have not looked in detail at the method being presented here—I’m not much of a college basketball fan—but I’d like to use this as an excuse to make one of my favorite general points, which is that a good way to characterize any statistical method is by what information it uses. The basketball ranking method here uses score differentials between teams in the past season. On the plus side, that is better than simply using won-loss records (which (a) discards score differentials and (b) discards information on who played whom). On the minus side, the method appears to be discretizing the scores (thus throwing away information on the exact score differential) and doesn’t use any external information such as external ratings. A

5 0.12718695 1010 andrew gelman stats-2011-11-14-“Free energy” and economic resources

Introduction: By “free energy” I don’t mean perpetual motion machines, cars that run on water and get 200 mpg, or the latest cold-fusion hype. No, I’m referring to the term from physics. The free energy of a system is, roughly, the amount of energy that can be directly extracted from it. For example, a rock at room temperature is just full of energy—not just the energy locked in its nuclei, but basic thermal energy—but at room temperature you can’t extract any of it. To the physicists in the audience: Yes, I realize that free energy has a technical meaning in statistical mechanics and that my above definition is sloppy. Please bear with me. And, to the non-physicists: feel free to head to Wikipedia or a physics textbook for a more careful treatment. I was thinking about free energy the other day when hearing someone on the radio say something about China bailing out the E.U. I did a double-take. Huh? The E.U. is rich, China’s not so rich. How can a middle-income country bail out a

6 0.11247364 468 andrew gelman stats-2010-12-15-Weakly informative priors and imprecise probabilities

7 0.10770394 2314 andrew gelman stats-2014-05-01-Heller, Heller, and Gorfine on univariate and multivariate information measures

8 0.10399823 1019 andrew gelman stats-2011-11-19-Validation of Software for Bayesian Models Using Posterior Quantiles

9 0.10122589 2020 andrew gelman stats-2013-09-12-Samplers for Big Science: emcee and BAT

10 0.099438168 626 andrew gelman stats-2011-03-23-Physics is hard

11 0.098845065 650 andrew gelman stats-2011-04-05-Monitor the efficiency of your Markov chain sampler using expected squared jumped distance!

12 0.096967645 775 andrew gelman stats-2011-06-21-Fundamental difficulty of inference for a ratio when the denominator could be positive or negative

13 0.094513975 519 andrew gelman stats-2011-01-16-Update on the generalized method of moments

14 0.092067115 723 andrew gelman stats-2011-05-21-Literary blurb translation guide

15 0.092062704 496 andrew gelman stats-2011-01-01-Tukey’s philosophy

16 0.08831843 1062 andrew gelman stats-2011-12-16-Mr. Pearson, meet Mr. Mandelbrot: Detecting Novel Associations in Large Data Sets

17 0.085600972 1682 andrew gelman stats-2013-01-19-R package for Bayes factors

18 0.084612116 2340 andrew gelman stats-2014-05-20-Thermodynamic Monte Carlo: Michael Betancourt’s new method for simulating from difficult distributions and evaluating normalizing constants

19 0.080897629 247 andrew gelman stats-2010-09-01-How does Bayes do it?

20 0.079310365 1560 andrew gelman stats-2012-11-03-Statistical methods that work in some settings but not others


similar blogs computed by the lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.082), (1, 0.064), (2, 0.001), (3, 0.001), (4, 0.03), (5, 0.008), (6, 0.01), (7, -0.037), (8, -0.013), (9, -0.024), (10, 0.002), (11, -0.027), (12, -0.024), (13, -0.019), (14, -0.01), (15, -0.021), (16, 0.008), (17, 0.023), (18, 0.006), (19, -0.039), (20, 0.027), (21, 0.002), (22, 0.022), (23, 0.05), (24, 0.048), (25, 0.016), (26, 0.005), (27, 0.054), (28, 0.101), (29, 0.018), (30, 0.042), (31, 0.081), (32, 0.066), (33, -0.011), (34, 0.011), (35, -0.025), (36, -0.024), (37, 0.026), (38, -0.025), (39, -0.01), (40, 0.003), (41, 0.028), (42, -0.001), (43, 0.055), (44, 0.023), (45, -0.056), (46, -0.028), (47, 0.004), (48, -0.015), (49, 0.046)]
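Again as a hypothetical reconstruction (the pipeline is not documented), LSI topic weights and similarities of this form are typically obtained by truncated SVD of a tf-idf matrix and cosine similarity in the reduced space. The tiny corpus and n_components=2 below are purely illustrative; the listing above appears to use 50 topics.

```python
# Hypothetical sketch of the LSI step, not the dataset's actual pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

blogs = [
    "Zhiqiang Tan writes: I have created an R package ...",
    "Malka Gorfine writes: We noticed that the important topic ...",
    "Mark Girolami points us to this paper and software ...",
]
X = TfidfVectorizer(stop_words="english").fit_transform(blogs)
topics = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)
print(topics[0])                     # cf. the (topicId, topicWeight) list above
print(cosine_similarity(topics)[0])  # cf. the simValue column below
```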

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98869818 1825 andrew gelman stats-2013-04-25-It’s binless! A program for computing normalizing functions

Introduction: Zhiqiang Tan writes: I have created an R package to implement the full likelihood method in Kong et al. (2003). The method can be seen as a binless extension of so-called Weighted Histogram Analysis Method (UWHAM) widely used in physics and chemistry. The method has also been introduced to the physics literature and called the Multivariate Bennet Acceptance Ratio (MBAR) method. But a key point of my implementation is to compute the free energy estimates by minimizing a convex function, instead of solving nonlinear equations by the self-consistency or the Newton-Raphson algorithm.

2 0.70265466 2314 andrew gelman stats-2014-05-01-Heller, Heller, and Gorfine on univariate and multivariate information measures

Introduction: Malka Gorfine writes: We noticed that the important topic of association measures and tests came up again in your blog, and we have a few comments in this regard. It is useful to distinguish between the univariate and multivariate methods. A consistent multivariate method can recognise dependence between two vectors of random variables, while a univariate method can only loop over pairs of components and check for dependency between them. There are very few consistent multivariate methods. To the best of our knowledge there are three practical methods: 1) HSIC by Gretton et al. (http://www.gatsby.ucl.ac.uk/~gretton/papers/GreBouSmoSch05.pdf) 2) dcov by Szekely et al. (http://projecteuclid.org/euclid.aoas/1267453933) 3) A method we introduced in Heller et al. (Biometrika, 2013, 503-510, http://biomet.oxfordjournals.org/content/early/2012/12/04/biomet.ass070.full.pdf+html, and an R package, HHG, is available as well http://cran.r-project.org/web/packages/HHG/index.html). A
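To illustrate the univariate-versus-multivariate distinction with the second of the three methods listed (distance covariance, Szekely et al.), here is a minimal numpy sketch, not the dcov or HHG packages: one statistic for two whole random vectors, contrasted with the univariate habit of looping over coordinate pairs. The toy data and sample size are arbitrary.

```python
# Minimal sketch of sample (squared, V-statistic) distance covariance.
import numpy as np
from scipy.spatial.distance import cdist

def dcov2(x, y):
    """x: (n, p), y: (n, q); returns the squared sample distance covariance."""
    def center(d):  # double-center a pairwise distance matrix
        return d - d.mean(0, keepdims=True) - d.mean(1, keepdims=True) + d.mean()
    return (center(cdist(x, x)) * center(cdist(y, y))).mean()

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
y = np.column_stack([(x ** 2).sum(axis=1), rng.normal(size=200)])

print(dcov2(x, y))   # multivariate: one statistic for the two vectors at once
print([[round(dcov2(x[:, [i]], y[:, [j]]), 3)   # univariate habit: all pairs
        for j in range(y.shape[1])] for i in range(x.shape[1])])
```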

3 0.68418598 1230 andrew gelman stats-2012-03-26-Further thoughts on nonparametric correlation measures

Introduction: Malka Gorfine, Ruth Heller, and Yair Heller write a comment on the paper of Reshef et al. that we discussed a few months ago. Just to remind you what’s going on here, here’s my quick summary from December: Reshef et al. propose a new nonlinear R-squared-like measure. Unlike R-squared, this new method depends on a tuning parameter that controls the level of discretization, in a “How long is the coast of Britain” sort of way. The dependence on scale is inevitable for such a general method. Just consider: if you sample 1000 points from the unit bivariate normal distribution, (x,y) ~ N(0,I), you’ll be able to fit them perfectly by a 999-degree polynomial fit to the data. So the scale of the fit matters. The clever idea of the paper is that, instead of going for an absolute measure (which, as we’ve seen, will be scale-dependent), they focus on the problem of summarizing the grid of pairwise dependences in a large set of variables. As they put it: “Imagine a data set with hundreds
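A scaled-down numerical check of the interpolation point above: ten points and a degree-9 polynomial stand in for 1000 points and degree 999, which is the same idea but numerically much nastier.

```python
# n points of pure noise can always be fit "perfectly" by a degree n-1 polynomial.
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=10), rng.normal(size=10)   # independent: no real signal
coefs = np.polyfit(x, y, deg=9)
print(np.max(np.abs(np.polyval(coefs, x) - y)))   # ~0: a perfect fit to noise
```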

4 0.63953209 1062 andrew gelman stats-2011-12-16-Mr. Pearson, meet Mr. Mandelbrot: Detecting Novel Associations in Large Data Sets

Introduction: Jeremy Fox asks what I think about this paper by David N. Reshef, Yakir Reshef, Hilary Finucane, Sharon Grossman, Gilean McVean, Peter Turnbaugh, Eric Lander, Michael Mitzenmacher, and Pardis Sabeti which proposes a new nonlinear R-squared-like measure. My quick answer is that it looks really cool! From my quick reading of the paper, it appears that the method reduces on average to the usual R-squared when fit to data of the form y = a + bx + error, and that it also has a similar interpretation when “a + bx” is replaced by other continuous functions. Unlike R-squared, the method of Reshef et al. depends on a tuning parameter that controls the level of discretization, in a “How long is the coast of Britain” sort of way. The dependence on scale is inevitable for such a general method. Just consider: if you sample 1000 points from the unit bivariate normal distribution, (x,y) ~ N(0,I), you’ll be able to fit them perfectly by a 999-degree polynomial fit to the data. So the sca

5 0.61746031 2311 andrew gelman stats-2014-04-29-Bayesian Uncertainty Quantification for Differential Equations!

Introduction: Mark Girolami points us to this paper and software (with Oksana Chkrebtii, David Campbell, and Ben Calderhead). They write: We develop a general methodology for the probabilistic integration of differential equations via model based updating of a joint prior measure on the space of functions and their temporal and spatial derivatives. This results in a posterior measure over functions reflecting how well they satisfy the system of differential equations and corresponding initial and boundary values. We show how this posterior measure can be naturally incorporated within the Kennedy and O’Hagan framework for uncertainty quantification and provides a fully Bayesian approach to model calibration. . . . A broad variety of examples are provided to illustrate the potential of this framework for characterising discretization uncertainty, including initial value, delay, and boundary value differential equations, as well as partial differential equations. We also demonstrate our methodolo

6 0.61351335 1706 andrew gelman stats-2013-02-04-Too many MC’s not enough MIC’s, or What principles should govern attempts to summarize bivariate associations in large multivariate datasets?

7 0.61145198 535 andrew gelman stats-2011-01-24-Bleg: Automatic Differentiation for Log Prob Gradients?

8 0.61135978 2247 andrew gelman stats-2014-03-14-The maximal information coefficient

9 0.58863235 519 andrew gelman stats-2011-01-16-Update on the generalized method of moments

10 0.57370305 778 andrew gelman stats-2011-06-24-New ideas on DIC from Martyn Plummer and Sumio Watanabe

11 0.57109743 2340 andrew gelman stats-2014-05-20-Thermodynamic Monte Carlo: Michael Betancourt’s new method for simulating from difficult distributions and evaluating normalizing constants

12 0.56517559 931 andrew gelman stats-2011-09-29-Hamiltonian Monte Carlo stories

13 0.56114775 1309 andrew gelman stats-2012-05-09-The first version of my “inference from iterative simulation using parallel sequences” paper!

14 0.55212843 738 andrew gelman stats-2011-05-30-Works well versus well understood

15 0.54013413 2324 andrew gelman stats-2014-05-07-Once more on nonparametric measures of mutual information

16 0.53005236 2231 andrew gelman stats-2014-03-03-Running into a Stan Reference by Accident

17 0.52755934 1443 andrew gelman stats-2012-08-04-Bayesian Learning via Stochastic Gradient Langevin Dynamics

18 0.52665913 650 andrew gelman stats-2011-04-05-Monitor the efficiency of your Markov chain sampler using expected squared jumped distance!

19 0.52362943 818 andrew gelman stats-2011-07-23-Parallel JAGS RNGs

20 0.51639056 246 andrew gelman stats-2010-08-31-Somewhat Bayesian multilevel modeling


similar blogs computed by the lda model

lda for this blog:

topicId topicWeight

[(6, 0.057), (15, 0.026), (16, 0.145), (24, 0.102), (43, 0.029), (79, 0.185), (82, 0.022), (86, 0.023), (95, 0.025), (99, 0.252)]
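And, again hypothetically, LDA topic mixtures like the one above are typically read off a topic model fit to word counts; only topics with non-negligible weight get listed. The tiny corpus and n_components=10 below are illustrative (the listing above appears to use on the order of 100 topics).

```python
# Hypothetical sketch of the LDA step, not the dataset's actual pipeline.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

blogs = [
    "Zhiqiang Tan writes: I have created an R package ...",
    "Aki writes: Here's my version of the birthday frequency graph ...",
    "Interesting article by Jonah Berger and Gael Le Mens ...",
]
counts = CountVectorizer(stop_words="english").fit_transform(blogs)
lda = LatentDirichletAllocation(n_components=10, random_state=0)
theta = lda.fit_transform(counts)    # rows = blogs, cols = topic weights
print([(k, round(w, 3)) for k, w in enumerate(theta[0]) if w > 0.05])
```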

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94804418 1825 andrew gelman stats-2013-04-25-It’s binless! A program for computing normalizing functions

Introduction: Zhiqiang Tan writes: I have created an R package to implement the full likelihood method in Kong et al. (2003). The method can be seen as a binless extension (UWHAM) of the so-called Weighted Histogram Analysis Method (WHAM) widely used in physics and chemistry. The method has also been introduced to the physics literature and called the Multistate Bennett Acceptance Ratio (MBAR) method. But a key point of my implementation is to compute the free energy estimates by minimizing a convex function, instead of solving nonlinear equations by the self-consistency or the Newton-Raphson algorithm.

2 0.94027257 1379 andrew gelman stats-2012-06-14-Cool-ass signal processing using Gaussian processes (birthdays again)

Introduction: Aki writes: Here’s my version of the birthday frequency graph. I used a Gaussian process with two slowly varying components and a periodic component with decay, so that the periodic form can change in time. I used Student’s t-distribution as the observation model to allow exceptional dates to be outliers. I guess that the periodic component due to the week effect is still in the data because there are only twenty years of data. Naturally it would be better to model the whole time series, but it was easier to just use the csv by Mulligan. All I can say is . . . wow. Bayes wins again. Maybe Aki can supply the R or Matlab code? P.S. And let’s not forget how great the simple and clear time series plots are, compared to various fancy visualizations that people might try. P.P.S. More here.
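Here is a minimal sketch of the "periodic component with decay" idea only, not Aki's actual model (which also has slowly varying components and a Student-t observation model): multiplying a periodic kernel by a squared-exponential kernel lets the periodic pattern drift over time. The period and length scales below are illustrative.

```python
# Locally periodic (decaying-periodic) covariance function, sketch only.
import numpy as np

def locally_periodic_kernel(t1, t2, period=7.0, ell_per=1.0, ell_decay=30.0, var=1.0):
    d = t1[:, None] - t2[None, :]
    periodic = np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / ell_per ** 2)
    decay = np.exp(-0.5 * d ** 2 / ell_decay ** 2)     # lets the pattern change
    return var * periodic * decay

t = np.arange(0.0, 60.0)        # days
K = locally_periodic_kernel(t, t)
print(K[0, 7], K[0, 35])        # same weekday, but correlation decays with lag
```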

3 0.93741846 845 andrew gelman stats-2011-08-08-How adoption speed affects the abandonment of cultural tastes

Introduction: Interesting article by Jonah Berger and Gael Le Mens: Products, styles, and social movements often catch on and become popular, but little is known about why such identity-relevant cultural tastes and practices die out. We demonstrate that the velocity of adoption may affect abandonment: Analysis of over 100 years of data on first-name adoption in both France and the United States illustrates that cultural tastes that have been adopted quickly die faster (i.e., are less likely to persist). Mirroring this aggregate pattern, at the individual level, expecting parents are more hesitant to adopt names that recently experienced sharper increases in adoption. Further analyses indicate that these effects are driven by concerns about symbolic value: Fads are perceived negatively, so people avoid identity-relevant items with sharply increasing popularity because they believe that they will be short lived. Ancillary analyses also indicate that, in contrast to conventional wisdom, identity-r

4 0.92724442 1126 andrew gelman stats-2012-01-18-Bob on Stan

Introduction: Thurs 19 Jan 7pm at the NYC Machine Learning meetup. Stan’s entirely publicly funded and open-source and it has no secrets. Ask us about it and we’ll tell you everything you might want to know. P.S. And here’s the talk.

5 0.91395915 1515 andrew gelman stats-2012-09-29-Jost Haidt

Introduction: Research psychologist John Jost reviews the recent book, “The Righteous Mind,” by research psychologist Jonathan Haidt. Some of my thoughts on Haidt’s book are here. And here’s some of Jost’s review: Haidt’s book is creative, interesting, and provocative. . . . The book shines a new light on moral psychology and presents a bold, confrontational message. From a scientific perspective, however, I worry that his theory raises more questions than it answers. Why do some individuals feel that it is morally good (or necessary) to obey authority, favor the ingroup, and maintain purity, whereas others are skeptical? (Perhaps parenting style is relevant after all.) Why do some people think that it is morally acceptable to judge or even mistreat others such as gay or lesbian couples or, only a generation ago, interracial couples because they dislike or feel disgusted by them, whereas others do not? Why does the present generation “care about violence toward many more classes of victims

6 0.8919071 469 andrew gelman stats-2010-12-16-2500 people living in a park in Chicago?

7 0.88730943 863 andrew gelman stats-2011-08-21-Bad graph

8 0.88404155 2139 andrew gelman stats-2013-12-19-Happy birthday

9 0.88216817 1538 andrew gelman stats-2012-10-17-Rust

10 0.87966269 1172 andrew gelman stats-2012-02-17-Rare name analysis and wealth convergence

11 0.87963879 939 andrew gelman stats-2011-10-03-DBQQ rounding for labeling charts and communicating tolerances

12 0.87587547 1884 andrew gelman stats-2013-06-05-A story of fake-data checking being used to shoot down a flawed analysis at the Farm Credit Agency

13 0.85633409 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions

14 0.85382581 1044 andrew gelman stats-2011-12-06-The K Foundation burns Cosma’s turkey

15 0.8467083 159 andrew gelman stats-2010-07-23-Popular governor, small state

16 0.84156603 1924 andrew gelman stats-2013-07-03-Kuhn, 1/f noise, and the fractal nature of scientific revolutions

17 0.84066939 636 andrew gelman stats-2011-03-29-The Conservative States of America

18 0.83898187 1712 andrew gelman stats-2013-02-07-Philosophy and the practice of Bayesian statistics (with all the discussions!)

19 0.83653826 960 andrew gelman stats-2011-10-15-The bias-variance tradeoff

20 0.83480245 411 andrew gelman stats-2010-11-13-Ethical concerns in medical trials