Introduction: Andrew and I have been discussing how we’re going to define functions in Stan for defining systems of differential equations; see our evolving ODE design doc; comments welcome, of course. About Scope: I mentioned to Andrew I would prefer pure lexical, static scoping, as found in languages like C++ and Java. If you’re not familiar with the alternatives, there’s a nice overview in the Wikipedia article on scope. Let me call out a few passages that will help set the context. A fundamental distinction in scoping is what “context” means – whether name resolution depends on the location in the source code (lexical scope, static scope, which depends on the lexical context) or depends on the program state when the name is encountered (dynamic scope, which depends on the execution context or calling context). Lexical resolution can be determined at compile time, and is also known as early binding, while dynamic resolution can in general only be determined at run time, and thus
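The lexical-versus-dynamic distinction quoted above can be checked directly in R. The following is a minimal sketch, not the post's exact listing: the one-line definition of f is an assumption chosen to be consistent with the outputs quoted in the post (f(3) gives 30 when a is 10), and the result names first, via_g, and after_redefine are hypothetical.

```r
# Assumed definition: f multiplies its argument by the free variable `a`.
# `a` is not a parameter or local of f, so R looks it up in the environment
# where f was defined (here, the global environment) each time f is called.
a <- 10
f <- function(x) a * x

# g creates its own local `a` and then calls f.
g <- function(y) {
  a <- 100
  f(y)
}

first <- f(3)           # uses the global a = 10
via_g <- g(3)           # f still sees the global a, not the a local to g

a <- 20
after_redefine <- f(3)  # the global a is read at call time

print(c(first, via_g, after_redefine))
```

Under dynamic scoping the call through g would see a = 100. Under R's lexical rule the location of a is fixed by where f was defined, while its value is read when f runs.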
sentIndex sentText sentNum sentScore
1 Andrew and I have been discussing how we’re going to define functions in Stan for defining systems of differential equations; see our evolving ODE design doc; comments welcome, of course. [sent-1, score-0.351]
2 If you’re not familiar with the alternatives, there’s a nice overview in the Wikipedia article on scope. [sent-3, score-0.453]
3 Let me call out a few passages that will help set the context. [sent-4, score-0.196]
4 Lexical resolution can be determined at compile time, and is also known as early binding, while dynamic resolution can in general only be determined at run time, and thus is known as late binding. [sent-6, score-0.831]
5 …scoping rules are crucial in modular programming, so a change in one part of the program does not break an unrelated part. [sent-7, score-0.178]
6 R, on the Other Hand: R’s scope rules can be quite confusing. [sent-8, score-0.514]
7 First, it lets you define a function with an undefined variable. [sent-9, score-0.499]
8 > a <- 10 > f(3) [1] 30 So clearly the value of a is getting set dynamically at run time. [sent-12, score-0.3]
9 > g <- function(y) { a <- 100; f(y); } > g(3) [1] 30 Now even if a had not been defined in the global scope, the call to f(y) in the definition of g would not pick up the definition in the body of g. [sent-14, score-0.576]
10 Given the dynamic nature of the definition, I expected the answer to be 300, not 30. [sent-15, score-0.234]
11 It seems that what’s happening is that the location of the variable a is defined when f is first defined, not when f is used. [sent-16, score-0.367]
12 But the value is whatever is defined in the global environment at the time the function is called. [sent-17, score-0.713]
13 For example, redefining a produces a new value for f(3): > a <- 20 > f(3) [1] 60 Stupid R Trick (footnote 1): Now for the stupid R trick. [sent-18, score-0.287]
14 Update from comments: Peter Meilstrup provided a link to a comment on Christian Robert’s blog by Ross Ihaka, which turns out to be where I first saw this idea: “Simply Start Over and Build Something Better”. Suppose I define a new function h as follows. [sent-19, score-0.441]
15 5)) { b <- 1000 }; return(b * x); } and then call it a few times > h(2) [1] 40 > h(2) [1] 2000 Whether the value of b is the local variable set to 1000 or the global value set to 20 depends on the outcome of the coin flip determined by calling rbinom(1,1,0. [sent-21, score-1.08]
16 Presumably this is the behavior intended by the designers of R's scoping mechanism. [sent-23, score-0.321]
17 If you want to read more about scoping in R and S, John Fox has a document on CRAN, Frames, Environments, and Scope in R and S-PLUS. [sent-25, score-0.321]
18 > ff <- function(x) { g <- function(y) { return(x * y) }; return(g) } > ff(7)(9) [1] 63 What's going on is that R resolves a variable's scope at the point where the function is defined, and the inner function g is not defined until ff is called. [sent-28, score-2.322]
19 The "stupid R trick" is simply based on making the variable's scope non-deterministic. [sent-29, score-0.453]
20 Footnote 1: My apologies to David Letterman and his stupid pet tricks; who knew they'd take over the internet? [sent-30, score-0.221]
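The random-scope trick and the closure example from the sentences above can be sketched as follows. The extracted listing truncates the definition of h, so the version here is an assumed reconstruction: the 0.5 probability and the overall shape are inferred from the surviving fragments and from the quoted outputs (40 and 2000 with a global b of 20). The name draws is hypothetical.

```r
b <- 20

# Assumed reconstruction of h. Whether `b` in the last line refers to a
# local or the global variable depends on a coin flip made at run time.
h <- function(x) {
  if (rbinom(1, 1, 0.5)) {
    b <- 1000  # creates a local b only when the coin comes up 1
  }
  b * x        # local b if it exists, otherwise the global b
}

# Repeated calls return one of two values, depending on the coin flip.
draws <- replicate(200, h(2))
print(table(draws))

# Closure example from the post: the inner function g is created each time
# ff is called, so it captures that call's value of x.
ff <- function(x) {
  g <- function(y) x * y
  g
}
print(ff(7)(9))
```

Because the assignment inside the if creates the local b only on some calls, the same name resolves to different variables from one call to the next. A language with purely static scoping rules this out.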
wordName wordTfidf (topN-words)
[('scope', 0.453), ('scoping', 0.321), ('function', 0.31), ('lexical', 0.282), ('defined', 0.182), ('ff', 0.182), ('stupid', 0.17), ('depends', 0.165), ('rbinom', 0.141), ('resolution', 0.133), ('define', 0.131), ('closures', 0.128), ('dynamic', 0.125), ('determined', 0.119), ('value', 0.117), ('definition', 0.109), ('variable', 0.104), ('global', 0.104), ('context', 0.102), ('static', 0.099), ('return', 0.097), ('calling', 0.085), ('location', 0.081), ('trick', 0.078), ('set', 0.073), ('call', 0.072), ('ihaka', 0.064), ('binding', 0.064), ('ode', 0.064), ('rules', 0.061), ('modular', 0.061), ('dynamically', 0.061), ('undefined', 0.058), ('cran', 0.058), ('inner', 0.058), ('evolving', 0.058), ('frames', 0.056), ('program', 0.056), ('andrew', 0.055), ('known', 0.052), ('name', 0.052), ('coin', 0.051), ('passages', 0.051), ('apologies', 0.051), ('execution', 0.05), ('ross', 0.05), ('going', 0.049), ('run', 0.049), ('compile', 0.049), ('doc', 0.049)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 2190 andrew gelman stats-2014-01-29-Stupid R Tricks: Random Scope
2 0.1504854 1807 andrew gelman stats-2013-04-17-Data problems, coding errors…what can be done?
Introduction: This post is by Phil A recent post on this blog discusses a prominent case of an Excel error leading to substantially wrong results from a statistical analysis. Excel is notorious for this because it is easy to add a row or column of data (or intermediate results) but forget to update equations so that they correctly use the new data. That particular error is less common in a language like R because R programmers usually refer to data by variable name (or by applying functions to a named variable), so the same code works even if you add or remove data. Still, there is plenty of opportunity for errors no matter what language one uses. Andrew ran into problems fairly recently, and also blogged about another instance. I’ve never had to retract a paper, but that’s partly because I haven’t published a whole lot of papers. Certainly I have found plenty of substantial errors pretty late in some of my data analyses, and I obviously don’t have sufficient mechanisms in place to be sure
3 0.14176913 272 andrew gelman stats-2010-09-13-Ross Ihaka to R: Drop Dead
Introduction: Christian Robert posts these thoughts: I [Ross Ihaka] have been worried for some time that R isn’t going to provide the base that we’re going to need for statistical computation in the future. (It may well be that the future is already upon us.) There are certainly efficiency problems (speed and memory use), but there are more fundamental issues too. Some of these were inherited from S and some are peculiar to R. One of the worst problems is scoping. Consider the following little gem. f = function() { if (runif(1) > .5) x = 10; x } The x being returned by this function is randomly local or global. There are other examples where variables alternate between local and non-local throughout the body of a function. No sensible language would allow this. It’s ugly and it makes optimisation really difficult. This isn’t the only problem, even weirder things happen because of interactions between scoping and lazy evaluation. In light of this, I [Ihaka] have come to the c
4 0.10429926 418 andrew gelman stats-2010-11-17-ff
Introduction: Can somebody please fix the pdf reader so that it can correctly render “ff” when I cut and paste? This comes up when I’m copying sections of articles on to the blog. Thank you. P.S. I googled “ff pdf” but no help there. P.P.S. It’s a problem with “fi” also. P.P.P.S. Yes, I know about ligatures. But, if you already knew about ligatures, and I already know about ligatures, then presumably the pdf people already know about ligatures too. So why can’t their clever program, which can already find individual f’s, also find the ff’s and separate them? I assume it’s not so simple but I don’t quite understand why not.
5 0.095502742 557 andrew gelman stats-2011-02-05-Call for book proposals
Introduction: Rob Calver writes: Large and complex datasets are becoming prevalent in the social and behavioral sciences and statistical methods are crucial for the analysis and interpretation of such data. The Chapman & Hall/CRC Statistics in the Social and Behavioral Sciences Series aims to capture new developments in statistical methodology with particular relevance to applications in the social and behavioral sciences. It seeks to promote appropriate use of statistical, econometric and psychometric methods in these applied sciences by publishing a broad range of monographs, textbooks and handbooks. The scope of the series is wide, including applications of statistical methodology in sociology, psychology, economics, education, marketing research, political science, criminology, public policy, demography, survey methodology and official statistics. The titles included in the series are designed to appeal to applied statisticians, as well as students, researchers and practitioners from the
6 0.092561752 798 andrew gelman stats-2011-07-12-Sometimes a graph really is just ugly
7 0.090696618 1753 andrew gelman stats-2013-03-06-Stan 1.2.0 and RStan 1.2.0
8 0.090527222 500 andrew gelman stats-2011-01-03-Bribing statistics
9 0.081321456 2110 andrew gelman stats-2013-11-22-A Bayesian model for an increasing function, in Stan!
10 0.079761535 535 andrew gelman stats-2011-01-24-Bleg: Automatic Differentiation for Log Prob Gradients?
11 0.07474304 2345 andrew gelman stats-2014-05-24-An interesting mosaic of a data programming course
12 0.073363647 1286 andrew gelman stats-2012-04-28-Agreement Groups in US Senate and Dynamic Clustering
13 0.073156007 1955 andrew gelman stats-2013-07-25-Bayes-respecting experimental design and other things
14 0.073137507 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes
15 0.070853874 1070 andrew gelman stats-2011-12-19-The scope for snooping
16 0.069360062 1799 andrew gelman stats-2013-04-12-Stan 1.3.0 and RStan 1.3.0 Ready for Action
17 0.067355268 945 andrew gelman stats-2011-10-06-W’man < W’pedia, again
19 0.064955734 2208 andrew gelman stats-2014-02-12-How to think about “identifiability” in Bayesian inference?
20 0.061675407 46 andrew gelman stats-2010-05-21-Careers, one-hit wonders, and an offer of a free book
topicId topicWeight
[(0, 0.113), (1, 0.007), (2, -0.003), (3, 0.027), (4, 0.046), (5, -0.003), (6, 0.027), (7, -0.043), (8, 0.001), (9, -0.024), (10, -0.039), (11, -0.01), (12, -0.009), (13, -0.028), (14, 0.006), (15, 0.022), (16, 0.02), (17, -0.026), (18, -0.01), (19, -0.012), (20, 0.02), (21, 0.013), (22, 0.009), (23, -0.027), (24, 0.012), (25, 0.007), (26, 0.053), (27, 0.025), (28, 0.025), (29, -0.014), (30, 0.01), (31, 0.012), (32, 0.01), (33, 0.011), (34, -0.01), (35, -0.043), (36, -0.0), (37, 0.03), (38, 0.018), (39, 0.021), (40, -0.012), (41, -0.047), (42, -0.022), (43, -0.013), (44, -0.015), (45, 0.031), (46, 0.023), (47, 0.023), (48, 0.013), (49, 0.04)]
simIndex simValue blogId blogTitle
same-blog 1 0.96715987 2190 andrew gelman stats-2014-01-29-Stupid R Tricks: Random Scope
2 0.70725352 1716 andrew gelman stats-2013-02-09-iPython Notebook
Introduction: Burak Bayramli writes: I wanted to inform you on iPython Notebook technology – allowing markup, Python code to reside in one document. Someone ported one of your examples from ARM. iPynb file is actually a live document, can be downloaded and rerun locally, hence change of code on document means change of images, results. Graphs (as well as text output) which are generated by the code, are placed inside the document automatically. No more referencing image files separately. For now running notebooks locally requires a notebook server, but that part can live “on the cloud” as part of an educational software. Viewers, such as nbviewer.ipython.org, do not even need that much, since all recent results of a notebook are embedded in the notebook itself. A lot of people are excited about this; also out of nowhere, Alfred P. Sloan Foundation dropped a $1.15 million grant on the developers of ipython which provided some extra energy on the project. Cool. We’ll have to do that ex
3 0.69712794 2089 andrew gelman stats-2013-11-04-Shlemiel the Software Developer and Unknown Unknowns
Introduction: The Stan meeting today reminded me of Joel Spolsky’s recasting of the Yiddish joke about Shlemiel the Painter. Joel retold it on his blog, Joel on Software, in the post Back to Basics: Shlemiel gets a job as a street painter, painting the dotted lines down the middle of the road. On the first day he takes a can of paint out to the road and finishes 300 yards of the road. “That’s pretty good!” says his boss, “you’re a fast worker!” and pays him a kopeck. The next day Shlemiel only gets 150 yards done. “Well, that’s not nearly as good as yesterday, but you’re still a fast worker. 150 yards is respectable,” and pays him a kopeck. The next day Shlemiel paints 30 yards of the road. “Only 30!” shouts his boss. “That’s unacceptable! On the first day you did ten times that much work! What’s going on?” “I can’t help it,” says Shlemiel. “Every day I get farther and farther away from the paint can!” Joel used it as an example of the kind of string processing naive programmers ar
4 0.68957281 1807 andrew gelman stats-2013-04-17-Data problems, coding errors…what can be done?
Introduction: This post is by Phil A recent post on this blog discusses a prominent case of an Excel error leading to substantially wrong results from a statistical analysis. Excel is notorious for this because it is easy to add a row or column of data (or intermediate results) but forget to update equations so that they correctly use the new data. That particular error is less common in a language like R because R programmers usually refer to data by variable name (or by applying functions to a named variable), so the same code works even if you add or remove data. Still, there is plenty of opportunity for errors no matter what language one uses. Andrew ran into problems fairly recently, and also blogged about another instance. I’ve never had to retract a paper, but that’s partly because I haven’t published a whole lot of papers. Certainly I have found plenty of substantial errors pretty late in some of my data analyses, and I obviously don’t have sufficient mechanisms in place to be sure
5 0.68077141 418 andrew gelman stats-2010-11-17-ff
Introduction: Can somebody please fix the pdf reader so that it can correctly render “ff” when I cut and paste? This comes up when I’m copying sections of articles on to the blog. Thank you. P.S. I googled “ff pdf” but no help there. P.P.S. It’s a problem with “fi” also. P.P.P.S. Yes, I know about ligatures. But, if you already knew about ligatures, and I already know about ligatures, then presumably the pdf people already know about ligatures too. So why can’t their clever program, which can already find individual f’s, also find the ff’s and separate them? I assume it’s not so simple but I don’t quite understand why not.
6 0.67555195 266 andrew gelman stats-2010-09-09-The future of R
7 0.67384702 535 andrew gelman stats-2011-01-24-Bleg: Automatic Differentiation for Log Prob Gradients?
8 0.66800302 818 andrew gelman stats-2011-07-23-Parallel JAGS RNGs
9 0.66609603 1753 andrew gelman stats-2013-03-06-Stan 1.2.0 and RStan 1.2.0
10 0.65741801 1907 andrew gelman stats-2013-06-20-Amazing retro gnu graphics!
11 0.656802 272 andrew gelman stats-2010-09-13-Ross Ihaka to R: Drop Dead
12 0.65476787 1799 andrew gelman stats-2013-04-12-Stan 1.3.0 and RStan 1.3.0 Ready for Action
14 0.63992828 923 andrew gelman stats-2011-09-24-What is the normal range of values in a medical test?
15 0.63102603 1462 andrew gelman stats-2012-08-18-Standardizing regression inputs
16 0.62688875 1472 andrew gelman stats-2012-08-28-Migrating from dot to underscore
17 0.62503141 527 andrew gelman stats-2011-01-20-Cars vs. trucks
18 0.62439924 482 andrew gelman stats-2010-12-23-Capitalism as a form of voluntarism
19 0.61702669 2311 andrew gelman stats-2014-04-29-Bayesian Uncertainty Quantification for Differential Equations!
20 0.61166072 597 andrew gelman stats-2011-03-02-RStudio – new cross-platform IDE for R
topicId topicWeight
[(0, 0.067), (1, 0.115), (16, 0.074), (24, 0.142), (35, 0.023), (40, 0.016), (41, 0.016), (69, 0.011), (73, 0.041), (78, 0.016), (85, 0.014), (86, 0.042), (89, 0.014), (94, 0.05), (99, 0.23)]
simIndex simValue blogId blogTitle
same-blog 1 0.94725633 2190 andrew gelman stats-2014-01-29-Stupid R Tricks: Random Scope
2 0.92531538 1449 andrew gelman stats-2012-08-08-Gregor Mendel’s suspicious data
Introduction: Howard Wainer points me to a thoughtful discussion by Moti Nissani on “Psychological, Historical, and Ethical Reflections on the Mendelian Paradox.” The paradox, as Nissani defines it, is that Mendel’s data seem in many cases too good to be true, yet Mendel had a reputation for probity and it seems doubtful that he had a Mark-Hauser-style attitude toward reporting scientific data. Nissani writes: Taken together, the situation seems paradoxical. On the one hand, we have evidence that “the data of most, if not all, of the experiments have been falsified so as to agree closely with Mendel’s expectations.” We also have good reasons to believe that Mendel encountered linkage but failed to report it and that he may have taken the somewhat unusual step of having his scientific records destroyed shortly after his death. On the other hand, everything else we know about him, in addition to his undisputed genius, suggests a man of unimpeachable integrity, fine observational powers, and a pa
Introduction: We were having so much fun on this thread that I couldn’t resist linking to this news item by Adrian Chen. The good news is that Scott Adams (creator of the Dilbert comic strip) “has a certified genius IQ” and that he “can open jars with [his] bare hands.” He is also “able to lift heavy objects.” Cool! In all seriousness, I knew nothing about this aspect of Adams when I wrote the earlier blog. I was just surprised (and remain surprised) that he was so impressed with Charlie Sheen for being good-looking and being able to remember his lines. At the time I thought it was just a matter of Adams being overly influenced by his direct experience, along with some satisfaction in separating himself from the general mass of Sheen-haters out there. But now I wonder if something more is going on, that maybe he feels that he and Sheen are on the same side in a culture war. In any case, the ultimate topic of interest here is not Sheen or Adams but rather more general questions of what
4 0.90512586 973 andrew gelman stats-2011-10-26-Antman again courts controversy
Introduction: Commenter Zbicyclist links to a fun article by Howard French on biologist E. O. Wilson: Wilson announced that his new book may be his last. It is not limited to the discussion of evolutionary biology, but ranges provocatively through the humanities, as well. . . . Generation after generation of students have suffered trying to “puzzle out” what great thinkers like Socrates, Plato, and Descartes had to say on the great questions of man’s nature, Wilson said, but this was of little use, because philosophy has been based on “failed models of the brain.” This reminds me of my recent remarks on the use of crude folk-psychology models as microfoundations for social sciences. The article also discusses Wilson’s recent crusade against selfish-gene-style simplifications of human and animal nature. I’m with Wilson 100% on this one. “Two brothers or eight cousins” is a cute line but it doesn’t seem to come close to describing how species or societies work, and it’s always seemed a
5 0.89779627 525 andrew gelman stats-2011-01-19-Thiel update
Introduction: A year or so ago I discussed the reasoning of zillionaire financier Peter Thiel, who seems to believe his own hype and, worse, seems to be able to convince reporters of his infallibility as well. Apparently he “possesses a preternatural ability to spot patterns that others miss.” More recently, Felix Salmon commented on Thiel’s financial misadventures: Peter Thiel’s hedge fund, Clarium Capital, ain’t doing so well. Its assets under management are down 90% from their peak, and total returns from the high point are -65%. Thiel is smart, successful, rich, well-connected, and on top of all that his calls have actually been right . . . None of that, clearly, was enough for Clarium to make money on its trades: the fund was undone by volatility and weakness in risk management. There are a few lessons to learn here. Firstly, just because someone is a Silicon Valley gazillionaire, or any kind of successful entrepreneur for that matter, doesn’t mean they should be trusted with oth
7 0.89457065 1019 andrew gelman stats-2011-11-19-Validation of Software for Bayesian Models Using Posterior Quantiles
8 0.89240575 697 andrew gelman stats-2011-05-05-A statistician rereads Bill James
10 0.88458085 4 andrew gelman stats-2010-04-26-Prolefeed
11 0.88458025 1154 andrew gelman stats-2012-02-04-“Turn a Boring Bar Graph into a 3D Masterpiece”
12 0.8810122 272 andrew gelman stats-2010-09-13-Ross Ihaka to R: Drop Dead
13 0.87981623 642 andrew gelman stats-2011-04-02-Bill James and the base-rate fallacy
14 0.87613904 582 andrew gelman stats-2011-02-20-Statisticians vs. everybody else
15 0.87540948 1760 andrew gelman stats-2013-03-12-Misunderstanding the p-value
16 0.8735503 872 andrew gelman stats-2011-08-26-Blog on applied probability modeling
18 0.87279832 906 andrew gelman stats-2011-09-14-Another day, another stats postdoc
19 0.87181544 807 andrew gelman stats-2011-07-17-Macro causality
20 0.87168729 1211 andrew gelman stats-2012-03-13-A personal bit of spam, just for me!