andrew_gelman_stats-2012-1461 knowledge-graph by maker-knowledge-mining

1461 andrew gelman stats-2012-08-17-Graphs showing uncertainty using lighter intensities for the lines that go further from the center, to de-emphasize the edges


meta information for this blog

Source: html

Introduction: Following up on our recent discussion of visually-weighted displays of uncertainty in regression curves, Lucas Leeman sent in the following two graphs: First, the basic spaghetti-style plot showing inferential uncertainty in the E(y|x) curve: Then, a version using even lighter intensities for the lines that go further from the center, to further de-emphasize the edges: P.S. More (including code!) here.
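
The technique itself is easy to sketch. Below is a minimal R illustration of the general idea, not Leeman's posted code (that is behind the "here" link): take bootstrap draws of the fitted E(y|x) curve and draw each one in a lighter gray the further it lies from the pointwise median curve, so the eye is pulled toward the center rather than the edges. The fake data, the loess smoother, and the particular gray ramp are arbitrary choices for illustration.

# Minimal sketch (not Leeman's code): bootstrap curves, drawn lighter
# the further they stray from the central curve
set.seed(123)
n <- 100
x <- runif(n, 0, 10)
y <- sin(x) + rnorm(n, 0, 0.5)
x.grid <- seq(quantile(x, 0.05), quantile(x, 0.95), length.out = 200)

n.draws <- 200
curves <- matrix(NA, n.draws, length(x.grid))
for (s in 1:n.draws) {
  idx <- sample(n, replace = TRUE)
  fit <- loess(y ~ x, data = data.frame(x = x[idx], y = y[idx]))
  curves[s, ] <- predict(fit, newdata = data.frame(x = x.grid))
}

# Pointwise median curve and each draw's average distance from it
center <- apply(curves, 2, median, na.rm = TRUE)
dist <- rowMeans(abs(sweep(curves, 2, center)), na.rm = TRUE)

# Map distance to gray level: central draws dark, extreme draws light
shade <- (dist - min(dist)) / (max(dist) - min(dist))
cols <- gray(0.2 + 0.75 * shade)

plot(x, y, pch = 19, col = "gray85", xlab = "x", ylab = "y")
for (s in order(dist, decreasing = TRUE)) lines(x.grid, curves[s, ], col = cols[s])
lines(x.grid, center, lwd = 2)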


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('intensities', 0.319), ('leeman', 0.319), ('uncertainty', 0.297), ('lucas', 0.286), ('lighter', 0.278), ('edges', 0.261), ('curves', 0.236), ('inferential', 0.222), ('displays', 0.22), ('curve', 0.204), ('following', 0.178), ('plot', 0.162), ('center', 0.161), ('showing', 0.154), ('lines', 0.154), ('version', 0.149), ('code', 0.148), ('basic', 0.137), ('sent', 0.132), ('graphs', 0.125), ('regression', 0.108), ('including', 0.107), ('recent', 0.087), ('discussion', 0.083), ('go', 0.076), ('using', 0.071), ('first', 0.066), ('two', 0.064), ('even', 0.051)]
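
For reference, here is a minimal base-R sketch of how (word, weight) pairs like those above could be produced. The mining tool's exact tf-idf variant and corpus are not given, so the toy documents, the log-idf formula, and the cosine comparison at the end are illustrative assumptions rather than a reproduction of its pipeline.

# Illustrative tf-idf sketch in base R (the tool's actual weighting is unspecified)
docs <- list(
  c("uncertainty", "regression", "curves", "lighter", "intensities", "edges"),
  c("regression", "displays", "uncertainty", "confidence", "intervals"),
  c("sample", "size", "model", "complexity", "curves")
)
vocab <- sort(unique(unlist(docs)))

# Term frequencies per document, inverse document frequency over the corpus
tf    <- t(sapply(docs, function(d) table(factor(d, levels = vocab)) / length(d)))
idf   <- log(length(docs) / colSums(tf > 0))
tfidf <- sweep(tf, 2, idf, `*`)

# Top-weighted words for the first document, analogous to the list above
sort(tfidf[1, tfidf[1, ] > 0], decreasing = TRUE)

# The simValue scores below are presumably cosine similarities between such vectors
cosine <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))
cosine(tfidf[1, ], tfidf[2, ])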

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1461 andrew gelman stats-2012-08-17-Graphs showing uncertainty using lighter intensities for the lines that go further from the center, to de-emphasize the edges

Introduction: Following up on our recent discussion of visually-weighted displays of uncertainty in regression curves, Lucas Leeman sent in the following two graphs: First, the basic spaghetti-style plot showing inferential uncertainty in the E(y|x) curve: Then, a version using even lighter intensities for the lines that go further from the center, to further de-emphasize the edges: P.S. More (including code!) here.

2 0.35668623 1452 andrew gelman stats-2012-08-09-Visually weighting regression displays

Introduction: Solomon Hsiang writes: One of my colleagues suggested that I send you this very short note that I wrote on a new approach for displaying regression result uncertainty (attached). It’s very simple, and I’ve found it effective in one of my papers where I actually use it, but if you have a chance to glance over it and have any ideas for how to sell the approach or make it better, I’d be very interested to hear them. (Also, if you’ve seen that someone else has already made this point, I’d appreciate knowing that too.) Here’s an example: Hsiang writes: In Panel A, our eyes are drawn outward, away from the center of the display and toward the swirling confidence intervals at the edges. But in Panel B, our eyes are attracted to the middle of the regression line, where the high contrast between the line and the background is sharp and visually heavy. By using visual-weighting, we focus our readers’ attention on those portions of the regression that contain the most inform

3 0.19167976 1543 andrew gelman stats-2012-10-21-Model complexity as a function of sample size

Introduction: As we get more data, we can fit more model. But at some point we become so overwhelmed by data that, for computational reasons, we can barely do anything at all. Thus, the curve above could be thought of as the product of two curves: a steadily increasing curve showing the statistical ability to fit more complex models with more data, and a steadily decreasing curve showing the computational feasibility of doing so.

4 0.15534928 1934 andrew gelman stats-2013-07-11-Yes, worry about generalizing from data to population. But multilevel modeling is the solution, not the problem

Introduction: A sociologist writes in: Samuel Lucas has just published a paper in Quality and Quantity arguing that anything less than a full probability sample of higher levels in HLMs yields biased and unusable results. If I follow him correctly, he is arguing that not only are the SEs too small, but the parameter estimates themselves are biased and we cannot say in advance whether the bias is positive or negative. Lucas has thrown down a big gauntlet, advising us to throw away our data unless the sample of macro units is right and to ignore the published results that fail this standard. Extreme. Is there another conclusion to be drawn? Other advice to be given? A Bayesian path out of the valley? Here’s the abstract to Lucas’s paper: The multilevel model has become a staple of social research. I textually and formally explicate sample design features that, I contend, are required for unbiased estimation of macro-level multilevel model parameters and the use of tools for statistical infe

5 0.12961958 1470 andrew gelman stats-2012-08-26-Graphs showing regression uncertainty: the code!

Introduction: After our discussion of visual displays of regression uncertainty, I asked Solomon Hsiang and Lucas Leeman to send me their code. Both of them replied. Solomon wrote: The matlab and stata functions I wrote, as well as the script that replicates my figures, are all posted on my website. Also, I just added options to the main matlab function (vwregress.m) to make it display the spaghetti plot (similar to what Lucas did, but a simple bootstrap) and the shaded CI that you suggested (see figs below). They’re good suggestions. Personally, I [Hsiang] like the shaded CI better, since I think that all the visual activity in the spaghetti plot is a little distracting and sometimes adds visual weight in places where I wouldn’t want it. But the option is there in case people like it. Solomon then followed up with: I just thought of this small adjustment to your filled CI idea that seems neat. Cartographers like map projections that conserve area. We can do som

6 0.12020929 1478 andrew gelman stats-2012-08-31-Watercolor regression

7 0.097527713 2299 andrew gelman stats-2014-04-21-Stan Model of the Week: Hierarchical Modeling of Supernovas

8 0.094404116 852 andrew gelman stats-2011-08-13-Checking your model using fake data

9 0.093460374 348 andrew gelman stats-2010-10-17-Joanne Gowa scooped me by 22 years in my criticism of Axelrod’s Evolution of Cooperation

10 0.090964265 146 andrew gelman stats-2010-07-14-The statistics and the science

11 0.089210428 480 andrew gelman stats-2010-12-21-Instead of “confidence interval,” let’s say “uncertainty interval”

12 0.084685296 293 andrew gelman stats-2010-09-23-Lowess is great

13 0.08035849 252 andrew gelman stats-2010-09-02-R needs a good function to make line plots

14 0.078804426 2311 andrew gelman stats-2014-04-29-Bayesian Uncertainty Quantification for Differential Equations!

15 0.077253945 1095 andrew gelman stats-2012-01-01-Martin and Liu: Probabilistic inference based on consistency of model with data

16 0.075797856 1283 andrew gelman stats-2012-04-26-Let’s play “Guess the smoother”!

17 0.07574413 1894 andrew gelman stats-2013-06-12-How to best graph the Beveridge curve, relating the vacancy rate in jobs to the unemployment rate?

18 0.074215002 372 andrew gelman stats-2010-10-27-A use for tables (really)

19 0.071170874 96 andrew gelman stats-2010-06-18-Course proposal: Bayesian and advanced likelihood statistical methods for zombies.

20 0.07035327 324 andrew gelman stats-2010-10-07-Contest for developing an R package recommendation system


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.083), (1, 0.014), (2, 0.007), (3, 0.023), (4, 0.101), (5, -0.071), (6, -0.05), (7, -0.022), (8, -0.009), (9, 0.0), (10, 0.013), (11, -0.006), (12, -0.014), (13, -0.002), (14, 0.006), (15, 0.018), (16, 0.02), (17, -0.003), (18, -0.024), (19, -0.026), (20, 0.056), (21, 0.07), (22, 0.028), (23, -0.017), (24, 0.029), (25, -0.014), (26, 0.038), (27, -0.062), (28, 0.003), (29, 0.008), (30, 0.052), (31, 0.01), (32, -0.06), (33, 0.004), (34, -0.008), (35, -0.067), (36, -0.022), (37, 0.042), (38, 0.034), (39, -0.065), (40, 0.041), (41, 0.048), (42, 0.038), (43, 0.01), (44, 0.065), (45, 0.006), (46, -0.069), (47, 0.072), (48, 0.034), (49, -0.05)]
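
The (topicId, topicWeight) pairs above are this post's coordinates in the LSI space. As a rough illustration, and assuming "lsi" here means a truncated SVD of a term-document (e.g. tf-idf) matrix, the sketch below shows how such document coordinates arise and how two documents would then be compared; the toy matrix and the choice of k are placeholders, not the tool's actual setup.

# Toy LSI sketch: truncated SVD of a term-by-document matrix (illustrative only)
set.seed(1)
A <- matrix(rpois(60, 1), nrow = 10, ncol = 6)   # 10 terms x 6 documents
k <- 3                                           # number of latent topics kept
s <- svd(A, nu = k, nv = k)

# Each row of V %*% diag(d) is one document's vector of topic weights,
# analogous to the (topicId, topicWeight) list above
doc.topics <- s$v %*% diag(s$d[1:k])

# simValue is presumably a cosine similarity between such topic vectors
cosine <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))
cosine(doc.topics[1, ], doc.topics[2, ])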

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98602194 1461 andrew gelman stats-2012-08-17-Graphs showing uncertainty using lighter intensities for the lines that go further from the center, to de-emphasize the edges

Introduction: Following up on our recent discussion of visually-weighted displays of uncertainty in regression curves, Lucas Leeman sent in the following two graphs: First, the basic spaghetti-style plot showing inferential uncertainty in the E(y|x) curve: Then, a version using even lighter intensities for the lines that go further from the center, to further de-emphasize the edges: P.S. More (including code!) here.

2 0.82251257 1478 andrew gelman stats-2012-08-31-Watercolor regression

Introduction: Solomon Hsiang writes: Two small follow-ups based on the discussion (the second/bigger one is to address your comment about the 95% CI edges). 1. I realized that if we plot the confidence intervals as a solid color that fades (e.g. using the “fixed ink” scheme from before) we can make sure the regression line also has heightened visual weight where confidence is high by plotting the line white. This makes the contrast (and thus visual weight) between the regression line and the CI highest when the CI is narrow and dark. As the CI fades near the edges, so does the contrast with the regression line. This is a small adjustment, but I like it because it is so simple and it makes the graph much nicer. (see “visually_weighted_fill_reverse” attached). My posted code has been updated to do this automatically. 2. You and your readers didn’t like that the edges of the filled CI were so sharp and arbitrary. But I didn’t like that the contrast between the spaghetti lines and the background

3 0.82207865 1452 andrew gelman stats-2012-08-09-Visually weighting regression displays

Introduction: Solomon Hsiang writes: One of my colleagues suggested that I send you this very short note that I wrote on a new approach for displaying regression result uncertainty (attached). It’s very simple, and I’ve found it effective in one of my papers where I actually use it, but if you have a chance to glance over it and have any ideas for how to sell the approach or make it better, I’d be very interested to hear them. (Also, if you’ve seen that someone else has already made this point, I’d appreciate knowing that too.) Here’s an example: Hsiang writes: In Panel A, our eyes are drawn outward, away from the center of the display and toward the swirling confidence intervals at the edges. But in Panel B, our eyes are attracted to the middle of the regression line, where the high contrast between the line and the background is sharp and visually heavy. By using visual-weighting, we focus our readers’ attention on those portions of the regression that contain the most inform

4 0.75579858 1470 andrew gelman stats-2012-08-26-Graphs showing regression uncertainty: the code!

Introduction: After our discussion of visual displays of regression uncertainty, I asked Solomon Hsiang and Lucas Leeman to send me their code. Both of them replied. Solomon wrote: The matlab and stata functions I wrote, as well as the script that replicates my figures, are all posted on my website. Also, I just added options to the main matlab function (vwregress.m) to make it display the spaghetti plot (similar to what Lucas did, but a simple bootstrap) and the shaded CI that you suggested (see figs below). They’re good suggestions. Personally, I [Hsiang] like the shaded CI better, since I think that all the visual activity in the spaghetti plot is a little distracting and sometimes adds visual weight in places where I wouldn’t want it. But the option is there in case people like it. Solomon then followed up with: I just thought of this small adjustment to your filled CI idea that seems neat. Cartographers like map projections that conserve area. We can do som

5 0.68005717 672 andrew gelman stats-2011-04-20-The R code for those time-use graphs

Introduction: By popular demand, here’s my R script for the time-use graphs:

# The data
a1 <- c(4.2,3.2,11.1,1.3,2.2,2.0)
a2 <- c(3.9,3.2,10.0,0.8,3.1,3.1)
a3 <- c(6.3,2.5,9.8,0.9,2.2,2.4)
a4 <- c(4.4,3.1,9.8,0.8,3.3,2.7)
a5 <- c(4.8,3.0,9.9,0.7,3.3,2.4)
a6 <- c(4.0,3.4,10.5,0.7,3.3,2.1)
a <- rbind(a1,a2,a3,a4,a5,a6)
avg <- colMeans (a)
avg.array <- t (array (avg, rev(dim(a))))
diff <- a - avg.array
country.name <- c("France", "Germany", "Japan", "Britain", "USA", "Turkey")

# The line plots
par (mfrow=c(2,3), mar=c(4,4,2,.5), mgp=c(2,.7,0), tck=-.02, oma=c(3,0,4,0), bg="gray96", fg="gray30")
for (i in 1:6){
  plot (c(1,6), c(-1,1.7), xlab="", ylab="", xaxt="n", yaxt="n", bty="l", type="n")
  lines (1:6, diff[i,], col="blue")
  points (1:6, diff[i,], pch=19, col="black")
  if (i>3){
    axis (1, c(1,3,5), c ("Work,\nstudy", "Eat,\nsleep", "Leisure"), mgp=c(2,1.5,0), tck=0, cex.axis=1.2)
    axis (1, c(2,4,6), c ("Unpaid\nwork", "Personal\nCare", "Other"), mgp=c(2,1.5,0),

6 0.66389805 1283 andrew gelman stats-2012-04-26-Let’s play “Guess the smoother”!

7 0.63740319 252 andrew gelman stats-2010-09-02-R needs a good function to make line plots

8 0.63523293 293 andrew gelman stats-2010-09-23-Lowess is great

9 0.60222059 1235 andrew gelman stats-2012-03-29-I’m looking for a quadrille notebook with faint lines

10 0.57207298 929 andrew gelman stats-2011-09-27-Visual diagnostics for discrete-data regressions

11 0.55756909 1090 andrew gelman stats-2011-12-28-“. . . extending for dozens of pages”

12 0.55574781 1498 andrew gelman stats-2012-09-16-Choices in graphing parallel time series

13 0.55272746 1258 andrew gelman stats-2012-04-10-Why display 6 years instead of 30?

14 0.55126673 144 andrew gelman stats-2010-07-13-Hey! Here’s a referee report for you!

15 0.54937679 296 andrew gelman stats-2010-09-26-A simple semigraphic display

16 0.54197824 324 andrew gelman stats-2010-10-07-Contest for developing an R package recommendation system

17 0.5368492 1609 andrew gelman stats-2012-12-06-Stephen Kosslyn’s principles of graphics and one more: There’s no need to cram everything into a single plot

18 0.53440523 1439 andrew gelman stats-2012-08-01-A book with a bunch of simple graphs

19 0.53171629 96 andrew gelman stats-2010-06-18-Course proposal: Bayesian and advanced likelihood statistical methods for zombies.

20 0.531268 146 andrew gelman stats-2010-07-14-The statistics and the science


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(14, 0.258), (16, 0.078), (21, 0.04), (24, 0.07), (51, 0.042), (89, 0.04), (90, 0.051), (99, 0.265)]
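
A small sketch of how to read the sparse LDA output above: the (topicId, topicWeight) pairs are this post's estimated topic proportions under a fitted LDA model (the fitting itself is done by the mining tool and is not reproduced here). The total number of topics and the treatment of topicId as a 1-based index are assumptions made only for illustration.

# Reconstruct the full topic-proportion vector from the sparse pairs above
pairs <- list(c(14, 0.258), c(16, 0.078), c(21, 0.040), c(24, 0.070),
              c(51, 0.042), c(89, 0.040), c(90, 0.051), c(99, 0.265))
n.topics <- 100                         # assumed size of the topic space
theta <- numeric(n.topics)
for (p in pairs) theta[p[1]] <- p[2]    # topicId taken as a 1-based index

# Topics 14 and 99 carry most of the weight for this post
which(theta > 0.1)

# Similarity to another (hypothetical) post's topic vector, presumably
# how the simValue column below is computed
theta2 <- numeric(n.topics)
theta2[c(14, 99)] <- c(0.30, 0.20)
sum(theta * theta2) / sqrt(sum(theta^2) * sum(theta2^2))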

similar blogs list:

simIndex simValue blogId blogTitle

1 0.939937 755 andrew gelman stats-2011-06-09-Recently in the award-winning sister blog

Introduction: In case you haven’t been following: - Top ten excuses for plagiarism - Why I won’t be sad to see Anthony Weiner retire - U.S. voter participation has not fallen steadily over the past few decades - Scott Adams had an interesting idea

2 0.92133176 1809 andrew gelman stats-2013-04-17-NUTS discussed on Xi’an’s Og

Introduction: Xi’an’s Og (aka Christian Robert’s blog) is featuring a very nice presentation of NUTS by Marco Banterle, with discussion and some suggestions. I’m not even sure how they found Michael Betancourt’s paper on geometric NUTS — I don’t see it on the arXiv yet, or I’d provide a link.

3 0.91254222 1724 andrew gelman stats-2013-02-16-Zero Dark Thirty and Bayes’ theorem

Introduction: A moviegoing colleague writes: I just watched the movie Zero Dark Thirty about the hunt for Osama Bin Laden. What struck me about it was: (1) Bayes theorem underlies the whole movie; (2) CIA top brass do not know Bayes theorem (at least as portrayed in the movie). Obviously one does not need to know physics to play billiards, but it helps with the reasoning. Essentially, at some point the key CIA agent locates what she strongly believes is OBL’s hiding place in Pakistan. Then it takes the White House some 150 days to make the decision to attack the compound. Why so long? And why, even on the eve of the operation, were senior brass only some 60% sure OBL was there? Fear of false positives is the answer. After all, the compound could belong to a drug lord, or some other terrorist. Here is the math: There are two possibilities, according to the movie: OBL is in a compound (C) in a city or he is in the mountains in tribal regions. Say P(OBL in C) = 0.5. A diagnosis is made on

same-blog 4 0.90194523 1461 andrew gelman stats-2012-08-17-Graphs showing uncertainty using lighter intensities for the lines that go further from the center, to de-emphasize the edges

Introduction: Following up on our recent discussion of visually-weighted displays of uncertainty in regression curves, Lucas Leeman sent in the following two graphs: First, the basic spaghetti-style plot showing inferential uncertainty in the E(y|x) curve: Then, a version using even lighter intensities for the lines that go further from the center, to further de-emphasize the edges: P.S. More (including code!) here.

5 0.88618594 824 andrew gelman stats-2011-07-26-Milo and Milo

Introduction: I recently finished two enjoyable novels that I was pretty sure I’d like, given that they were both sequels of a sort. The main characters of both books were named Milo, a name that in literature appears only (to my knowledge) in The Phantom Tollbooth and Catch-22. The Milos in the new books I just read are much different than the two classic literary Milos. One, featured in the new thriller by Olen Steinhauer, is a cool, effective CIA killing machine (but of the good-guy variety, also he has some little character flaws to make him tolerable but he’s basically a superhero). The other is not any sort of killing machine, more of a Sam Lipsyte character. Which makes sense since he’s the star of The Ask, the follow-up to Lipsyte’s hilarious lovable-loser saga, Home Land. I have two questions about The Ask. 1. The driver of the plot is as follows. Milo has just been fired from his crappy job at a college in NYC. Milo has a rich friend who asks him to do a favor; in re

6 0.88265246 130 andrew gelman stats-2010-07-07-A False Consensus about Public Opinion on Torture

7 0.88207597 1051 andrew gelman stats-2011-12-10-Towards a Theory of Trust in Networks of Humans and Computers

8 0.86195028 2344 andrew gelman stats-2014-05-23-The gremlins did it? Iffy statistics drive strong policy recommendations

9 0.85121536 1696 andrew gelman stats-2013-01-29-The latest in economics exceptionalism

10 0.84389418 1770 andrew gelman stats-2013-03-19-Retraction watch

11 0.82993323 245 andrew gelman stats-2010-08-31-Predicting marathon times

12 0.80220103 1303 andrew gelman stats-2012-05-06-I’m skeptical about this skeptical article about left-handedness

13 0.79596686 2117 andrew gelman stats-2013-11-29-The gradual transition to replicable science

14 0.79050595 1471 andrew gelman stats-2012-08-27-Why do we never see a full decision analysis for a clinical trial?

15 0.78122497 1236 andrew gelman stats-2012-03-29-Resolution of Diederik Stapel case

16 0.76732069 2059 andrew gelman stats-2013-10-12-Visualization, “big data”, and EDA

17 0.76167721 838 andrew gelman stats-2011-08-04-Retraction Watch

18 0.75401825 2237 andrew gelman stats-2014-03-08-Disagreeing to disagree

19 0.75267553 2114 andrew gelman stats-2013-11-26-“Please make fun of this claim”

20 0.75222182 252 andrew gelman stats-2010-09-02-R needs a good function to make line plots