1124 andrew gelman stats-2012-01-17-How to map geographically-detailed survey responses?


Introduction: David Sparks writes: I am experimenting with the mapping/visualization of survey response data, with a particular focus on using transparency to convey uncertainty. See some examples here. Do you think the examples are successful at communicating both local values of the variable of interest, as well as the lack of information in certain places? Also, do you have any general advice for choosing an approach to spatially smoothing the data in a way that preserves local features, but prevents individual respondents from standing out? I have experimented a lot with smoothing in these maps, and the cost of preventing the Midwest and West from looking “spotty” is the oversmoothing of the Northeast. My quick impression is that the graphs are more pretty than they are informative. But “pretty” is not such a bad thing! The conveying-information part is more difficult: to me, the graphs seem to be displaying a somewhat confusing mix of opinion level and population density. Consider


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 David Sparks writes: I am experimenting with the mapping/visualization of survey response data, with a particular focus on using transparency to convey uncertainty. [sent-1, score-0.386]

2 Do you think the examples are successful at communicating both local values of the variable of interest, as well as the lack of information in certain places? [sent-3, score-0.482]

3 Also, do you have any general advice for choosing an approach to spatially smoothing the data in a way that preserves local features, but prevents individual respondents from standing out? [sent-4, score-1.203]

4 I have experimented a lot with smoothing in these maps, and the cost of preventing the Midwest and West from looking “spotty” is the oversmoothing of the Northeast. [sent-5, score-0.386]

5 My quick impression is that the graphs are more pretty than they are informative. [sent-6, score-0.249]

6 The conveying-information part is more difficult: to me, the graphs seem to be displaying a somewhat confusing mix of opinion level and population density. [sent-8, score-0.515]

7 Consider, for example, the bright red color in Dallas. [sent-9, score-0.236]

8 There must be areas in the countryside that are also heavily Republican but Dallas stands out because there are a lot of people there. [sent-10, score-0.396]

9 In some ways this makes sense—that’s where the voters are—but to me it makes the map a bit confusing. [sent-11, score-0.352]

10 I’m also bothered by the blurriness of the entire northeast—I assume this is happening because all the cities are in each others’ penumbras. [sent-12, score-0.464]

11 That’s just one problem though; really, what’s bugging me more is the overlay of intensity with density. [sent-13, score-0.258]

12 I think I’d prefer something simpler such as putting a colored circle in each county with the size of the circle proportional to population. [sent-14, score-1.019]
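The suggestion in sentence 12 (a colored circle per county, sized by population) can be sketched as follows. This is a minimal illustration, not the author's method: it assumes that "size" means circle area (so the radius scales with the square root of population), and the county names, populations, Republican shares, and the helper names `circle_radius` and `opinion_color` are all hypothetical.

```python
import math

def circle_radius(population, max_population, max_radius=20.0):
    """Radius chosen so that circle AREA is proportional to population (an assumption)."""
    if max_population <= 0:
        raise ValueError("max_population must be positive")
    return max_radius * math.sqrt(population / max_population)

def opinion_color(rep_share):
    """Map a Republican share in [0, 1] to a hex color running from blue to red."""
    share = min(max(rep_share, 0.0), 1.0)  # clamp out-of-range inputs
    red = round(255 * share)
    blue = round(255 * (1.0 - share))
    return f"#{red:02x}00{blue:02x}"

# Hypothetical county records: (name, population, Republican share)
counties = [
    ("Urban County", 2_400_000, 0.55),
    ("Rural County", 24_000, 0.80),
]

max_pop = max(pop for _, pop, _ in counties)
symbols = [
    (name, circle_radius(pop, max_pop), opinion_color(share))
    for name, pop, share in counties
]
```

Keeping size (population) and color (opinion) as separate visual channels avoids the overlay of intensity with density that the post objects to: a small rural county can be strongly colored without being mistaken for a large one.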


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('smoothing', 0.253), ('circle', 0.239), ('spatially', 0.169), ('blurriness', 0.169), ('countryside', 0.169), ('dallas', 0.169), ('preserves', 0.169), ('local', 0.161), ('experimenting', 0.16), ('sparks', 0.16), ('prevents', 0.147), ('northeast', 0.147), ('bright', 0.139), ('midwest', 0.139), ('preventing', 0.133), ('bugging', 0.131), ('colored', 0.131), ('intensity', 0.127), ('graphs', 0.125), ('transparency', 0.125), ('pretty', 0.124), ('communicating', 0.117), ('examples', 0.116), ('stands', 0.116), ('county', 0.116), ('standing', 0.114), ('cities', 0.112), ('confusing', 0.111), ('displaying', 0.111), ('heavily', 0.111), ('proportional', 0.11), ('west', 0.105), ('simpler', 0.102), ('choosing', 0.102), ('convey', 0.101), ('maps', 0.098), ('color', 0.097), ('bothered', 0.095), ('map', 0.093), ('respondents', 0.088), ('happening', 0.088), ('mix', 0.088), ('successful', 0.088), ('voters', 0.087), ('features', 0.087), ('makes', 0.086), ('putting', 0.082), ('republican', 0.082), ('places', 0.081), ('somewhat', 0.08)]
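A list of (word, weight) pairs like the one above could be produced roughly as sketched here. The page does not say how its tfidf weights were computed, so the weighting scheme (term frequency times log inverse document frequency, then L2 normalization), the whitespace tokenizer, and the function name `tfidf_top_words` are assumptions made for illustration only.

```python
import math
from collections import Counter

def tfidf_top_words(docs, doc_index, top_n=5):
    """Return the top_n (word, weight) pairs for docs[doc_index].

    Assumed scheme: tf = count / document length, idf = log(N / df),
    weights L2-normalized within the document.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each word appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    tokens = tokenized[doc_index]
    counts = Counter(tokens)
    weights = {
        word: (count / len(tokens)) * math.log(n_docs / df[word])
        for word, count in counts.items()
    }

    # Normalize so weights are comparable across documents of different lengths.
    norm = math.sqrt(sum(v * v for v in weights.values())) or 1.0
    ranked = sorted(
        ((word, round(value / norm, 3)) for word, value in weights.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )
    return ranked[:top_n]

# Toy corpus (hypothetical), standing in for the blog posts.
docs = ["smoothing maps smoothing", "maps graphs", "graphs color"]
top = tfidf_top_words(docs, 0)
```

Words that are frequent in one post but rare across the corpus rise to the top, which is consistent with terms such as "smoothing" and "circle" leading the list for this post.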

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1124 andrew gelman stats-2012-01-17-How to map geographically-detailed survey responses?

2 0.12875101 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

Introduction: I continue to struggle to convey my thoughts on statistical graphics so I’ll try another approach, this time giving my own story. For newcomers to this discussion: the background is that Antony Unwin and I wrote an article on the different goals embodied in information visualization and statistical graphics, but I have difficulty communicating on this point with the infovis people. Maybe if I tell my own story, and then they tell their stories, this will point a way forward to a more constructive discussion. So here goes. I majored in physics in college and I worked in a couple of research labs during the summer. Physicists graph everything. I did most of my plotting on graph paper–this continued through my second year of grad school–and became expert at putting points at 1/5, 2/5, 3/5, and 4/5 between the x and y grid lines. In grad school in statistics, I continued my physics habits and graphed everything I could. I did notice, though, that the faculty and the other

3 0.12337342 1156 andrew gelman stats-2012-02-06-Bayesian model-building by pure thought: Some principles and examples

Introduction: This is one of my favorite papers: In applications, statistical models are often restricted to what produces reasonable estimates based on the data at hand. In many cases, however, the principles that allow a model to be restricted can be derived theoretically, in the absence of any data and with minimal applied context. We illustrate this point with three well-known theoretical examples from spatial statistics and time series. First, we show that an autoregressive model for local averages violates a principle of invariance under scaling. Second, we show how the Bayesian estimate of a strictly-increasing time series, using a uniform prior distribution, depends on the scale of estimation. Third, we interpret local smoothing of spatial lattice data as Bayesian estimation and show why uniform local smoothing does not make sense. In various forms, the results presented here have been derived in previous work; our contribution is to draw out some principles that can be derived theoretic

4 0.11364242 2201 andrew gelman stats-2014-02-06-Bootstrap averaging: Examples where it works and where it doesn’t work

Introduction: Aki and I write: The very generality of the bootstrap creates both opportunity and peril, allowing researchers to solve otherwise intractable problems but also sometimes leading to an answer with an inappropriately high level of certainty. We demonstrate with two examples from our own research: one problem where bootstrap smoothing was effective and led us to an improved method, and another case where bootstrap smoothing would not solve the underlying problem. Our point in these examples is not to disparage bootstrapping but rather to gain insight into where it will be more or less effective as a smoothing tool. An example where bootstrap smoothing works well Bayesian posterior distributions are commonly summarized using Monte Carlo simulations, and inferences for scalar parameters or quantities of interest can be summarized using 50% or 95% intervals. An interval for a continuous quantity is typically constructed either as a central probability interval (with probabili

5 0.10915153 536 andrew gelman stats-2011-01-24-Trends in partisanship by state

Introduction: Matthew Yglesias discusses how West Virginia used to be a Democratic state but is now solidly Republican. I thought it would be helpful to expand this to look at trends since 1948 (rather than just 1988) and all 50 states (rather than just one). This would represent a bit of work, except that I already did it a couple years ago, so here it is (right-click on the image to see the whole thing): I cheated a bit to get reasonable-looking groupings, for example putting Indiana in the Border South rather than Midwest, and putting Alaska in Mountain West and Hawaii in West Coast. Also, it would help to distinguish states by color (to be able to disentangle New Jersey and Delaware, for example) but we didn’t do this because the book is mostly black and white. In any case, the picture makes it clear that there have been strong regional trends all over during the past sixty years. P.S. My graph comes from Red State Blue State so no 2008 data, but 2008 was pretty much a shift

6 0.1043209 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update

7 0.10175755 787 andrew gelman stats-2011-07-05-Different goals, different looks: Infovis and the Chris Rock effect

8 0.098125868 2288 andrew gelman stats-2014-04-10-Small multiples of lineplots > maps (ok, not always, but yes in this case)

9 0.09136571 2255 andrew gelman stats-2014-03-19-How Americans vote

10 0.089127228 428 andrew gelman stats-2010-11-24-Flawed visualization of U.S. voting maybe has some good features

11 0.083539225 770 andrew gelman stats-2011-06-15-Still more Mr. P in public health

12 0.082317747 182 andrew gelman stats-2010-08-03-Nebraska never looked so appealing: anatomy of a zombie attack. Oops, I mean a recession.

13 0.080438912 1635 andrew gelman stats-2012-12-22-More Pinker Pinker Pinker

14 0.078175113 1275 andrew gelman stats-2012-04-22-Please stop me before I barf again

15 0.077833861 627 andrew gelman stats-2011-03-24-How few respondents are reasonable to use when calculating the average by county?

16 0.077515349 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly

17 0.074983865 492 andrew gelman stats-2010-12-30-That puzzle-solving feeling

18 0.074111678 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers

19 0.072775759 1096 andrew gelman stats-2012-01-02-Graphical communication for legal scholarship

20 0.072430961 1695 andrew gelman stats-2013-01-28-Economists argue about Bayes


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.154), (1, -0.017), (2, 0.058), (3, 0.046), (4, 0.061), (5, -0.062), (6, -0.052), (7, 0.028), (8, -0.011), (9, 0.01), (10, 0.021), (11, -0.039), (12, -0.024), (13, 0.01), (14, 0.018), (15, 0.004), (16, 0.018), (17, -0.013), (18, 0.022), (19, -0.014), (20, -0.006), (21, -0.011), (22, -0.003), (23, 0.015), (24, 0.011), (25, -0.023), (26, 0.04), (27, 0.003), (28, 0.009), (29, 0.045), (30, 0.062), (31, -0.01), (32, -0.025), (33, 0.004), (34, 0.005), (35, 0.014), (36, -0.003), (37, 0.018), (38, -0.025), (39, 0.04), (40, 0.011), (41, 0.008), (42, -0.029), (43, -0.021), (44, -0.013), (45, 0.022), (46, -0.022), (47, 0.01), (48, -0.003), (49, 0.049)]
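A similarity score between two posts can be derived from (topicId, topicWeight) vectors like the one above. The page does not state which measure produced its simValue column; cosine similarity is a common choice and is assumed here purely for illustration, as is the function name `cosine_similarity`.

```python
import math

def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse vectors given as (topicId, weight) pairs.

    Topics missing from a vector are treated as having weight 0.
    Returns 0.0 if either vector has zero length.
    """
    a, b = dict(vec_a), dict(vec_b)
    dot = sum(weight * b[topic] for topic, weight in a.items() if topic in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical topic vectors for two posts.
post_a = [(0, 0.154), (2, 0.058), (5, -0.062)]
post_b = [(0, 0.120), (2, 0.040), (7, 0.030)]
score = cosine_similarity(post_a, post_b)
```

The sparse (id, weight) representation works equally well for vectors that list only the nonzero topics, such as the lda weights further down the page.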

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96155345 1124 andrew gelman stats-2012-01-17-How to map geographically-detailed survey responses?

2 0.79324305 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly

Introduction: Denis Cote sends the following, under the heading, “Some bad graphs for your enjoyment”: To start with, they don’t know how to spell “color.” Seriously, though, the graph is a mess. The circular display implies a circular or periodic structure that isn’t actually in the data, the cramped display requires the use of an otherwise-unnecessary color code that makes it difficult to find or make sense of the information, the alphabetical ordering (without even supplying state names, only abbreviations) makes it further difficult to find any patterns. It would be so much better, and even easier, to just display a set of small maps shading states on whether they have different laws. But that’s part of the problem—the clearer graph would also be easier to make! To get a distinctive graph, there needs to be some degree of difficulty. The designers continue with these monstrosities: Here they decide to display only 5 states at a time so that it’s really hard to see any big pi

3 0.78753978 1896 andrew gelman stats-2013-06-13-Against the myth of the heroic visualization

Introduction: Alberto Cairo tells a fascinating story about John Snow, H. W. Acland, and the Mythmaking Problem: Every human community—nations, ethnic and cultural groups, professional guilds—inevitably raises a few of its members to the status of heroes and weaves myths around them. . . . The visual display of information is no stranger to heroes and myth. In fact, being a set of disciplines with a relatively small amount of practitioners and researchers, it has generated a staggering number of heroes, perhaps as a morale-enhancing mechanism. Most of us have heard of the wonders of William Playfair’s Commercial and Political Atlas, Florence Nightingale’s coxcomb charts, Charles Joseph Minard’s Napoleon’s march diagram, and Henry Beck’s 1933 redesign of the London Underground map. . . . Cairo’s goal, I think, is not to disparage these great pioneers of graphics but rather to put their work in perspective, recognizing the work of their excellent contemporaries. I would like to echo Cairo’

4 0.78719711 1609 andrew gelman stats-2012-12-06-Stephen Kosslyn’s principles of graphics and one more: There’s no need to cram everything into a single plot

Introduction: Jerzy Wieczorek has an interesting review of the book Graph Design for the Eye and Mind by psychology researcher Stephen Kosslyn. I recommend you read all of Wieczorek’s review (and maybe Kosslyn’s book, but that I haven’t seen), but here I’ll just focus on one point. Here’s Wieczorek summarizing Kosslyn: p. 18-19: the horizontal axis should be for the variable with the “most important part of the data.” See Kosslyn’s Figure 1.6 and 1.7 below. Figure 1.6 clearly shows that one of the sex-by-income groups reacts to age differently than the other three groups do. Figure 1.7 uses sex as the x-axis variable, making it much harder to see this same effect in the data. As a statistician exploring the data, I might make several plots using different groupings… but for communicating my results to an audience, I would choose the one plot that shows the findings most clearly. Those who know me well (or who have read the title of this post) will guess my reaction, whic

5 0.78468704 182 andrew gelman stats-2010-08-03-Nebraska never looked so appealing: anatomy of a zombie attack. Oops, I mean a recession.

Introduction: One can quibble about the best way to display county-level unemployment data on a map, since a small, populous county gets much less visual weight than a large, sparsely populated one. Even so, I think we can agree that this animated map by LaToya Egwuekwe is pretty cool. It says it shows the unemployment rate by county, as a function of time, but anyone with even the slightest knowledge of what happens during a zombie attack will recognize it for what it is.

6 0.77345884 488 andrew gelman stats-2010-12-27-Graph of the year

7 0.75832152 1894 andrew gelman stats-2013-06-12-How to best graph the Beveridge curve, relating the vacancy rate in jobs to the unemployment rate?

8 0.75521886 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

9 0.75373971 61 andrew gelman stats-2010-05-31-A data visualization manifesto

10 0.74448347 2288 andrew gelman stats-2014-04-10-Small multiples of lineplots > maps (ok, not always, but yes in this case)

11 0.74022579 404 andrew gelman stats-2010-11-09-“Much of the recent reported drop in interstate migration is a statistical artifact”

12 0.74014395 502 andrew gelman stats-2011-01-04-Cash in, cash out graph

13 0.73660302 787 andrew gelman stats-2011-07-05-Different goals, different looks: Infovis and the Chris Rock effect

14 0.73538482 2246 andrew gelman stats-2014-03-13-An Economist’s Guide to Visualizing Data

15 0.73258913 536 andrew gelman stats-2011-01-24-Trends in partisanship by state

16 0.73183417 289 andrew gelman stats-2010-09-21-“How segregated is your city?”: A story of why every graph, no matter how clear it seems to be, needs a caption to anchor the reader in some numbers

17 0.7317853 372 andrew gelman stats-2010-10-27-A use for tables (really)

18 0.72887659 2308 andrew gelman stats-2014-04-27-White stripes and dead armadillos

19 0.72781938 2059 andrew gelman stats-2013-10-12-Visualization, “big data”, and EDA

20 0.72740269 1653 andrew gelman stats-2013-01-04-Census dotmap


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.013), (6, 0.014), (16, 0.051), (21, 0.028), (24, 0.161), (66, 0.011), (77, 0.277), (84, 0.012), (86, 0.032), (89, 0.012), (96, 0.015), (99, 0.271)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.93843168 1784 andrew gelman stats-2013-04-01-Wolfram on Mandelbrot

Introduction: The most perfect pairing of author and subject since Nicholson Baker and John Updike. Here’s Wolfram on the great researcher of fractals: In his way, Mandelbrot paid me some great compliments. When I was in my 20s, and he in his 60s, he would ask about my scientific work: “How can so many people take someone so young so seriously?” In 2002, my book “A New Kind of Science”—in which I argued that many phenomena across science are the complex results of relatively simple, program-like rules—appeared. Mandelbrot seemed to see it as a direct threat, once declaring that “Wolfram’s ‘science’ is not new except when it is clearly wrong; it deserves to be completely disregarded.” In private, though, several mutual friends told me, he fretted that in the long view of history it would overwhelm his work. In retrospect, I don’t think Mandelbrot had much to worry about on this account. The link from the above review came from Peter Woit, who also points to a review by Brian Hayes wit

2 0.93234098 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly

Introduction: Denis Cote sends the following, under the heading, “Some bad graphs for your enjoyment”: To start with, they don’t know how to spell “color.” Seriously, though, the graph is a mess. The circular display implies a circular or periodic structure that isn’t actually in the data, the cramped display requires the use of an otherwise-unnecessary color code that makes it difficult to find or make sense of the information, the alphabetical ordering (without even supplying state names, only abbreviations) makes it further difficult to find any patterns. It would be so much better, and even easier, to just display a set of small maps shading states on whether they have different laws. But that’s part of the problem—the clearer graph would also be easier to make! To get a distinctive graph, there needs to be some degree of difficulty. The designers continue with these monstrosities: Here they decide to display only 5 states at a time so that it’s really hard to see any big pi

3 0.91561341 1604 andrew gelman stats-2012-12-04-An epithet I can live with

Introduction: Here. Indeed, I’d much rather be a legend than a myth. I just want to clarify one thing. Walter Hickey writes: [Antony Unwin and Andrew Gelman] collaborated on this presentation where they take a hard look at what’s wrong with the recent trends of data visualization and infographics. The takeaway is that while there have been great leaps in visualization technology, some of the visualizations that have garnered the highest praises have actually been lacking in a number of key areas. Specifically, the pair does a takedown of the top visualizations of 2008 as decided by the popular statistics blog Flowing Data. This is a fair summary, but I want to emphasize that, although our dislike of some award-winning visualizations is central to our argument, it is only the first part of our story. As Antony and I worked more on our paper, and especially after seeing the discussions by Robert Kosara, Stephen Few, Hadley Wickham, and Paul Murrell (all to appear in Journal of Computati

4 0.9137181 978 andrew gelman stats-2011-10-28-Cool job opening with brilliant researchers at Yahoo

Introduction: Duncan Watts writes: The Human Social Dynamics Group in Yahoo Research is seeking highly qualified candidates for a post-doctoral research scientist position. The Human and Social Dynamics group is devoted to understanding the interplay between individual-level behavior (e.g. how people make decisions about what music they like, which dates to go on, or which groups to join) and the social environment in which individual behavior necessarily plays itself out. In particular, we are interested in: * Structure and evolution of social groups and networks * Decision making, social influence, diffusion, and collective decisions * Networking and collaborative problem solving. The intrinsically multi-disciplinary and cross-cutting nature of the subject demands an eclectic range of researchers, both in terms of domain-expertise (e.g. decision sciences, social psychology, sociology) and technical skills (e.g. statistical analysis, mathematical modeling, computer simulations, design o

5 0.90958691 1481 andrew gelman stats-2012-09-04-Cool one-day miniconference at Columbia Fri 12 Oct on computational and online social science

Introduction: One thing we do here at the Applied Statistics Center is hold mini-conferences. The next one looks really cool. It’s organized by Sharad Goel and Jake Hofman (Microsoft Research, formerly at Yahoo Research), David Park (Columbia University), and Sergei Vassilvitskii (Google). As with our other conferences, one of our goals is to mix the academic and nonacademic research communities. Here’s the website for the workshop, and here’s the announcement from the organizers: With an explosion of data on every aspect of our everyday existence — from what we buy, to where we travel, to who we know — we are able to observe human behavior with granularity largely thought impossible just a decade ago. The growth of such online activity has further facilitated the design of web-based experiments, enhancing both the scale and efficiency of traditional methods. Together these advances have created an unprecedented opportunity to address longstanding questions in the social sciences, rang

same-blog 6 0.9079423 1124 andrew gelman stats-2012-01-17-How to map geographically-detailed survey responses?

7 0.9073112 1373 andrew gelman stats-2012-06-09-Cognitive psychology research helps us understand confusion of Jonathan Haidt and others about working-class voters

8 0.89533538 911 andrew gelman stats-2011-09-15-More data tools worth using from Google

9 0.88595557 230 andrew gelman stats-2010-08-24-Kaggle forcasting update

10 0.87489891 57 andrew gelman stats-2010-05-29-Roth and Amsterdam

11 0.86348605 562 andrew gelman stats-2011-02-06-Statistician cracks Toronto lottery

12 0.86053205 1071 andrew gelman stats-2011-12-19-“NYU Professor Claims He Was Fired for Giving James Franco a D”

13 0.85689235 1561 andrew gelman stats-2012-11-04-Someone is wrong on the internet

14 0.85582078 380 andrew gelman stats-2010-10-29-“Bluntly put . . .”

15 0.85346061 401 andrew gelman stats-2010-11-08-Silly old chi-square!

16 0.83920038 1438 andrew gelman stats-2012-07-31-What is a Bayesian?

17 0.83677721 93 andrew gelman stats-2010-06-17-My proposal for making college admissions fairer

18 0.83231097 1976 andrew gelman stats-2013-08-10-The birthday problem

19 0.83033037 2054 andrew gelman stats-2013-10-07-Bing is preferred to Google by people who aren’t like me

20 0.82742327 207 andrew gelman stats-2010-08-14-Pourquoi Google search est devenu plus raisonnable?