andrew_gelman_stats-2011-1048 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Super cool post from Jamis Buck on mazemaking algorithms. It’s set up so you can click and see the maze being formed, for each of 11 different algorithms! When I was about 12, I was really into making mazes. I’d make them on graph paper and give many of them out to my friends. Somewhere along the way I lost most of them, but I remember it was a fun challenge to figure out how to make them difficult. I don’t know about these automatic maze generation algorithms, but handmade mazes (of the sort that used to appear in puzzle books) often had the problem that they were really easy to solve if you started at the end and worked back to the beginning. I didn’t want that, so when I designed my own mazes, I’d start from each of the two ends and then work out the middle. I remember one maze I was particularly proud of that had no dead ends. The dud paths just looped back to the beginning area, and you had to find the path that made it all the way through. I never tried to formalize the algorithm I used to make mazes, and now I don’t remember how I used to do it. Jamis Buck also has a family blog full of adorable quotes from his kids.
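The post itself has no code, but for readers who want to experiment, here is a minimal sketch of one standard algorithm from the family Buck animates, the depth-first-search “recursive backtracker.” The function name and grid representation are mine, not Buck’s:

import random

def make_maze(width, height, seed=None):
    """Carve a maze on a width x height grid by depth-first search
    (the "recursive backtracker"). Returns a dict mapping each cell
    to the set of neighboring cells it has an open passage to."""
    rng = random.Random(seed)
    passages = {(x, y): set() for x in range(width) for y in range(height)}
    visited = {(0, 0)}
    stack = [(0, 0)]
    while stack:
        x, y = stack[-1]
        # unvisited grid neighbors of the cell on top of the stack
        frontier = [(x + dx, y + dy)
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if (x + dx, y + dy) in passages
                    and (x + dx, y + dy) not in visited]
        if frontier:
            nxt = rng.choice(frontier)  # carve a passage to a random neighbor
            passages[(x, y)].add(nxt)
            passages[nxt].add((x, y))
            visited.add(nxt)
            stack.append(nxt)
        else:
            stack.pop()  # dead end: backtrack
    return passages

Note that this produces a “perfect” maze, a spanning tree of the grid, which is full of dead ends; the no-dead-ends, loop-back design described above would require adding extra connections after (or instead of) a pass like this.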
sentIndex sentText sentNum sentScore
1 Super cool post from Jamis Buck on mazemaking algorithms. [sent-1, score-0.074]
2 It’s set up so you can click and see the maze being formed, for each of 11 different algorithms! [sent-2, score-0.547]
3 When I was about 12, I was really into making mazes. [sent-3, score-0.099]
4 I’d make them on graph paper and give many of them out to my friends. [sent-4, score-0.13]
5 Somewhere along the way I lost most of them, but I remember it was a fun challenge to figure out how to make them difficult. [sent-5, score-0.657]
6 I don’t know about these automatic maze generation algorithms, but handmade mazes (of the sort that used to appear in puzzle books) often had the problem that they were really easy to solve if you started at the end and worked back to the beginning. [sent-6, score-1.86]
7 I didn’t want that, so when I designed my own mazes, I’d start from each of the two ends and then work out the middle. [sent-7, score-0.232]
8 I remember one maze I was particularly proud of that had no dead ends. [sent-8, score-0.925]
9 The dud paths just looped back to the beginning area, and you had to find the path that made it all the way through. [sent-9, score-0.412]
10 I never tried to formalize the algorithm I used to make mazes, and now I don’t remember how I used to do it. [sent-10, score-0.764]
11 Jamis Buck also has a family blog full of adorable quotes from his kids. [sent-13, score-0.219]
wordName wordTfidf (topN-words)
[('maze', 0.465), ('mazes', 0.465), ('jamis', 0.34), ('buck', 0.255), ('remember', 0.191), ('algorithms', 0.185), ('formalize', 0.125), ('super', 0.125), ('proud', 0.118), ('formed', 0.118), ('used', 0.112), ('automatic', 0.106), ('paths', 0.105), ('puzzle', 0.102), ('generation', 0.098), ('dead', 0.092), ('designed', 0.091), ('path', 0.091), ('ends', 0.09), ('quotes', 0.086), ('beginning', 0.085), ('lost', 0.082), ('click', 0.082), ('back', 0.082), ('algorithm', 0.082), ('challenge', 0.081), ('kids', 0.079), ('somewhere', 0.079), ('family', 0.077), ('make', 0.076), ('solve', 0.076), ('cool', 0.074), ('area', 0.067), ('fun', 0.067), ('started', 0.067), ('tried', 0.066), ('appear', 0.065), ('worked', 0.063), ('books', 0.062), ('figure', 0.06), ('particularly', 0.059), ('full', 0.056), ('easy', 0.054), ('graph', 0.054), ('end', 0.053), ('really', 0.052), ('start', 0.051), ('along', 0.051), ('way', 0.049), ('making', 0.047)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 1048 andrew gelman stats-2011-12-09-Maze generation algorithms!
2 0.37749338 459 andrew gelman stats-2010-12-09-Solve mazes by starting at the exit
Introduction: It worked on this one. Good maze designers know this trick and are careful to design multiple branches in each direction. Back when I was in junior high, I used to make huge mazes, and the basic idea was to anticipate what the solver might try to do and to make the maze difficult by postponing the point at which he would realize a path was going nowhere. For example, you might have 6 branches: one dead end, two pairs that form loops going back to the start, and one that is the correct solution. You do this from both directions and add some twists and turns, and there you are. But the maze designer aiming for the naive solver–the sap who starts from the entrance and goes toward the exit–can simplify matters by just having 6 branches: five dead ends and one winner. This sort of thing is easy to solve in the reverse direction. I’m surprised the Times didn’t do better for their special puzzle issue.
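The reverse-solving trick is trivial for a computer, which doesn’t care about direction; the post’s point is that a human solver faces fewer live branches working backward through a maze designed only against forward solvers. Still, as a minimal sketch (my own hypothetical helper, reusing the passages dict from the generator sketched earlier), here is a solver launched from the exit:

from collections import deque

def solve_from_exit(passages, entrance, exit_cell):
    """Breadth-first search starting at the exit. Returns the
    entrance-to-exit path as a list of cells, or None if unreachable."""
    prev = {exit_cell: None}  # predecessor pointing back toward the exit
    queue = deque([exit_cell])
    while queue:
        cell = queue.popleft()
        if cell == entrance:
            path = []
            while cell is not None:  # follow prev pointers: entrance -> exit
                path.append(cell)
                cell = prev[cell]
            return path
        for nxt in passages[cell]:
            if nxt not in prev:
                prev[nxt] = cell
                queue.append(nxt)
    return None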
3 0.075690143 419 andrew gelman stats-2010-11-18-Derivative-based MCMC as a breakthrough technique for implementing Bayesian statistics
Introduction: John Salvatier pointed me to this blog on derivative-based MCMC algorithms (also sometimes called “hybrid” or “Hamiltonian” Monte Carlo) and automatic differentiation as the future of MCMC. This all makes sense to me and is consistent both with my mathematical intuition from studying Metropolis algorithms and my experience with Matt using hybrid MCMC when fitting hierarchical spline models. In particular, I agree with Salvatier’s point about the potential for computation of analytic derivatives of the log-density function. As long as we’re mostly snapping together our models using analytically-simple pieces, the same part of the program that handles the computation of log-posterior densities should also be able to compute derivatives analytically. I’ve been a big fan of automatic derivative-based MCMC methods since I started hearing about them a couple years ago (I’m thinking of the DREAM project and of Mark Girolami’s paper), and I too wonder why they haven’t been used more. I
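As a concrete illustration of what “derivative-based MCMC” means, here is a minimal sketch of one Hamiltonian Monte Carlo step with a leapfrog integrator, assuming you can evaluate the log target density logp and its gradient grad (the function names are mine; this is the generic algorithm, not the DREAM or Girolami methods mentioned above):

import numpy as np

def hmc_step(x, logp, grad, eps=0.1, n_steps=20, rng=None):
    """One HMC step: simulate Hamiltonian dynamics with a leapfrog
    integrator, then Metropolis accept/reject. logp(x) is the log
    target density and grad(x) its gradient, both taking 1-D arrays."""
    rng = rng or np.random.default_rng()
    p = rng.standard_normal(x.shape)  # draw a fresh momentum
    x_new, p_new = x.copy(), p.copy()
    p_new += 0.5 * eps * grad(x_new)  # half step for momentum
    for _ in range(n_steps - 1):
        x_new += eps * p_new          # full step for position
        p_new += eps * grad(x_new)    # full step for momentum
    x_new += eps * p_new
    p_new += 0.5 * eps * grad(x_new)  # closing half step for momentum
    # accept or reject based on the change in the Hamiltonian
    h_old = -logp(x) + 0.5 * (p @ p)
    h_new = -logp(x_new) + 0.5 * (p_new @ p_new)
    return x_new if np.log(rng.uniform()) < h_old - h_new else x

The grad calls are exactly where analytic or automatic differentiation of the log-density enters, which is the potential the post is pointing at.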
4 0.068355441 434 andrew gelman stats-2010-11-28-When Small Numbers Lead to Big Errors
Introduction: My column in Scientific American. Check out the comments. I have to remember never ever to write about guns.
5 0.065888576 390 andrew gelman stats-2010-11-02-Fragment of statistical autobiography
Introduction: I studied math and physics at MIT. To be more precise, I started in math as default–ever since I was two years old, I’ve thought of myself as a mathematician, and I always did well in math class, so it seemed like a natural fit. But I was concerned. In high school I’d been in the U.S. Mathematical Olympiad training program, and there I’d met kids who were clearly much much better at math than I was. In retrospect, I don’t think I was as bad as I’d thought at the time: there were 24 kids in the program, and I was probably around #20, if that, but I think a lot of the other kids had more practice working on “math olympiad”-type problems. Maybe I was really something like the tenth-best in the group. Tenth-best or twentieth-best, whatever it was, I reached a crisis of confidence around my sophomore or junior year in college. At MIT, I started right off taking advanced math classes, and somewhere along the way I realized I wasn’t seeing the big picture. I was able to do the homework pr
7 0.058740363 1006 andrew gelman stats-2011-11-12-Val’s Number Scroll: Helping kids visualize math
8 0.05718071 1338 andrew gelman stats-2012-05-23-Advice on writing research articles
9 0.057070266 61 andrew gelman stats-2010-05-31-A data visualization manifesto
10 0.056569424 2014 andrew gelman stats-2013-09-09-False memories and statistical analysis
11 0.055833876 1190 andrew gelman stats-2012-02-29-Why “Why”?
12 0.055793341 421 andrew gelman stats-2010-11-19-Just chaid
13 0.054173172 769 andrew gelman stats-2011-06-15-Mr. P by another name . . . is still great!
14 0.051832236 734 andrew gelman stats-2011-05-28-Funniest comment ever
15 0.051487833 620 andrew gelman stats-2011-03-19-Online James?
16 0.050650194 2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year
17 0.050418992 970 andrew gelman stats-2011-10-24-Bell Labs
18 0.050030578 1428 andrew gelman stats-2012-07-25-The problem with realistic advice?
19 0.049807657 499 andrew gelman stats-2011-01-03-5 books
20 0.049534183 1668 andrew gelman stats-2013-01-11-My talk at the NY data visualization meetup this Monday!
topicId topicWeight
[(0, 0.1), (1, -0.027), (2, -0.026), (3, 0.036), (4, 0.05), (5, -0.031), (6, 0.013), (7, -0.008), (8, 0.017), (9, -0.008), (10, 0.002), (11, 0.005), (12, -0.001), (13, -0.015), (14, 0.007), (15, -0.005), (16, -0.006), (17, -0.007), (18, 0.013), (19, 0.012), (20, 0.007), (21, -0.027), (22, -0.007), (23, 0.019), (24, 0.013), (25, -0.002), (26, -0.022), (27, 0.007), (28, 0.017), (29, 0.004), (30, 0.007), (31, 0.002), (32, 0.006), (33, -0.04), (34, 0.001), (35, -0.038), (36, 0.007), (37, -0.007), (38, 0.029), (39, 0.005), (40, -0.021), (41, 0.04), (42, 0.018), (43, 0.026), (44, 0.002), (45, -0.024), (46, -0.018), (47, 0.0), (48, 0.002), (49, 0.014)]
simIndex simValue blogId blogTitle
same-blog 1 0.95448208 1048 andrew gelman stats-2011-12-09-Maze generation algorithms!
2 0.74867237 459 andrew gelman stats-2010-12-09-Solve mazes by starting at the exit
3 0.73596728 1764 andrew gelman stats-2013-03-15-How do I make my graphs?
Introduction: Someone who wishes to remain anonymous writes: I’ve been following your blog a long time and enjoy your posts on visualization/statistical graphics matters. I don’t recall however you ever describing the details of your setup for plotting. I’m a new R user (convert from matplotlib) and would love to know your thoughts on the ideal setup: do you use mainly the R base? Do you use lattice? What do you think of ggplot2? etc. I found ggplot2 nearly indecipherable until a recent eureka moment, and I think its default theme is a tremendous waste of ink (all those silly grey backgrounds and grids are really unnecessary), but if you customize that away it can be made to look like ordinary, pretty statistical graphs. Feel free to respond on your blog, but if you do, please remove my name from the post (my colleagues already make fun of me for thinking about visualization too much.) I love that last bit! Anyway, my response is that I do everything in base graphics (using my
4 0.73469293 970 andrew gelman stats-2011-10-24-Bell Labs
Introduction: Sining Chen told me they’re hiring in the statistics group at Bell Labs. I’ll do my bit for economic stimulus by announcing this job (see below). I love Bell Labs. I worked there for three summers, in a physics lab in 1985-86 under the supervision of Loren Pfeiffer, and by myself in the statistics group in 1990. I learned a lot working for Loren. He was a really smart and driven guy. His lab was a small set of rooms—in Bell Labs, everything’s in a small room, as they value the positive externality of close physical proximity of different labs, which you get by making each lab compact—and it was Loren, his assistant (a guy named Ken West who kept everything running in the lab), and three summer students: me, Gowton Achaibar, and a girl whose name I’ve forgotten. Gowton and I had a lot of fun chatting in the lab. One day I made a silly comment about Gowton’s accent—he was from Guyana and pronounced “three” as “tree”—and then I apologized and said: Hey, here I am making fun o
5 0.73302627 1190 andrew gelman stats-2012-02-29-Why “Why”?
Introduction: In old books (and occasionally new books), you see the word “Why” used to indicate a pause or emphasis in dialogue. For example, from 1952: “Why, how perfectly simple!” she said to herself. “The way to save Wilbur’s life is to play a trick on Zuckerman.” “If I can fool a bug,” thought Charlotte, “I can surely fool a man. People are not as smart as bugs.” That line about people and bugs was cute, but what really jumped out at me was the “Why.” I don’t think I’ve ever ever heard anyone use “Why” in that way in conversation, but I see it all the time in books, and every time it’s jarring. What’s the deal? Is it that people used to talk that way? Or is it a Wasp thing, some regional speech pattern that was captured in books because it was considered standard conversational speech? I suppose one way to learn more would be to watch a bunch of old movies. I could sort of imagine Jimmy Stewart beginning his sentences with “Why” all the time. Does anyone know more? P.S. I use
7 0.71767205 1747 andrew gelman stats-2013-03-03-More research on the role of puzzles in processing data graphics
8 0.71281409 61 andrew gelman stats-2010-05-31-A data visualization manifesto
9 0.70290369 266 andrew gelman stats-2010-09-09-The future of R
10 0.70106387 1734 andrew gelman stats-2013-02-23-Life in the C-suite: A graph that is both ugly and bad, and an unrelated story
11 0.70079219 2319 andrew gelman stats-2014-05-05-Can we make better graphs of global temperature history?
12 0.69986928 1463 andrew gelman stats-2012-08-19-It is difficult to convey intonation in typed speech
13 0.6974709 983 andrew gelman stats-2011-10-31-Skepticism about skepticism of global warming skepticism skepticism
14 0.69665098 741 andrew gelman stats-2011-06-02-At least he didn’t prove a false theorem
16 0.69567525 1938 andrew gelman stats-2013-07-14-Learning how to speak
17 0.69175363 727 andrew gelman stats-2011-05-23-My new writing strategy
18 0.68652469 1661 andrew gelman stats-2013-01-08-Software is as software does
19 0.68279034 1154 andrew gelman stats-2012-02-04-“Turn a Boring Bar Graph into a 3D Masterpiece”
20 0.68276364 1831 andrew gelman stats-2013-04-29-The Great Race
topicId topicWeight
[(15, 0.013), (21, 0.014), (22, 0.018), (24, 0.101), (61, 0.016), (79, 0.306), (86, 0.011), (89, 0.025), (99, 0.367)]
simIndex simValue blogId blogTitle
1 0.96449399 939 andrew gelman stats-2011-10-03-DBQQ rounding for labeling charts and communicating tolerances
Introduction: This is a mini research note, not deserving of a paper, but perhaps useful to others. It reinvents what has already appeared on this blog. Let’s say we have a line chart with numbers between 152.134 and 210.823, with a mean of 183.463. How should we label the chart with about 3 tics? Perhaps 152.134, 181.4785, and 210.823? Don’t do it! The objective is to fit about 3-7 tics at the optimal level of rounding. I use the following sequence:
- decimal rounding: fitting integer power and single-digit decimal i, rounding to i * 10^power (example: 100 200 300)
- binary: having power, fitting single-digit decimal i and binary b, rounding to 2*i/(1+b) * 10^power (150 200 250)
- (optional) quaternary: having power, fitting single-digit decimal i and quaternary q (0,1,2,3), rounding to 4*i/(1+q) * 10^power (150 175 200)
- quinary: having power, fitting single-digit decimal i and quinary f (0,1,2,3,4), rounding to 5*i/(1+f) * 10^power (160 180 200)
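A minimal sketch of one way to implement the DBQQ idea, reading the four levels as candidate tic spacings of a power of ten divided by 1, 2, 4, or 5 (the function name and the 3-7-tic scoring rule are mine, not the note’s):

import math

def dbqq_ticks(lo, hi, target=5):
    """Choose tic positions with spacing 10^power / divisor, where the
    divisor 1, 2, 4, or 5 corresponds to decimal, binary, quaternary,
    or quinary rounding. Aims for roughly 3-7 tics."""
    base = int(math.floor(math.log10(hi - lo)))
    best = None
    for power in (base - 1, base, base + 1):
        for divisor in (1, 2, 4, 5):  # D, B, Q, Q
            step = 10.0 ** power / divisor
            first = math.ceil(lo / step) * step  # first tic at or after lo
            n = int(math.floor((hi - first) / step)) + 1  # tics up to hi
            if 3 <= n <= 7 and (best is None or abs(n - target) < abs(best[0] - target)):
                best = (n, step, first)
    if best is None:
        return []  # no level fits the 3-7 window
    n, step, first = best
    return [first + k * step for k in range(n)]

On the note’s example, dbqq_ticks(152.134, 210.823, target=3) picks the quinary level and returns 160, 180, 200, matching the quinary example above; with target=5 it picks the plain decimal tics 160, 170, ..., 210.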
2 0.95789444 1538 andrew gelman stats-2012-10-17-Rust
Introduction: I happened to be referring to the path sampling paper today and took a look at Appendix A.2: I’m sure I could reconstruct all of this if I had to, but I certainly can’t read this sort of thing cold anymore.
3 0.93510246 1515 andrew gelman stats-2012-09-29-Jost Haidt
Introduction: Research psychologist John Jost reviews the recent book, “The Righteous Mind,” by research psychologist Jonathan Haidt. Some of my thoughts on Haidt’s book are here. And here’s some of Jost’s review: Haidt’s book is creative, interesting, and provocative. . . . The book shines a new light on moral psychology and presents a bold, confrontational message. From a scientific perspective, however, I worry that his theory raises more questions than it answers. Why do some individuals feel that it is morally good (or necessary) to obey authority, favor the ingroup, and maintain purity, whereas others are skeptical? (Perhaps parenting style is relevant after all.) Why do some people think that it is morally acceptable to judge or even mistreat others such as gay or lesbian couples or, only a generation ago, interracial couples because they dislike or feel disgusted by them, whereas others do not? Why does the present generation “care about violence toward many more classes of victims
4 0.92171103 845 andrew gelman stats-2011-08-08-How adoption speed affects the abandonment of cultural tastes
Introduction: Interesting article by Jonah Berger and Gael Le Mens: Products, styles, and social movements often catch on and become popular, but little is known about why such identity-relevant cultural tastes and practices die out. We demonstrate that the velocity of adoption may affect abandonment: Analysis of over 100 years of data on first-name adoption in both France and the United States illustrates that cultural tastes that have been adopted quickly die faster (i.e., are less likely to persist). Mirroring this aggregate pattern, at the individual level, expecting parents are more hesitant to adopt names that recently experienced sharper increases in adoption. Further analyses indicate that these effects are driven by concerns about symbolic value: Fads are perceived negatively, so people avoid identity-relevant items with sharply increasing popularity because they believe that they will be short lived. Ancillary analyses also indicate that, in contrast to conventional wisdom, identity-r
5 0.91081703 1126 andrew gelman stats-2012-01-18-Bob on Stan
Introduction: Thurs 19 Jan 7pm at the NYC Machine Learning meetup. Stan’s entirely publicly funded and open-source and it has no secrets. Ask us about it and we’ll tell you everything you might want to know. P.S. And here’s the talk.
6 0.90910441 469 andrew gelman stats-2010-12-16-2500 people living in a park in Chicago?
same-blog 7 0.90628409 1048 andrew gelman stats-2011-12-09-Maze generation algorithms!
8 0.87640965 1172 andrew gelman stats-2012-02-17-Rare name analysis and wealth convergence
9 0.87150049 863 andrew gelman stats-2011-08-21-Bad graph
10 0.86008519 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions
11 0.85994291 1825 andrew gelman stats-2013-04-25-It’s binless! A program for computing normalizing functions
13 0.84638286 1379 andrew gelman stats-2012-06-14-Cool-ass signal processing using Gaussian processes (birthdays again)
14 0.82619345 1384 andrew gelman stats-2012-06-19-Slick time series decomposition of the birthdays data
16 0.81998003 1229 andrew gelman stats-2012-03-25-Same old story
17 0.81614369 1041 andrew gelman stats-2011-12-04-David MacKay and Occam’s Razor
18 0.80601686 2274 andrew gelman stats-2014-03-30-Adjudicating between alternative interpretations of a statistical interaction?
19 0.80065304 1714 andrew gelman stats-2013-02-09-Partial least squares path analysis
20 0.79711878 823 andrew gelman stats-2011-07-26-Including interactions or not