
1824 andrew gelman stats-2013-04-25-Fascinating graphs from facebook data


meta info for this blog

Source: html

Introduction: Yair points us to this page full of wonderful graphs from the Stephen Wolfram blog. Here are a few: And some words: People talk less about video games as they get older, and more about politics and the weather. Men typically talk more about sports and technology than women—and, somewhat surprisingly to me, they also talk more about movies, television and music. Women talk more about pets+animals, family+friends, relationships—and, at least after they reach child-bearing years, health. . . . Some of this is rather depressingly stereotypical. And most of it isn’t terribly surprising to anyone who’s known a reasonable diversity of people of different ages. But what to me is remarkable is how we can see everything laid out in such quantitative detail in the pictures above—kind of a signature of people’s thinking as they go through life. Of course, the pictures above are all based on aggregate data, carefully anonymized. But if we start looking at individuals, we’ll s


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Yair points us to this page full of wonderful graphs from the Stephen Wolfram blog. [sent-1, score-0.261]

2 Here are a few: And some words: People talk less about video games as they get older, and more about politics and the weather. [sent-2, score-0.332]

3 Men typically talk more about sports and technology than women—and, somewhat surprisingly to me, they also talk more about movies, television and music. [sent-3, score-0.662]

4 Women talk more about pets+animals, family+friends, relationships—and, at least after they reach child-bearing years, health. [sent-4, score-0.238]

5 And most of it isn’t terribly surprising to anyone who’s known a reasonable diversity of people of different ages. [sent-9, score-0.439]

6 But what to me is remarkable is how we can see everything laid out in such quantitative detail in the pictures above—kind of a signature of people’s thinking as they go through life. [sent-10, score-0.47]

7 Of course, the pictures above are all based on aggregate data, carefully anonymized. [sent-11, score-0.251]

8 That’s why I’m posting this, in order to spread the word, to inspire others to do this sort of statistical exploration. [sent-18, score-0.24]

9 I wonder who did the analysis, who made the graphs, and who wrote the text. [sent-23, score-0.098]

10 It’s posted on the Stephen Wolfram Blog, but Wolfram is known for contracting out his research. [sent-25, score-0.223]

11 It’s funny: in academia, allocation of credit and attribution of authorship is huge. [sent-27, score-0.495]

12 As an academic, I’d like to give credit to whoever made these pretty graphs, but perhaps from Wolfram’s perspective, whoever made the graphs is just doing a job, just like whoever sweeps the floors in the lab or whoever cleans the erasers in the classroom. [sent-29, score-2.391]

13 Even if he didn’t do any of the work on this, it takes skill to hire the right people to do the job. [sent-31, score-0.24]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('wolfram', 0.475), ('whoever', 0.359), ('credit', 0.195), ('graphs', 0.187), ('pictures', 0.174), ('talk', 0.169), ('stephen', 0.148), ('pets', 0.127), ('women', 0.124), ('contracting', 0.12), ('sweeps', 0.12), ('floors', 0.12), ('allocation', 0.115), ('signature', 0.115), ('terribly', 0.105), ('known', 0.103), ('inspire', 0.1), ('authorship', 0.098), ('made', 0.098), ('television', 0.094), ('job', 0.093), ('laid', 0.091), ('skill', 0.091), ('movies', 0.091), ('remarkable', 0.09), ('animals', 0.09), ('diversity', 0.089), ('attribution', 0.087), ('video', 0.086), ('yair', 0.085), ('academia', 0.083), ('flexible', 0.083), ('surprisingly', 0.082), ('relationships', 0.08), ('industry', 0.08), ('hire', 0.079), ('aggregate', 0.077), ('older', 0.077), ('games', 0.077), ('technology', 0.076), ('wonderful', 0.074), ('spread', 0.073), ('surprising', 0.072), ('sports', 0.072), ('lab', 0.07), ('people', 0.07), ('reach', 0.069), ('computing', 0.068), ('give', 0.067), ('posting', 0.067)]
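The page does not say how these weights were produced. As a rough sketch only, assuming a common tf-idf formulation (raw term count times log(N/df), then L2 normalisation) and whitespace tokenisation, a per-document word ranking like the one above could be computed as follows. The function names and the three toy documents are hypothetical and are not taken from the original pipeline.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {word: weight} dict per document, L2-normalised.

    Assumed formulation: weight = term count * log(N / document frequency).
    """
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: number of documents in which each word appears.
    df = Counter(word for tokens in tokenised for word in set(tokens))
    weights = []
    for tokens in tokenised:
        tf = Counter(tokens)
        raw = {w: tf[w] * math.log(n_docs / df[w]) for w in tf}
        # Guard against an all-zero vector (every word in every document).
        norm = math.sqrt(sum(v * v for v in raw.values())) or 1.0
        weights.append({w: v / norm for w, v in raw.items()})
    return weights


def top_words(weight, n=5):
    """Return the n highest-weighted (word, weight) pairs, largest first."""
    return sorted(weight.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Hypothetical toy corpus, for illustration only.
docs = [
    "wolfram graphs wolfram credit",
    "graphs talk slides",
    "talk credit data",
]
weights = tfidf_weights(docs)
print(top_words(weights[0], n=3))
```

In this toy corpus "wolfram" ranks first for the first document because it occurs twice there and nowhere else, which is the same reason it tops the list above.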

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 1824 andrew gelman stats-2013-04-25-Fascinating graphs from facebook data


2 0.24103373 1207 andrew gelman stats-2012-03-10-A quick suggestion

Introduction: Next time Stephen Wolfram is on the phone, maybe he could call the head of Human Resources at his company and get this guy fired?

3 0.22822642 735 andrew gelman stats-2011-05-28-New app for learning intro statistics

Introduction: Carol Cronin writes: The new Wolfram Statistics Course Assistant App, which was released today for the iPhone, iPod touch, and iPad. Optimized for mobile devices, the Wolfram Statistics Course Assistant App helps students understand concepts such as mean, median, mode, standard deviation, probabilities, data points, random integers, random real numbers, and more. To see some examples of how you and your readers can use the app, I’d like to encourage you to check out this post on the Wolfram|Alpha Blog. If anybody out there with an i-phone etc. wants to try this out, please let me know how it works. I’m always looking for statistics-learning tools for students. I’m not really happy with the whole “mean, median, mode” thing (see above), but if the app has good things, then an instructor could pick and choose what to recommend, I assume. P.S. This looks better than the last Wolfram initiative we encountered.

4 0.19995901 1421 andrew gelman stats-2012-07-19-Alexa, Maricel, and Marty: Three cellular automata who got on my nerves

Introduction: I received the following two emails within fifteen minutes of each other. First, from “Alexa Russell,” subject line “An idea for a blog post: The Role, Importance, and Power of Words”: Hi Andrew, I’m a researcher/writer for a resource covering the importance of English proficiency in today’s workplace. I came across your blog andrewgelman.com as I was conducting research and I’m interested in contributing an article to your blog because I found the topics you cover very engaging. I’m thinking about writing an article that looks at how the Internet has changed the way English is used today; not only has its syntax changed as a result of the Internet Revolution, but the amount of job opportunities has also shifted as a result of this shift. I’d be happy to work with you on the topic if you have any insights. Thanks, and I look forward to hearing from you soon. Best, Alexa Second, From “Maricel Anderson,” subject line “An idea for a blog post: Healthcare Management and Geri

5 0.17381959 28 andrew gelman stats-2010-05-12-Alert: Incompetent colleague wastes time of hardworking Wolfram Research publicist

Introduction: Marty McKee at Wolfram Research appears to have a very very stupid colleague. McKee wrote to Christian Robert: Your article, “Evidence and Evolution: A review”, caught the attention of one of my colleagues, who thought that it could be developed into an interesting Demonstration to add to the Wolfram Demonstrations Project. As Christian points out, adapting his book review into a computer demonstration would be quite a feat! I wonder what McKee’s colleague could be thinking? I recommend that Wolfram fire McKee’s colleague immediately: what an idiot! P.S. I’m not actually sure that McKee was the author of this email; I’m guessing this was the case because this other very similar email was written under his name. P.P.S. To head off the inevitable comments: Yes, yes, I know this is no big deal and I shouldn’t get bent out of shape about it. But . . . Wolfram Research has contributed such great things to the world, that I hate to think of them wasting any money paying

6 0.14297305 545 andrew gelman stats-2011-01-30-New innovations in spam

7 0.12843049 1784 andrew gelman stats-2013-04-01-Wolfram on Mandelbrot

8 0.12688293 574 andrew gelman stats-2011-02-14-“The best data visualizations should stand on their own”? I don’t think so.

9 0.1159436 680 andrew gelman stats-2011-04-26-My talk at Berkeley on Wednesday

10 0.11520679 1668 andrew gelman stats-2013-01-11-My talk at the NY data visualization meetup this Monday!

11 0.11198982 1450 andrew gelman stats-2012-08-08-My upcoming talk for the data visualization meetup

12 0.10657461 407 andrew gelman stats-2010-11-11-Data Visualization vs. Statistical Graphics

13 0.10258508 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update

14 0.10231629 548 andrew gelman stats-2011-02-01-What goes around . . .

15 0.10011046 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics

16 0.099201903 2275 andrew gelman stats-2014-03-31-Just gave a talk

17 0.098672807 252 andrew gelman stats-2010-09-02-R needs a good function to make line plots

18 0.095328905 481 andrew gelman stats-2010-12-22-The Jumpstart financial literacy survey and the different purposes of tests

19 0.091582268 1848 andrew gelman stats-2013-05-09-A tale of two discussion papers

20 0.090755165 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.155), (1, -0.079), (2, -0.052), (3, 0.05), (4, 0.075), (5, -0.055), (6, -0.029), (7, 0.033), (8, -0.02), (9, -0.011), (10, -0.003), (11, -0.018), (12, 0.011), (13, 0.006), (14, -0.011), (15, -0.022), (16, 0.015), (17, -0.044), (18, 0.033), (19, 0.012), (20, -0.056), (21, -0.071), (22, 0.053), (23, -0.013), (24, -0.025), (25, 0.012), (26, -0.061), (27, -0.016), (28, 0.002), (29, -0.006), (30, -0.007), (31, 0.052), (32, 0.006), (33, -0.027), (34, 0.062), (35, 0.045), (36, 0.057), (37, -0.005), (38, 0.019), (39, 0.017), (40, -0.002), (41, -0.029), (42, 0.004), (43, 0.007), (44, -0.003), (45, -0.024), (46, 0.001), (47, -0.01), (48, 0.023), (49, -0.018)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96635544 1824 andrew gelman stats-2013-04-25-Fascinating graphs from facebook data


2 0.85396785 492 andrew gelman stats-2010-12-30-That puzzle-solving feeling

Introduction: Since this blog in November, I’ve given my talk on infovis vs. statistical graphics about five times: once in person (at the visualization meetup in NYC, a blog away from Num Pang!) and the rest via telephone conferencing or skype. The live presentation was best, but the remote talks have been improving, and I’m looking forward to doing more of these in the future to save time and reduce pollution. Here are the powerpoints of the talk. Now that I’ve got it working well (mostly by cutting lots of words on the slides), my next step will be to improve the interactive experience. At the very least, I need to allocate time after the talk for discussion. People usually don’t ask a lot of questions when I speak, so maybe the best strategy is to allow a half hour following the talk for people to speak with me individually. It could be set up so that I’m talking with one person but the others who are hanging out could hear the conversation too. Anyway, one of the times I gave th

3 0.80773491 407 andrew gelman stats-2010-11-11-Data Visualization vs. Statistical Graphics

Introduction: I have this great talk on the above topic but nowhere to give it. Here’s the story. Several months ago, I was invited to speak at IEEE VisWeek. It sounded like a great opportunity. The organizer told me that there were typically about 700 people in the audience, and these are people in the visualization community whom I’d like to reach but normally wouldn’t have the opportunity to encounter. It sounded great, but I didn’t want to fly most of the way across the country by myself, so I offered to give the talk by videolink. I was surprised to get a No response: I’d think that a visualization conference, of all things, would welcome a video talk. In the meantime, though, I’d thought a lot about what I’d talk about and had started preparing something. Once I found out I wouldn’t be giving the talk, I channeled the efforts into an article which, with the collaboration of Antony Unwin, was completed about a month ago. It would take very little effort to adapt this graph-laden a

4 0.76405007 1598 andrew gelman stats-2012-11-30-A graphics talk with no visuals!

Introduction: So, I’m at MIT, twenty minutes into my talk on tradeoffs in information graphics to the computer scientists, when the power goes out. They had some dim backup lighting so we weren’t all sitting there in the dark, but the projector wasn’t working. So I took questions for the remaining 40 minutes. It went well, perhaps better than the actual talk would’ve gone, even though they didn’t get to see most of my slides.

5 0.75955969 794 andrew gelman stats-2011-07-09-The quest for the holy graph

Introduction: Eytan Adar writes: I was just going through the latest draft of your paper with Anthony Unwin. I heard part of it at the talk you gave (remotely) here at UMich. I’m curious about your discussion of the Baby Name Voyager. The tool in itself is simple, attractive, and useful. No argument from me there. It’s an awesome demonstration of how subtle interactions can be very helpful (click and it zooms, type and it filters… falls perfectly into the Shneiderman visualization mantra). It satisfies a very common use case: finding appropriate names for children. That said, I can’t help but feeling that what you are really excited about is the very static analysis on last letters (you spend most of your time on this). This analysis, incidentally, is not possible to infer from the interactive application (which doesn’t support this type of filtering and pivoting). In a sense, the two visualizations don’t have anything to do with each other (other than a shared context/dataset).

6 0.74820346 438 andrew gelman stats-2010-11-30-I just skyped in from Kentucky, and boy are my arms tired

7 0.74819177 1673 andrew gelman stats-2013-01-15-My talk last night at the visualization meetup

8 0.71536177 2275 andrew gelman stats-2014-03-31-Just gave a talk

9 0.7055999 1050 andrew gelman stats-2011-12-10-Presenting at the econ seminar

10 0.70474094 816 andrew gelman stats-2011-07-22-“Information visualization” vs. “Statistical graphics”

11 0.70447034 319 andrew gelman stats-2010-10-04-“Who owns Congress”

12 0.70420736 2065 andrew gelman stats-2013-10-17-Cool dynamic demographic maps provide beautiful illustration of Chris Rock effect

13 0.70380193 1125 andrew gelman stats-2012-01-18-Beautiful Line Charts

14 0.70018327 913 andrew gelman stats-2011-09-16-Groundhog day in August?

15 0.69701451 847 andrew gelman stats-2011-08-10-Using a “pure infographic” to explore differences between information visualization and statistical graphics

16 0.6965577 2323 andrew gelman stats-2014-05-07-Cause he thinks he’s so-phisticated

17 0.69296485 855 andrew gelman stats-2011-08-16-Infovis and statgraphics update update

18 0.68431568 1734 andrew gelman stats-2013-02-23-Life in the C-suite: A graph that is both ugly and bad, and an unrelated story

19 0.68242741 798 andrew gelman stats-2011-07-12-Sometimes a graph really is just ugly

20 0.68086869 1143 andrew gelman stats-2012-01-29-G+ > Skype


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.012), (15, 0.019), (16, 0.134), (21, 0.185), (22, 0.011), (24, 0.163), (30, 0.01), (65, 0.028), (70, 0.011), (86, 0.018), (93, 0.013), (95, 0.028), (99, 0.262)]
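The topic vector above is sparse: it lists only (topicId, topicWeight) pairs with non-negligible weight. The page does not state which similarity measure produces the simValue column below; a minimal sketch, assuming cosine similarity over such sparse vectors, might look like this. The function name `cosine_sparse` and the second vector `other_post` are hypothetical; `this_post` copies a few of the pairs listed above.

```python
import math


def cosine_sparse(a, b):
    """Cosine similarity of two sparse vectors given as (index, weight) pairs."""
    da, db = dict(a), dict(b)
    # Only indices present in both vectors contribute to the dot product.
    dot = sum(w * db[i] for i, w in da.items() if i in db)
    norm_a = math.sqrt(sum(w * w for w in da.values()))
    norm_b = math.sqrt(sum(w * w for w in db.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# A few of the (topicId, topicWeight) pairs listed above.
this_post = [(16, 0.134), (21, 0.185), (24, 0.163), (99, 0.262)]
# Hypothetical second post, invented for illustration.
other_post = [(21, 0.2), (40, 0.3), (99, 0.1)]

print(cosine_sparse(this_post, other_post))
```

Under this assumption a post compared with itself scores 1 up to floating-point error, and posts that share no topics score 0.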

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97733748 1826 andrew gelman stats-2013-04-26-“A Vast Graveyard of Undead Theories: Publication Bias and Psychological Science’s Aversion to the Null”

Introduction: Erin Jonaitis points us to this article by Christopher Ferguson and Moritz Heene, who write: Publication bias remains a controversial issue in psychological science. . . . that the field often constructs arguments to block the publication and interpretation of null results and that null results may be further extinguished through questionable researcher practices. Given that science is dependent on the process of falsification, we argue that these problems reduce psychological science’s capability to have a proper mechanism for theory falsification, thus resulting in the promulgation of numerous “undead” theories that are ideologically popular but have little basis in fact. They mention the infamous Daryl Bem article. It is pretty much only because Bem’s claims are (presumably) false that they got published in a major research journal. Had the claims been true—that is, had Bem run identical experiments, analyzed his data more carefully and objectively, and reported that the r

2 0.9670366 1275 andrew gelman stats-2012-04-22-Please stop me before I barf again

Introduction: Pointing to some horrible graphs, Kaiser writes, “The Earth Institute needs a graphics adviser.” I agree. The graphs are corporate standard, neither pretty nor innovative enough to qualify as infographics, not informational enough to be good statistical data displays. Some examples include the above exploding pie chart, which, as Kaiser notes, is not merely ugly and ridiculously difficult to read (given that it is conveying only nine data points) but also invites suspicion of its numbers, and pages and pages of graphs that could be better compressed into compact displays (see pages 25-65 of the report). Yes, this is all better than tables of numbers, but I don’t see that much thought went into displaying patterns of information or telling a story. It’s more graph-as-data-dump. To be fair, the report does have a clean scatterplot (on page 65). But, overall, the graphs are not well-integrated with the messages in the text. I feel a little bit bad about this, beca

3 0.96472692 1615 andrew gelman stats-2012-12-10-A defense of Tom Wolfe based on the impossibility of the law of small numbers in network structure

Introduction: A tall thin young man came to my office today to talk about one of my current pet topics: stories and social science. I brought up Tom Wolfe and his goal of compressing an entire city into a single novel, and how this reminded me of the psychologists Kahneman and Tversky’s concept of “the law of small numbers,” the idea that we expect any small sample to replicate all the properties of the larger population that it represents. Strictly speaking, the law of small numbers is impossible—any small sample necessarily has its own unique features—but this is even more true if we consider network properties. The average American knows about 700 people (depending on how you define “know”) and this defines a social network over the population. Now suppose you look at a few hundred people and all their connections. This mini-network will almost necessarily look much much sparser than the national network, as we’re removing the connections to the people not in the sample. Now consider how

4 0.95955354 2306 andrew gelman stats-2014-04-26-Sleazy sock puppet can’t stop spamming our discussion of compressed sensing and promoting the work of Xiteng Liu

Introduction: Some asshole who has a bug up his ass about compressed sensing is spamming our comments with a bunch of sock puppets. All from the same IP address: “George Stoneriver,” “Scott Wolfe,” and just plain “Paul,” all saying pretty much the same thing in the same sort of broken English (except for Paul, whose post was too short to do a dialect analysis). “Scott Wolfe” is a generic sort of name, but a quick google search reveals nothing related to this topic. “George Stoneriver” seems to have no internet presence at all (besides the comments at this blog). As for “Paul,” I don’t know, maybe the spammer was too lazy to invent a last name? Our spammer spends about half his time slamming the field of compressed sensing and the other half pumping up the work of someone named Xiteng Liu. There’s no excuse for this behavior. It’s horrible, a true abuse of our scholarly community. If Scott Adams wants to use a sock puppet, fine, the guy’s an artist and we should cut him some slack. If tha

5 0.94887948 514 andrew gelman stats-2011-01-13-News coverage of statistical issues…how did I do?

Introduction: This post is by Phil Price. A reporter once told me that the worst-kept secret of journalism is that every story has errors. And it’s true that just about every time I know about something first-hand, the news stories about it have some mistakes. Reporters aren’t subject-matter experts, they have limited time, and they generally can’t keep revisiting the things they are saying and checking them for accuracy. Many of us have published papers with errors — my most recent paper has an incorrect figure — and that’s after working on them carefully for weeks! One way that reporters can try to get things right is by quoting experts. Even then, there are problems with taking quotes out of context, or with making poor choices about what material to include or exclude, or, of course, with making a poor selection of experts. Yesterday, I was interviewed by an NPR reporter about the risks of breathing radon (a naturally occurring radioactive gas): who should test for it, how dangerous

6 0.94696391 432 andrew gelman stats-2010-11-27-Neumann update

7 0.94681954 1675 andrew gelman stats-2013-01-15-“10 Things You Need to Know About Causal Effects”

8 0.94607633 1728 andrew gelman stats-2013-02-19-The grasshopper wins, and Greg Mankiw’s grandmother would be “shocked and appalled” all over again

9 0.94548041 1401 andrew gelman stats-2012-06-30-David Hogg on statistics

same-blog 10 0.94544315 1824 andrew gelman stats-2013-04-25-Fascinating graphs from facebook data

11 0.94293511 659 andrew gelman stats-2011-04-13-Jim Campbell argues that Larry Bartels’s “Unequal Democracy” findings are not robust

12 0.94083917 62 andrew gelman stats-2010-06-01-Two Postdoc Positions Available on Bayesian Hierarchical Modeling

13 0.93761432 2037 andrew gelman stats-2013-09-25-Classical probability does not apply to quantum systems (causal inference edition)

14 0.93703103 672 andrew gelman stats-2011-04-20-The R code for those time-use graphs

15 0.93550539 151 andrew gelman stats-2010-07-16-Wanted: Probability distributions for rank orderings

16 0.93426192 810 andrew gelman stats-2011-07-20-Adding more information can make the variance go up (depending on your model)

17 0.93039644 894 andrew gelman stats-2011-09-07-Hipmunk FAIL: Graphics without content is not enough

18 0.92831743 537 andrew gelman stats-2011-01-25-Postdoc Position #1: Missing-Data Imputation, Diagnostics, and Applications

19 0.92422318 789 andrew gelman stats-2011-07-07-Descriptive statistics, causal inference, and story time

20 0.92373586 1755 andrew gelman stats-2013-03-09-Plaig