491 andrew gelman stats-2010-12-29-Don’t try this at home


meta infos for this blog

Source: html

Introduction: Malecki’s right, this is very cool indeed. P.S. Is it really true that “4.5 million Parisians” ride the Metro every day? Even setting aside that not all the riders are Parisians, I’m guessing that 4.5 million is the number of rides, not the number of people who ride.
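The rides-versus-riders point can be made concrete with a quick calculation. This is a minimal sketch: the 4.5 million figure comes from the post, while the rides-per-rider values are hypothetical assumptions chosen only for illustration, not measured data.

```python
def riders_from_rides(total_rides, rides_per_rider):
    """Estimate distinct riders from a ride count.

    rides_per_rider is an assumed average; it is not given in the post.
    """
    if rides_per_rider <= 0:
        raise ValueError("rides_per_rider must be positive")
    return total_rides / rides_per_rider


daily_rides = 4_500_000  # figure quoted in the post, read as rides per day

# Hypothetical averages: 1 ride each, a round trip, a round trip plus extras.
for assumed_average in (1.0, 2.0, 2.5):
    estimate = riders_from_rides(daily_rides, assumed_average)
    print(f"{assumed_average} rides per rider -> about {estimate:,.0f} riders")
```

If the typical rider makes a round trip, the same 4.5 million rides would correspond to roughly half as many people, which is the distinction the post is drawing.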


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Even setting aside that not all the riders are Parisians, I’m guessing that 4. [sent-6, score-0.663]

2 5 million is the number of rides, not the number of people who ride. [sent-7, score-0.644]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('parisians', 0.622), ('ride', 0.4), ('million', 0.273), ('metro', 0.255), ('riders', 0.255), ('rides', 0.247), ('malecki', 0.228), ('number', 0.166), ('guessing', 0.152), ('cool', 0.135), ('aside', 0.132), ('setting', 0.124), ('day', 0.093), ('every', 0.088), ('true', 0.088), ('right', 0.064), ('really', 0.048), ('even', 0.043), ('people', 0.039)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 491 andrew gelman stats-2010-12-29-Don’t try this at home

2 0.12221757 474 andrew gelman stats-2010-12-18-The kind of frustration we could all use more of

Introduction: Nate writes: The Yankees have offered Jeter $45 million over three years — or $15 million per year. . . But that doesn’t mean that the process won’t be frustrating for Jeter, or that there won’t be a few hurt feelings along the way. . . . $45 million, huh? Even after taxes, that’s a lot of money!

3 0.10728284 1006 andrew gelman stats-2011-11-12-Val’s Number Scroll: Helping kids visualize math

Introduction: This looks cool.

4 0.092205547 1845 andrew gelman stats-2013-05-07-Is Felix Salmon wrong on free TV?

Introduction: Mark Palko writes: Salmon is dismissive of the claim that there are fifty million over-the-air television viewers: The 50 million number, by the way, should not be considered particularly reliable: it’s Aereo’s guess as to the number of people who ever watch free-to-air TV, even if they mainly watch cable or satellite. (Maybe they have a hut somewhere with an old rabbit-ear TV in it.) And he strongly suggests the number is not only smaller but shrinking. By comparison, here’s a story from the broadcasting news site TV News Check from June of last year (if anyone has more recent numbers please let me know): According to new research by GfK Media, the number of Americans now relying solely on over-the-air (OTA) television reception increased to almost 54 million, up from 46 million just a year ago. The recently completed survey also found that the demographics of broadcast-only households skew towards younger adults, minorities and lower-income families. As Palko says,

5 0.086023003 1356 andrew gelman stats-2012-05-31-Question 21 of my final exam for Design and Analysis of Sample Surveys

Introduction: 21. A country is divided into three regions with populations of 2 million, 2 million, and 0.5 million, respectively. A survey is done asking about foreign policy opinions. Somebody proposes taking a sample of 50 people from each region. Give a reason why this non-proportional sample would not usually be done, and also a reason why it might actually be a good idea. Solution to question 20 From yesterday: 20. Explain in two sentences why we expect survey respondents to be honest about vote preferences but possibly dishonest about reporting unhealthy behaviors. Solution: Respondents tend to be sincere about vote preferences because this affects the outcome of the poll, and people are motivated to have their candidate poll well. This motivation is typically not present in reporting behaviors; you have no particular reason for wanting to affect the average survey response.
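The sampling design in question 21 can be laid out numerically. The sketch below uses the populations and the 50-per-region sample from the question; the function name and the choice to hold the total sample at 150 for the comparison are assumptions made for illustration, and the code does not attempt to answer the exam question.

```python
def proportional_allocation(populations, total_sample):
    """Split total_sample across strata in proportion to population size."""
    total_population = sum(populations)
    return [total_sample * p / total_population for p in populations]


populations = [2_000_000, 2_000_000, 500_000]  # from the exam question
equal_sample = [50, 50, 50]                    # the proposed design

# Proportional design with the same total sample size (an assumption).
proportional_sample = proportional_allocation(populations, sum(equal_sample))

# Sampling fraction per region under the proposed equal allocation.
sampling_fractions = [n / p for n, p in zip(equal_sample, populations)]

for region, (n_prop, frac) in enumerate(
    zip(proportional_sample, sampling_fractions), start=1
):
    print(f"region {region}: proportional n = {n_prop:.1f}, "
          f"equal-allocation sampling fraction = {frac:.6f}")
```

Under the equal allocation, people in the smallest region are sampled at four times the rate of people in the two larger regions, so a national estimate would need weights to undo that imbalance.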

6 0.082978718 2219 andrew gelman stats-2014-02-21-The world’s most popular languages that the Mac documentation hasn’t been translated into

7 0.075849339 526 andrew gelman stats-2011-01-19-“If it saves the life of a single child…” and other nonsense

8 0.074716404 1147 andrew gelman stats-2012-01-30-Statistical Murder

9 0.074592672 1358 andrew gelman stats-2012-06-01-Question 22 of my final exam for Design and Analysis of Sample Surveys

10 0.072853707 1064 andrew gelman stats-2011-12-16-The benefit of the continuous color scale

11 0.072005667 1546 andrew gelman stats-2012-10-24-Hey—has anybody done this study yet?

12 0.071915247 975 andrew gelman stats-2011-10-27-Caffeine keeps your Mac awake

13 0.067145757 1905 andrew gelman stats-2013-06-18-There are no fat sprinters

14 0.066650853 1938 andrew gelman stats-2013-07-14-Learning how to speak

15 0.064356312 448 andrew gelman stats-2010-12-03-This is a footnote in one of my papers

16 0.063948885 924 andrew gelman stats-2011-09-24-“Income can’t be used to predict political opinion”

17 0.063463151 925 andrew gelman stats-2011-09-26-Ethnicity and Population Structure in Personal Naming Networks

18 0.061025057 153 andrew gelman stats-2010-07-17-Tenure-track position at U. North Carolina in survey methods and social statistics

19 0.0587634 894 andrew gelman stats-2011-09-07-Hipmunk FAIL: Graphics without content is not enough

20 0.057498448 2270 andrew gelman stats-2014-03-28-Creating a Lenin-style democracy
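The page does not say how simValue is computed. A common choice for comparing tfidf vectors is cosine similarity, and the sketch below assumes that. The weights for post 491 are a subset of the list above; `other_post` and its weights are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight}."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Subset of the tfidf weights listed for this post.
post_491 = {
    "parisians": 0.622,
    "ride": 0.4,
    "million": 0.273,
    "metro": 0.255,
    "cool": 0.135,
}

# Hypothetical vector for another post that shares only two terms.
other_post = {"million": 0.5, "cool": 0.3, "money": 0.4}

print(cosine_similarity(post_491, post_491))
print(cosine_similarity(post_491, other_post))
```

Under this assumption a post compared with itself scores 1.0, which is consistent with the same-blog row, and posts that share only a few weighted terms such as "million" or "cool" score low.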


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.04), (1, -0.02), (2, 0.019), (3, 0.01), (4, 0.01), (5, 0.003), (6, 0.021), (7, -0.003), (8, -0.004), (9, -0.027), (10, -0.013), (11, -0.025), (12, 0.006), (13, 0.007), (14, -0.022), (15, 0.007), (16, 0.032), (17, -0.01), (18, 0.03), (19, 0.0), (20, -0.012), (21, -0.001), (22, -0.006), (23, 0.002), (24, -0.034), (25, -0.008), (26, -0.02), (27, 0.011), (28, 0.001), (29, 0.011), (30, -0.014), (31, -0.044), (32, 0.036), (33, -0.051), (34, 0.016), (35, -0.024), (36, -0.015), (37, -0.001), (38, -0.025), (39, 0.003), (40, -0.027), (41, -0.028), (42, -0.023), (43, -0.023), (44, 0.006), (45, -0.002), (46, -0.02), (47, -0.013), (48, 0.013), (49, -0.005)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96365821 491 andrew gelman stats-2010-12-29-Don’t try this at home

2 0.71813554 474 andrew gelman stats-2010-12-18-The kind of frustration we could all use more of

Introduction: Nate writes: The Yankees have offered Jeter $45 million over three years — or $15 million per year. . . But that doesn’t mean that the process won’t be frustrating for Jeter, or that there won’t be a few hurt feelings along the way. . . . $45 million, huh? Even after taxes, that’s a lot of money!

3 0.63166487 68 andrew gelman stats-2010-06-03-…pretty soon you’re talking real money.

Introduction: A New York Times article reports the opening of a half-mile section of bike path, recently built along the west side of Manhattan at a cost of $16M, or roughly $30 million per mile. That’s about $5700 per linear foot. Kinda sounds like a lot, doesn’t it? Well, $30 million per mile for about one car-lane mile is a lot, but it’s not out of line compared to other urban highway construction costs. The Doyle Drive project in San Francisco — a freeway to replace the current old and deteriorating freeway approach to the Golden Gate Bridge — is currently under way at $1 billion for 1.6 miles…but hey, it will have six lanes each way, so that isn’t so bad, at $50 million per lane-mile. And there are other components to the project, too, not just building the highway (there will also be bike paths, landscaping, on- and off-ramps, and so on). All in all it seems roughly in line with the New York bike lane project. Speaking of the Doyle Drive project, one expense was the cost of movin

4 0.63050187 1342 andrew gelman stats-2012-05-24-The Used TV Price is Too Damn High

Introduction: Rohin Dhar points me to this post: At Priceonomics, we’ve learned that our users don’t want to buy used products. Rather, they want to buy inexpensive products, and used items happen to be inexpensive. Let someone else eat the initial depreciation, Priceonomics users will swoop in later and get a good deal. . . . But if you want to buy a used television, you are in for a world of hurt. As you peruse through the Craigslist listings for used TVs, you may notice something surprising – the prices are kind of high. Do a quick check on Amazon and your suspicions will be confirmed; lots of people try to sell their used television for more than that same TV would cost brand new. . . . To test our suspicions that something was amiss in the used television market, we compared used TV prices to the prices of buying them new instead. . . . It turns out, people have very inflated expectations for how much they can sell their used TV. Only 3 of the 26 televisions we analyzed were discounte

5 0.61502725 2219 andrew gelman stats-2014-02-21-The world’s most popular languages that the Mac documentation hasn’t been translated into

Introduction: I was updating my Mac and noticed the following: Lots of obscure European languages there. That got me wondering: what’s the least obscure language not on the above list? Igbo? Swahili? Or maybe Tagalog? I did a quick google and found this list of languages by number of native speakers. Once you see the list, the answer is obvious: Hindi, first language of 295 million people, is not on Apple’s list. The next most popular languages not included: Bengali, Punjabi, Javanese, Wu, Telugu, Marathi, Tamil, Urdu. Wow: most of these are Indian! Then comes Persian and a bunch of others. It turns out that Tagalog, Igbo, and Swahili are way down on this list with 28 million, 24 million, and 26 million native speakers, respectively. Only 26 million for Swahili? This made me want to check the list of languages by total number of speakers. The ranking of most of the languages isn’t much different, but Swahili is now #10, at 140 million. Hindi and Bengali are still th

6 0.60790771 1546 andrew gelman stats-2012-10-24-Hey—has anybody done this study yet?

7 0.60682291 1038 andrew gelman stats-2011-12-02-Donate Your Data to Science!

8 0.59691173 1845 andrew gelman stats-2013-05-07-Is Felix Salmon wrong on free TV?

9 0.58922768 1245 andrew gelman stats-2012-04-03-Redundancy and efficiency: In praise of Penn Station

10 0.58517617 1731 andrew gelman stats-2013-02-21-If a lottery is encouraging addictive gambling, don’t expand it!

11 0.58001906 1127 andrew gelman stats-2012-01-18-The Fixie Bike Index

12 0.5712958 1536 andrew gelman stats-2012-10-16-Using economics to reduce bike theft

13 0.56964302 513 andrew gelman stats-2011-01-12-“Tied for Warmest Year On Record”

14 0.55209225 894 andrew gelman stats-2011-09-07-Hipmunk FAIL: Graphics without content is not enough

15 0.55194587 1057 andrew gelman stats-2011-12-14-Hey—I didn’t know that!

16 0.54415107 737 andrew gelman stats-2011-05-30-Memorial Day question

17 0.54409963 489 andrew gelman stats-2010-12-28-Brow inflation

18 0.53483975 1147 andrew gelman stats-2012-01-30-Statistical Murder

19 0.53033894 2238 andrew gelman stats-2014-03-09-Hipmunk worked

20 0.5294838 1619 andrew gelman stats-2012-12-11-There are four ways to get fired from Caesars: (1) theft, (2) sexual harassment, (3) running an experiment without a control group, and (4) keeping a gambling addict away from the casino


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(9, 0.064), (16, 0.088), (24, 0.127), (53, 0.032), (86, 0.04), (92, 0.402), (99, 0.039)]
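The LDA weights above are listed as sparse (topicId, topicWeight) pairs. The sketch below shows one way to expand them into a dense vector and pick out the dominant topic. The total of 100 topics is an assumption based only on the largest topic ID seen on this page; the page does not state it.

```python
def to_dense(sparse_pairs, num_topics):
    """Expand (topic_id, weight) pairs into a dense list of length num_topics."""
    dense = [0.0] * num_topics
    for topic_id, weight in sparse_pairs:
        dense[topic_id] = weight
    return dense


# Topic weights for this post, copied from the list above.
lda_weights = [(9, 0.064), (16, 0.088), (24, 0.127), (53, 0.032),
               (86, 0.04), (92, 0.402), (99, 0.039)]

NUM_TOPICS = 100  # assumption: topic IDs run from 0 to 99

dense_vector = to_dense(lda_weights, NUM_TOPICS)
dominant_topic = max(lda_weights, key=lambda pair: pair[1])[0]
listed_mass = sum(weight for _, weight in lda_weights)

print("dominant topic:", dominant_topic)
print("total listed weight:", round(listed_mass, 3))
```

The listed weights sum to less than 1, which suggests that low-weight topics were dropped from the listing, though the page does not confirm this.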

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.89128613 491 andrew gelman stats-2010-12-29-Don’t try this at home

2 0.6278832 1166 andrew gelman stats-2012-02-13-Recently in the sister blog

Introduction: Lingsanity! What the sophisticates thought in September 2008 Political opinions of U.S. military The origin of essentialist reasoning

3 0.4900803 287 andrew gelman stats-2010-09-20-Paul Rosenbaum on those annoying pre-treatment variables that are sort-of instruments and sort-of covariates

Introduction: Last year we discussed an important challenge in causal inference: The standard advice (given in many books, including ours) for causal inference is to control for relevant pre-treatment variables as much as possible. But, as Judea Pearl has pointed out, instruments (as in “instrumental variables”) are pre-treatment variables that we would not want to “control for” in a matching or regression sense. At first, this seems like a minor modification, with the new recommendation being to apply instrumental variables estimation using all pre-treatment instruments, and to control for all other pre-treatment variables. But that can’t really work as general advice. What about weak instruments or covariates that have some instrumental aspects? I asked Paul Rosenbaum for his thoughts on the matter, and he wrote the following: In section 18.2 of Design of Observational Studies (DOS), I [Rosenbaum] discuss “seemingly innocuous confounding” defined to be a covariate that predicts a su

4 0.47621644 2024 andrew gelman stats-2013-09-15-Swiss Jonah Lehrer update

Introduction: Nassim Taleb adds this link to the Dobelli story. I’m confused. I thought Swiss dudes were supposed to plagiarize their own stuff, not rip off other people’s. Whassup with that?

5 0.45977312 1720 andrew gelman stats-2013-02-12-That claim that Harvard admissions discriminate in favor of Jews? After seeing the statistics, I don’t see it.

Introduction: A few months ago we discussed Ron Unz’s claim that Jews are massively overrepresented in Ivy League college admissions, not just in comparison to the general population of college-age Americans, but even in comparison to other white kids with comparable academic ability and preparation. Most of Unz’s article concerns admissions of Asian-Americans, and he also has a proposal to admit certain students at random (see my discussion in the link above). In the present post, I concentrate on the statistics about Jewish students, because this is where I have learned that his statistics are particularly suspect, with various numbers being off by factors of 2 or 4 or more. Unz’s article was discussed, largely favorably, by academic bloggers Tyler Cowen, Steve Hsu, and . . . me! Hsu writes: “Don’t miss the statistical supplement.” But a lot of our trust in those statistics seems to be misplaced. Some people have sent me some information showing serious problems with Unz’s methods

6 0.42674014 1563 andrew gelman stats-2012-11-05-Someone is wrong on the internet, part 2

7 0.40590209 1004 andrew gelman stats-2011-11-11-Kaiser Fung on how not to critique models

8 0.35562143 2073 andrew gelman stats-2013-10-22-Ivy Jew update

9 0.34849879 20 andrew gelman stats-2010-05-07-Bayesian hierarchical model for the prediction of soccer results

10 0.3475543 1697 andrew gelman stats-2013-01-29-Where 36% of all boys end up nowadays

11 0.34422445 1108 andrew gelman stats-2012-01-09-Blogging, polemical and otherwise

12 0.33815426 442 andrew gelman stats-2010-12-01-bayesglm in Stata?

13 0.33743361 1751 andrew gelman stats-2013-03-06-Janet Mertz’s response to “The Myth of American Meritocracy”

14 0.33562535 1785 andrew gelman stats-2013-04-02-So much artistic talent

15 0.33029297 2231 andrew gelman stats-2014-03-03-Running into a Stan Reference by Accident

16 0.32392284 1258 andrew gelman stats-2012-04-10-Why display 6 years instead of 30?

17 0.32361653 1008 andrew gelman stats-2011-11-13-Student project competition

18 0.31910408 1293 andrew gelman stats-2012-05-01-Huff the Magic Dragon

19 0.31687808 1046 andrew gelman stats-2011-12-07-Neutral noninformative and informative conjugate beta and gamma prior distributions

20 0.31620273 2225 andrew gelman stats-2014-02-26-A good comment on one of my papers