andrew_gelman_stats andrew_gelman_stats-2011 andrew_gelman_stats-2011-533 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Mark Palko writes: You lose information when you go from a vector to a scalar. But what about this trick, which they told me about in high school? Combine two dimensions into one by interleaving the decimals. For example, if a=.11111 and b=.22222, then (a,b) = .1212121212.
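The interleaving trick described above can be sketched in a few lines of Python. This is only an illustration, not anything from the original post: the function names `interleave` and `deinterleave` are hypothetical, and the sketch assumes both numbers lie in [0, 1) and are given as finite strings of decimal digits (the part after the decimal point), with the shorter one padded with trailing zeros.

```python
def interleave(a_digits: str, b_digits: str) -> str:
    """Merge two digit strings by alternating their digits: a1 b1 a2 b2 ...

    Assumption: each argument holds the digits after the decimal point
    of a number in [0, 1). The shorter string is padded with zeros.
    """
    width = max(len(a_digits), len(b_digits))
    a_digits = a_digits.ljust(width, "0")
    b_digits = b_digits.ljust(width, "0")
    return "".join(x + y for x, y in zip(a_digits, b_digits))


def deinterleave(digits: str) -> tuple:
    """Recover the two digit strings: even positions give a, odd give b."""
    return digits[0::2], digits[1::2]


combined = interleave("11111", "22222")
print("." + combined)          # .1212121212
print(deinterleave(combined))  # ('11111', '22222')
```

Working on digit strings rather than floats avoids binary rounding. For finite digit strings the pair can be recovered exactly, so no information is lost in this sense; the catch is that the resulting map is not continuous, and numbers with two decimal expansions (such as .1000... and .0999...) need a convention.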
sentIndex sentText sentNum sentScore
1 Mark Palko writes: You lose information when you go from a vector to a scalar. [sent-1, score-1.073]
2 But what about this trick, which they told me about in high school? [sent-2, score-0.401]
3 Combine two dimensions into one by interleaving the decimals. [sent-3, score-0.51]
wordName wordTfidf (topN-words)
[('vector', 0.394), ('combine', 0.375), ('trick', 0.345), ('dimensions', 0.345), ('palko', 0.326), ('lose', 0.313), ('mark', 0.244), ('told', 0.232), ('school', 0.22), ('high', 0.169), ('information', 0.151), ('go', 0.129), ('two', 0.108), ('example', 0.095), ('writes', 0.086), ('one', 0.057)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 533 andrew gelman stats-2011-01-23-The scalarization of America
2 0.29064482 1318 andrew gelman stats-2012-05-13-Stolen jokes
Introduction: Fun stories here (from Kliph Nesteroff, link from Mark Palko).
Introduction: Mark Palko waxes indignant about corporate postmodernism.
Introduction: Greg Campbell writes: I am a Canadian archaeologist (BSc in Chemistry) researching the past human use of European Atlantic shellfish. After two decades of practice I am finally getting a MA in archaeology at Reading. I am seeing if the habitat or size of harvested mussels (Mytilus edulis) can be reconstructed from measurements of the umbo (the pointy end, and the only bit that survives well in archaeological deposits) using log-transformed measurements (or allometry; relationships between dimensions are more likely exponential than linear). Of course multivariate regressions in most statistics packages (Minitab, SPSS, SAS) assume you are trying to predict one variable from all the others (a Model I regression), and use ordinary least squares to fit the regression line. For organismal dimensions this makes little sense, since all the dimensions are (at least in theory) free to change their mutual proportions during growth. So there is no predictor and predicted, mutual variation of
5 0.13429764 943 andrew gelman stats-2011-10-04-Flip it around
Introduction: Mark Palko discusses a radio interview on the effect of parents on children’s education. In short, the interviewer (Stephen Dubner of Freakonomics fame) claims that the research shows that parents don’t have much influence on whether their children go to college. The evidence is based on a comparison of adopted and non-adopted children. Palko makes a convincing case that the statistical analysis (by economist Bruce Sacerdote) doesn’t show what Dubner says it shows. I looked over the linked transcript, and overall I’m less unhappy than Palko is about the interview. I agree that some of the causal implications are sloppy, and I think it’s a bit silly for the interviewer (Kai Ryssdal) to use celebrities as a benchmark. (Ryssdal says, “if [a certain parenting style is] good enough for Steven Levitt, it’s good enough for me.” But Levitt is a multimillionaire—he’ll always have a huge financial cushion. It’s not clear that what works for him would work for others who are not so wel
6 0.1290431 1224 andrew gelman stats-2012-03-21-Teaching velocity and acceleration
7 0.12491895 992 andrew gelman stats-2011-11-05-Deadwood in the math curriculum
8 0.11602065 554 andrew gelman stats-2011-02-04-An addition to the model-makers’ oath
9 0.11342151 1683 andrew gelman stats-2013-01-19-“Confirmation, on the other hand, is not sexy”
10 0.10887952 2365 andrew gelman stats-2014-06-09-I hate polynomials
11 0.10848128 1767 andrew gelman stats-2013-03-17-The disappearing or non-disappearing middle class
12 0.1049949 1646 andrew gelman stats-2013-01-01-Back when fifty years was a long time ago
13 0.10406066 842 andrew gelman stats-2011-08-07-Hey, I’m just like Picasso (but without all the babes)!
14 0.10108919 529 andrew gelman stats-2011-01-21-“City Opens Inquiry on Grading Practices at a Top-Scoring Bronx School”
15 0.09596011 279 andrew gelman stats-2010-09-15-Electability and perception of electability
16 0.094124213 99 andrew gelman stats-2010-06-19-Paired comparisons
17 0.090077773 93 andrew gelman stats-2010-06-17-My proposal for making college admissions fairer
18 0.088105179 1165 andrew gelman stats-2012-02-13-Philosophy of Bayesian statistics: my reactions to Wasserman
20 0.087273672 1674 andrew gelman stats-2013-01-15-Prior Selection for Vector Autoregressions
topicId topicWeight
[(0, 0.072), (1, -0.009), (2, 0.012), (3, 0.01), (4, 0.033), (5, 0.038), (6, 0.046), (7, 0.055), (8, -0.022), (9, 0.01), (10, -0.021), (11, 0.031), (12, -0.034), (13, -0.034), (14, -0.014), (15, -0.013), (16, 0.023), (17, 0.049), (18, -0.001), (19, -0.0), (20, -0.012), (21, -0.021), (22, -0.0), (23, -0.027), (24, 0.039), (25, -0.02), (26, 0.026), (27, 0.041), (28, -0.011), (29, 0.022), (30, 0.006), (31, 0.034), (32, 0.054), (33, 0.004), (34, 0.039), (35, 0.036), (36, -0.039), (37, 0.08), (38, 0.057), (39, 0.087), (40, -0.018), (41, 0.062), (42, 0.037), (43, -0.022), (44, -0.034), (45, 0.046), (46, -0.061), (47, 0.032), (48, -0.055), (49, -0.058)]
simIndex simValue blogId blogTitle
same-blog 1 0.97407079 533 andrew gelman stats-2011-01-23-The scalarization of America
Introduction: Mark Palko waxes indignant about corporate postmodernism.
3 0.73084158 842 andrew gelman stats-2011-08-07-Hey, I’m just like Picasso (but without all the babes)!
Introduction: So says Mark Liberman.
4 0.68176293 2365 andrew gelman stats-2014-06-09-I hate polynomials
Introduction: A recent discussion with Mark Palko [scroll down to the comments at this link] reminds me that I think that polynomials are way way overrated, and I think a lot of damage has arisen from the old-time approach of introducing polynomial functions as a canonical example of linear regressions (for example). There are very few settings I can think of where it makes sense to fit a general polynomial of degree higher than 2. I think that millions of students have been brainwashed into thinking of these as the canonical functions and that this has caused endless trouble later on. I’m not sure how I’d change the high school math curriculum to deal with this, but I do think it’s an issue.
5 0.6786809 529 andrew gelman stats-2011-01-21-“City Opens Inquiry on Grading Practices at a Top-Scoring Bronx School”
Introduction: Sharon Otterman reports : When report card grades were released in the fall for the city’s 455 high schools, the highest score went to a small school in a down-and-out section of the Bronx . . . A stunning 94 percent of its seniors graduated, more than 30 points above the citywide average. . . . “When I interviewed for the school,” said Sam Buchbinder, a history teacher, “it was made very clear: this is a school that doesn’t believe in anyone failing.” That statement was not just an exhortation to excellence. It was school policy. By order of the principal, codified in the school’s teacher handbook, all teachers should grade their classes in the same way: 30 percent of students should earn a grade in the A range, 40 percent B’s, 25 percent C’s, and no more than 5 percent D’s. As long as they show up, they should not fail. Hey, that sounds like Harvard and Columbia^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H various selective northeastern colleges I’ve known. Of course, we^H^H
6 0.67078173 542 andrew gelman stats-2011-01-28-Homework and treatment levels
7 0.67052358 1318 andrew gelman stats-2012-05-13-Stolen jokes
8 0.62550873 992 andrew gelman stats-2011-11-05-Deadwood in the math curriculum
9 0.61917657 2122 andrew gelman stats-2013-12-03-Objects of the class “Lawrence Summers”: Arne Duncan edition
10 0.61209363 874 andrew gelman stats-2011-08-27-What’s “the definition of a professional career”?
11 0.59898877 1265 andrew gelman stats-2012-04-15-Progress in U.S. education; also, a discussion of what it takes to hit the op-ed pages
12 0.59664977 606 andrew gelman stats-2011-03-10-It’s no fun being graded on a curve
13 0.59428465 1803 andrew gelman stats-2013-04-14-Why girls do better in school
14 0.57847798 2202 andrew gelman stats-2014-02-07-Outrage of the week
15 0.56154716 344 andrew gelman stats-2010-10-15-Story time
16 0.54242861 93 andrew gelman stats-2010-06-17-My proposal for making college admissions fairer
17 0.5363006 1088 andrew gelman stats-2011-12-28-Argument in favor of Ddulites
18 0.53162605 261 andrew gelman stats-2010-09-07-The $900 kindergarten teacher
19 0.50405914 1620 andrew gelman stats-2012-12-12-“Teaching effectiveness” as another dimension in cognitive ability
20 0.49531782 718 andrew gelman stats-2011-05-18-Should kids be able to bring their own lunches to school?
topicId topicWeight
[(9, 0.115), (24, 0.296), (85, 0.188), (99, 0.191)]
simIndex simValue blogId blogTitle
same-blog 1 0.95001864 533 andrew gelman stats-2011-01-23-The scalarization of America
2 0.91257977 1534 andrew gelman stats-2012-10-15-The strange reappearance of Matthew Klam
Introduction: A few years ago I asked what happened to Matthew Klam, a talented writer who has a bizarrely professional-looking webpage but didn’t seem to be writing anymore. Good news! He published a new story in the New Yorker! Confusingly, he wrote it under the name “Justin Taylor,” but I’m not fooled (any more than I was fooled when that posthumous Updike story was published under the name “Antonya Nelson”). I’m glad to see that Klam is back in action and look forward to seeing some stories under his own name as well.
3 0.85022694 938 andrew gelman stats-2011-10-03-Comparing prediction errors
Introduction: Someone named James writes: I’m working on a classification task, sentence segmentation. The classifier algorithm we use (BoosTexter, a boosted learning algorithm) classifies each word independently conditional on its features, i.e. a bag-of-words model, so any contextual clues need to be encoded into the features. The feature extraction system I am proposing in my thesis uses a heteroscedastic LDA to transform data to produce the features the classifier runs on. The HLDA system has a couple parameters I’m testing, and I’m running a 3×2 full factorial experiment. That’s the background which may or may not be relevant to the question. The output of each trial is a class (there are only 2 classes, right now) for every word in the dataset. Because of the nature of the task, one class strongly predominates, say 90-95% of the data. My question is this: in terms of overall performance (we use F1 score), many of these trials are pretty close together, which leads me to ask whethe
4 0.84610939 278 andrew gelman stats-2010-09-15-Advice that might make sense for individuals but is negative-sum overall
Introduction: There’s a lot of free advice out there. As I wrote a couple years ago, it’s usually presented as advice to individuals, but it’s also interesting to consider the possible total effects if the advice is taken. For example, Nassim Taleb has a webpage that includes a bunch of one-line bits of advice (scroll to item 132 on the linked page). Here’s his final piece of advice: If you dislike someone, leave him alone or eliminate him; don’t attack him verbally. I’m a big Taleb fan (search this blog to see), but this seems like classic negative-sum advice. I can see how it can be a good individual strategy to keep your mouth shut, bide your time, and then sandbag your enemies. But it can’t be good if lots of people are doing this. Verbal attacks are great, as long as there’s a chance to respond. I’ve been in environments where people follow Taleb’s advice, saying nothing and occasionally trying to “eliminate” people, and it’s not pretty. I much prefer for people to be open
5 0.8455317 241 andrew gelman stats-2010-08-29-Ethics and statistics in development research
Introduction: From Banerjee and Duflo, “The Experimental Approach to Development Economics,” Annual Review of Economics (2009): One issue with the explicit acknowledgment of randomization as a fair way to allocate the program is that implementers may find that the easiest way to present it to the community is to say that an expansion of the program is planned for the control areas in the future (especially when such is indeed the case, as in phased-in design). I can’t quite figure out whether Banerjee and Duflo are saying that they would lie and tell people that an expansion is planned when it isn’t, or whether they’re deploring that other people do it. I’m not bothered by a lot of the deception in experimental research–for example, I think the Milgram obedience experiment was just fine–but somehow the above deception bothers me. It just seems wrong to tell people that an expansion is planned if it’s not. P.S. Overall the article is pretty good. My only real problem with it is that
6 0.84546459 38 andrew gelman stats-2010-05-18-Breastfeeding, infant hyperbilirubinemia, statistical graphics, and modern medicine
8 0.84453529 1092 andrew gelman stats-2011-12-29-More by Berger and me on weakly informative priors
9 0.84222513 1479 andrew gelman stats-2012-09-01-Mothers and Moms
10 0.84131384 482 andrew gelman stats-2010-12-23-Capitalism as a form of voluntarism
11 0.84095722 743 andrew gelman stats-2011-06-03-An argument that can’t possibly make sense
13 0.84041029 1978 andrew gelman stats-2013-08-12-Fixing the race, ethnicity, and national origin questions on the U.S. Census
14 0.83933836 643 andrew gelman stats-2011-04-02-So-called Bayesian hypothesis testing is just as bad as regular hypothesis testing
15 0.83650583 545 andrew gelman stats-2011-01-30-New innovations in spam
16 0.83569825 610 andrew gelman stats-2011-03-13-Secret weapon with rare events
17 0.83521938 1368 andrew gelman stats-2012-06-06-Question 27 of my final exam for Design and Analysis of Sample Surveys
18 0.83439809 1376 andrew gelman stats-2012-06-12-Simple graph WIN: the example of birthday frequencies
19 0.83371395 843 andrew gelman stats-2011-08-07-Non-rant
20 0.83289367 1224 andrew gelman stats-2012-03-21-Teaching velocity and acceleration