156 andrew gelman stats-2010-07-20-Burglars are local


meta infos for this blog

Source: html

Introduction: This makes sense: In the land of fiction, it’s the criminal’s modus operandi – his method of entry, his taste for certain jewellery and so forth – that can be used by detectives to identify his handiwork. The reality according to a new analysis of solved burglaries in the Northamptonshire region of England is that these aspects of criminal behaviour are on their own unreliable as identifying markers, most likely because they are dictated by circumstances rather than the criminal’s taste and style. However, the geographical spread and timing of a burglar’s crimes are distinctive, and could help with police investigations. And, as a bonus, more Tourette’s pride! P.S. On yet another unrelated topic from the same blog, I wonder if the researchers in this study are aware that the difference between “significant” and “not significant” is not itself statistically significant.


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 This makes sense: In the land of fiction, it’s the criminal’s modus operandi – his method of entry, his taste for certain jewellery and so forth – that can be used by detectives to identify his handiwork. [sent-1, score-1.121]

2 The reality according to a new analysis of solved burglaries in the Northamptonshire region of England is that these aspects of criminal behaviour are on their own unreliable as identifying markers, most likely because they are dictated by circumstances rather than the criminal’s taste and style. [sent-2, score-2.22]

3 However, the geographical spread and timing of a burglar’s crimes are distinctive, and could help with police investigations. [sent-3, score-0.794]

4 On yet another unrelated topic from the same blog, I wonder if the researchers in this study are aware that the difference between “significant” and “not significant” is not itself statistically significant. [sent-7, score-1.031]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('criminal', 0.448), ('taste', 0.26), ('significant', 0.236), ('detectives', 0.209), ('markers', 0.197), ('dictated', 0.182), ('behaviour', 0.172), ('distinctive', 0.168), ('unreliable', 0.164), ('geographical', 0.161), ('pride', 0.161), ('land', 0.161), ('timing', 0.158), ('fiction', 0.153), ('crimes', 0.149), ('bonus', 0.149), ('region', 0.141), ('england', 0.139), ('unrelated', 0.139), ('police', 0.138), ('solved', 0.134), ('identifying', 0.134), ('circumstances', 0.13), ('forth', 0.123), ('spread', 0.12), ('reality', 0.112), ('entry', 0.106), ('identify', 0.105), ('aspects', 0.102), ('aware', 0.101), ('statistically', 0.087), ('according', 0.086), ('certain', 0.084), ('yet', 0.078), ('wonder', 0.078), ('method', 0.076), ('difference', 0.072), ('however', 0.07), ('topic', 0.069), ('help', 0.068), ('likely', 0.067), ('researchers', 0.063), ('study', 0.058), ('makes', 0.053), ('used', 0.05), ('another', 0.05), ('sense', 0.05), ('blog', 0.046), ('analysis', 0.045), ('rather', 0.043)]
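
This page does not say how the simValue figures in the lists below are computed. A common choice for tf-idf vectors is cosine similarity, and the sketch below assumes that choice. The function name `cosine_similarity` and the weights in `other_post` are hypothetical and only illustrate the calculation; the weights in `this_post` are the top three entries from the list above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Top three tf-idf weights for this post, copied from the list above.
this_post = {"criminal": 0.448, "taste": 0.26, "significant": 0.236}
# Hypothetical weights for some other post (invented for illustration).
other_post = {"significant": 0.5, "difference": 0.3, "statistically": 0.2}

print(cosine_similarity(this_post, this_post))   # self-similarity
print(cosine_similarity(this_post, other_post))  # overlap only on "significant"
```

Under this assumption a post compared with itself scores 1, which would be consistent with the same-blog value of 0.99999988 below, up to floating-point rounding.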

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 156 andrew gelman stats-2010-07-20-Burglars are local

2 0.16716714 1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06

Introduction: The title of this post by Sanjay Srivastava illustrates an annoying misconception that’s crept into the (otherwise delightful) recent publicity related to my article with Hal Stern, The difference between “significant” and “not significant” is not itself statistically significant. When people bring this up, they keep referring to the difference between p=0.05 and p=0.06, making the familiar (and correct) point about the arbitrariness of the conventional p-value threshold of 0.05. And, sure, I agree with this, but everybody knows that already. The point Hal and I were making was that even apparently large differences in p-values are not statistically significant. For example, if you have one study with z=2.5 (almost significant at the 1% level!) and another with z=1 (not statistically significant at all, only 1 se from zero!), then their difference has a z of about 1 (again, not statistically significant at all). So it’s not just a comparison of 0.05 vs. 0.06, even a differenc

3 0.13271871 1942 andrew gelman stats-2013-07-17-“Stop and frisk” statistics

Introduction: Washington Post columnist Richard Cohen brings up one of my research topics: In New York City, blacks make up a quarter of the population, yet they represent 78 percent of all shooting suspects — almost all of them young men. We know them from the nightly news. Those statistics represent the justification for New York City’s controversial stop-and-frisk program, which amounts to racial profiling writ large. After all, if young black males are your shooters, then it ought to be young black males whom the police stop and frisk. I have two comments on this. First, my research with Jeff Fagan and Alex Kiss (based on data from the late 1990s, so maybe things have changed) found that the NYPD was stopping blacks and hispanics at a rate higher than their previous arrest rates: To briefly summarize our findings, blacks and Hispanics represented 51% and 33% of the stops while representing only 26% and 24% of the New York City population. Compared with the number of arrests of

4 0.1043745 899 andrew gelman stats-2011-09-10-The statistical significance filter

Introduction: I’ve talked about this a bit but it’s never had its own blog entry (until now). Statistically significant findings tend to overestimate the magnitude of effects. This holds in general (because E(|x|) > |E(x)|) but even more so if you restrict to statistically significant results. Here’s an example. Suppose a true effect of theta is unbiasedly estimated by y ~ N(theta, 1). Further suppose that we will only consider statistically significant results, that is, cases in which |y| > 2. The estimate “|y| conditional on |y|>2” is clearly an overestimate of |theta|. First off, if |theta|<2, the estimate |y| conditional on statistical significance is not only too high in expectation, it's always too high. This is a problem, given that |theta| in reality is probably less than 2. (The low-hanging fruit have already been picked, remember?) But even if |theta|>2, the estimate |y| conditional on statistical significance will still be too high in expectation. For a discussion o

5 0.094314627 106 andrew gelman stats-2010-06-23-Scientists can read your mind . . . as long as the’re allowed to look at more than one place in your brain and then make a prediction after seeing what you actually did

Introduction: Maggie Fox writes : Brain scans may be able to predict what you will do better than you can yourself . . . They found a way to interpret “real time” brain images to show whether people who viewed messages about using sunscreen would actually use sunscreen during the following week. The scans were more accurate than the volunteers were, Emily Falk and colleagues at the University of California Los Angeles reported in the Journal of Neuroscience. . . . About half the volunteers had correctly predicted whether they would use sunscreen. The research team analyzed and re-analyzed the MRI scans to see if they could find any brain activity that would do better. Activity in one area of the brain, a particular part of the medial prefrontal cortex, provided the best information. “From this region of the brain, we can predict for about three-quarters of the people whether they will increase their use of sunscreen beyond what they say they will do,” Lieberman said. “It is the one re

6 0.088572264 1251 andrew gelman stats-2012-04-07-Mathematical model of vote operations

7 0.080443881 920 andrew gelman stats-2011-09-22-Top 10 blog obsessions

8 0.079129942 1852 andrew gelman stats-2013-05-12-Crime novels for economists

9 0.075744055 1838 andrew gelman stats-2013-05-03-Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism

10 0.075578846 758 andrew gelman stats-2011-06-11-Hey, good news! Your p-value just passed the 0.05 threshold!

11 0.075368904 806 andrew gelman stats-2011-07-17-6 links

12 0.075180754 2026 andrew gelman stats-2013-09-16-He’s adult entertainer, Child educator, King of the crossfader, He’s the greatest of the greater, He’s a big bad wolf in your neighborhood, Not bad meaning bad but bad meaning good

13 0.074509174 897 andrew gelman stats-2011-09-09-The difference between significant and not significant…

14 0.066723675 57 andrew gelman stats-2010-05-29-Roth and Amsterdam

15 0.063349068 1417 andrew gelman stats-2012-07-15-Some decision analysis problems are pretty easy, no?

16 0.062802941 310 andrew gelman stats-2010-10-02-The winner’s curse

17 0.062720366 2042 andrew gelman stats-2013-09-28-Difficulties of using statistical significance (or lack thereof) to sift through and compare research hypotheses

18 0.058390308 1150 andrew gelman stats-2012-02-02-The inevitable problems with statistical significance and 95% intervals

19 0.056924921 2039 andrew gelman stats-2013-09-25-Harmonic convergence

20 0.056501176 401 andrew gelman stats-2010-11-08-Silly old chi-square!
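
The arithmetic quoted in entry 2 of the list above (one study at z=2.5, another at z=1, a difference with z of about 1) can be checked directly. This is a minimal sketch. It assumes the two estimates are independent and have the same standard error of 1, which the excerpt implies ("only 1 se from zero") but does not state outright. The function name `z_of_difference` is hypothetical.

```python
import math


def z_of_difference(est1, se1, est2, se2):
    """z-score of (est1 - est2), assuming the two estimates are independent."""
    diff = est1 - est2
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)  # variances add for independent estimates
    return diff / se_diff


# Estimates expressed in standard-error units, so each se is 1 (an assumption).
z = z_of_difference(2.5, 1.0, 1.0, 1.0)
print(round(z, 2))  # 1.06
```

The difference of 1.5 is divided by sqrt(2), about 1.41, so its z-score is well short of the conventional threshold of about 2, even though one study looks "significant" and the other does not.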


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.075), (1, -0.006), (2, 0.008), (3, -0.056), (4, -0.005), (5, -0.027), (6, -0.0), (7, 0.009), (8, -0.004), (9, 0.002), (10, -0.026), (11, -0.011), (12, 0.031), (13, -0.023), (14, 0.021), (15, 0.017), (16, 0.01), (17, -0.011), (18, 0.016), (19, -0.006), (20, 0.015), (21, 0.01), (22, -0.005), (23, -0.009), (24, 0.009), (25, 0.012), (26, 0.024), (27, -0.026), (28, 0.021), (29, -0.046), (30, 0.019), (31, 0.032), (32, 0.012), (33, 0.003), (34, 0.039), (35, 0.043), (36, -0.045), (37, -0.007), (38, -0.017), (39, 0.02), (40, -0.002), (41, -0.01), (42, -0.028), (43, 0.027), (44, 0.013), (45, -0.041), (46, -0.014), (47, -0.008), (48, 0.02), (49, -0.021)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97766423 156 andrew gelman stats-2010-07-20-Burglars are local

2 0.89961958 1072 andrew gelman stats-2011-12-19-“The difference between . . .”: It’s not just p=.05 vs. p=.06

Introduction: The title of this post by Sanjay Srivastava illustrates an annoying misconception that’s crept into the (otherwise delightful) recent publicity related to my article with Hal Stern, The difference between “significant” and “not significant” is not itself statistically significant. When people bring this up, they keep referring to the difference between p=0.05 and p=0.06, making the familiar (and correct) point about the arbitrariness of the conventional p-value threshold of 0.05. And, sure, I agree with this, but everybody knows that already. The point Hal and I were making was that even apparently large differences in p-values are not statistically significant. For example, if you have one study with z=2.5 (almost significant at the 1% level!) and another with z=1 (not statistically significant at all, only 1 se from zero!), then their difference has a z of about 1 (again, not statistically significant at all). So it’s not just a comparison of 0.05 vs. 0.06, even a differenc

3 0.80457431 1557 andrew gelman stats-2012-11-01-‘Researcher Degrees of Freedom’

Introduction: False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant [I]t is unacceptably easy to publish “statistically significant” evidence consistent with any hypothesis. The culprit is a construct we refer to as researcher degrees of freedom. In the course of collecting and analyzing data, researchers have many decisions to make: Should more data be collected? Should some observations be excluded? Which conditions should be combined and which ones compared? Which control variables should be considered? Should specific measures be combined or transformed or both? It is rare, and sometimes impractical, for researchers to make all these decisions beforehand. Rather, it is common (and accepted practice) for researchers to explore various analytic alternatives, to search for a combination that yields “statistical significance,” and to then report only what “worked.” The problem, of course, is that the likelihood of at leas

4 0.73277807 106 andrew gelman stats-2010-06-23-Scientists can read your mind . . . as long as the’re allowed to look at more than one place in your brain and then make a prediction after seeing what you actually did

Introduction: Maggie Fox writes : Brain scans may be able to predict what you will do better than you can yourself . . . They found a way to interpret “real time” brain images to show whether people who viewed messages about using sunscreen would actually use sunscreen during the following week. The scans were more accurate than the volunteers were, Emily Falk and colleagues at the University of California Los Angeles reported in the Journal of Neuroscience. . . . About half the volunteers had correctly predicted whether they would use sunscreen. The research team analyzed and re-analyzed the MRI scans to see if they could find any brain activity that would do better. Activity in one area of the brain, a particular part of the medial prefrontal cortex, provided the best information. “From this region of the brain, we can predict for about three-quarters of the people whether they will increase their use of sunscreen beyond what they say they will do,” Lieberman said. “It is the one re

5 0.73013812 899 andrew gelman stats-2011-09-10-The statistical significance filter

Introduction: I’ve talked about this a bit but it’s never had its own blog entry (until now). Statistically significant findings tend to overestimate the magnitude of effects. This holds in general (because E(|x|) > |E(x)|) but even more so if you restrict to statistically significant results. Here’s an example. Suppose a true effect of theta is unbiasedly estimated by y ~ N(theta, 1). Further suppose that we will only consider statistically significant results, that is, cases in which |y| > 2. The estimate “|y| conditional on |y|>2” is clearly an overestimate of |theta|. First off, if |theta|<2, the estimate |y| conditional on statistical significance is not only too high in expectation, it's always too high. This is a problem, given that |theta| in reality is probably less than 2. (The low-hanging fruit have already been picked, remember?) But even if |theta|>2, the estimate |y| conditional on statistical significance will still be too high in expectation. For a discussion o

6 0.72602433 1971 andrew gelman stats-2013-08-07-I doubt they cheated

7 0.7098881 1171 andrew gelman stats-2012-02-16-“False-positive psychology”

8 0.70986497 1944 andrew gelman stats-2013-07-18-You’ll get a high Type S error rate if you use classical statistical methods to analyze data from underpowered studies

9 0.68009752 310 andrew gelman stats-2010-10-02-The winner’s curse

10 0.67532378 2159 andrew gelman stats-2014-01-04-“Dogs are sensitive to small variations of the Earth’s magnetic field”

11 0.67439461 593 andrew gelman stats-2011-02-27-Heat map

12 0.67381835 1776 andrew gelman stats-2013-03-25-The harm done by tests of significance

13 0.66684508 2049 andrew gelman stats-2013-10-03-On house arrest for p-hacking

14 0.66655421 1893 andrew gelman stats-2013-06-11-Folic acid and autism

15 0.66577005 2030 andrew gelman stats-2013-09-19-Is coffee a killer? I don’t think the effect is as high as was estimated from the highest number that came out of a noisy study

16 0.65539867 146 andrew gelman stats-2010-07-14-The statistics and the science

17 0.65513486 933 andrew gelman stats-2011-09-30-More bad news: The (mis)reporting of statistical results in psychology journals

18 0.65448922 1838 andrew gelman stats-2013-05-03-Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism

19 0.65011615 2156 andrew gelman stats-2014-01-01-“Though They May Be Unaware, Newlyweds Implicitly Know Whether Their Marriage Will Be Satisfying”

20 0.64672619 410 andrew gelman stats-2010-11-12-“The Wald method has been the subject of extensive criticism by statisticians for exaggerating results”
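
The claim in entry 5 of the list above (the statistical significance filter) can be illustrated with a small simulation. This is a sketch under the setup given in that excerpt: y ~ N(theta, 1), keeping only draws with |y| > 2. The particular theta values, the number of draws, the seed, and the function name `mean_abs_given_significant` are assumptions chosen for illustration.

```python
import random


def mean_abs_given_significant(theta, n_draws=200_000, threshold=2.0, seed=0):
    """Monte Carlo estimate of E(|y| given |y| > threshold) for y ~ N(theta, 1)."""
    rng = random.Random(seed)
    draws = (rng.gauss(theta, 1.0) for _ in range(n_draws))
    kept = [abs(y) for y in draws if abs(y) > threshold]  # the "significant" results
    return sum(kept) / len(kept)


# Compare the true effect with the average of the estimates that pass the filter.
for theta in (0.5, 1.0, 3.0):
    print(theta, round(mean_abs_given_significant(theta), 2))
```

For theta below 2, every retained |y| exceeds 2 by construction, so the conditional estimate is always too high. For theta = 3 the overestimate is smaller but still present, because the filter discards the low draws and keeps the high ones.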


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(9, 0.038), (16, 0.013), (24, 0.13), (29, 0.022), (45, 0.024), (62, 0.218), (77, 0.07), (86, 0.072), (95, 0.035), (98, 0.022), (99, 0.232)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95903289 156 andrew gelman stats-2010-07-20-Burglars are local

2 0.87932789 668 andrew gelman stats-2011-04-19-The free cup and the extra dollar: A speculation in philosophy

Introduction: The following is an essay into a topic I know next to nothing about. As part of our endless discussion of Dilbert and Charlie Sheen, commenter Fraac linked to a blog by philosopher Edouard Machery, who tells a fascinating story : How do we think about the intentional nature of actions? And how do people with an impaired mindreading capacity think about it? Consider the following probes: The Free-Cup Case Joe was feeling quite dehydrated, so he stopped by the local smoothie shop to buy the largest sized drink available. Before ordering, the cashier told him that if he bought a Mega-Sized Smoothie he would get it in a special commemorative cup. Joe replied, ‘I don’t care about a commemorative cup, I just want the biggest smoothie you have.’ Sure enough, Joe received the Mega-Sized Smoothie in a commemorative cup. Did Joe intentionally obtain the commemorative cup? The Extra-Dollar Case Joe was feeling quite dehydrated, so he stopped by the local smoothie shop to buy

3 0.85191864 715 andrew gelman stats-2011-05-16-“It doesn’t matter if you believe in God. What matters is if God believes in you.”

Introduction: Mark Chaves sent me this great article on religion and religious practice: After reading a book or article in the scientific study of religion, I [Chaves] wonder if you ever find yourself thinking, “I just don’t believe it.” I have this experience uncomfortably often, and I think it’s because of a pervasive problem in the scientific study of religion. I want to describe that problem and how to overcome it. The problem is illustrated in a story told by Meyer Fortes. He once asked a rainmaker in a native culture he was studying to perform the rainmaking ceremony for him. The rainmaker refused, replying: “Don’t be a fool, whoever makes a rain-making ceremony in the dry season?” The problem is illustrated in a different way in a story told by Jay Demerath. He was in Israel, visiting friends for a Sabbath dinner. The man of the house, a conservative rabbi, stopped in the middle of chanting the prayers to say cheerfully: “You know, we don’t believe in any of this. But then in Judai

4 0.8363905 107 andrew gelman stats-2010-06-24-PPS in Georgia

Introduction: Lucy Flynn writes: I’m working at a non-profit organization called CRRC in the Republic of Georgia. I’m having a methodological problem and I saw the syllabus for your sampling class online and thought I might be able to ask you about it? We do a lot of complex surveys nationwide; our typical sample design is as follows: - stratify by rural/urban/capital - sub-stratify the rural and urban strata into NE/NW/SE/SW geographic quadrants - select voting precincts as PSUs - select households as SSUs - select individual respondents as TSUs I’m relatively new here, and past practice has been to sample voting precincts with probability proportional to size. It’s desirable because it’s not logistically feasible for us to vary the number of interviews per precinct with precinct size, so it makes the selection probabilities for households more even across precinct sizes. However, I have a complex sampling textbook (Lohr 1999), and it explains how complex it is to calculate sel

5 0.82307488 260 andrew gelman stats-2010-09-07-QB2

Introduction: Dave Berri writes: Saw you had a post on the research I did with Rob Simmons on the NFL draft. I have attached the article. This article has not officially been published, so please don’t post this on-line. The post you linked to states the following: “On his blog, Berri says he restricts the analysis to QBs who have played more than 500 downs, or for 5 years. He also looks at per-play statistics, like touchdowns per game, to counter what he considers an opportunity bias.” Two points: First of all, we did not look at touchdowns per game (that is not a per play stat). More importantly — as this post indicates — we did far more than just look at data after five years. We did mention the five year result, but directly below that discussion (and I mean, directly below), the following sentences appear. Our data set runs from 1970 to 2007 (adjustments were made for how performance changed over time). We also looked at career performance after 2, 3, 4, 6, 7, and 8 years

6 0.81793678 1414 andrew gelman stats-2012-07-12-Steven Pinker’s unconvincing debunking of group selection

7 0.81277919 2082 andrew gelman stats-2013-10-30-Berri Gladwell Loken football update

8 0.77794075 1082 andrew gelman stats-2011-12-25-Further evidence of a longstanding principle of statistics

9 0.77749157 986 andrew gelman stats-2011-11-01-MacKay update: where 12 comes from

10 0.77591109 704 andrew gelman stats-2011-05-10-Multiple imputation and multilevel analysis

11 0.77327287 1904 andrew gelman stats-2013-06-18-Job opening! Come work with us!

12 0.77286708 1220 andrew gelman stats-2012-03-19-Sorry, no ARM solutions

13 0.76576948 2021 andrew gelman stats-2013-09-13-Swiss Jonah Lehrer

14 0.76290715 530 andrew gelman stats-2011-01-22-MS-Bayes?

15 0.75951958 1746 andrew gelman stats-2013-03-02-Fishing for cherries

16 0.75932944 1944 andrew gelman stats-2013-07-18-You’ll get a high Type S error rate if you use classical statistical methods to analyze data from underpowered studies

17 0.75926173 1438 andrew gelman stats-2012-07-31-What is a Bayesian?

18 0.7588532 2145 andrew gelman stats-2013-12-24-Estimating and summarizing inference for hierarchical variance parameters when the number of groups is small

19 0.75827849 1124 andrew gelman stats-2012-01-17-How to map geographically-detailed survey responses?

20 0.75775695 562 andrew gelman stats-2011-02-06-Statistician cracks Toronto lottery