150 andrew gelman stats-2010-07-16-Gaydar update: Additional research on estimating small fractions of the population


meta info for this blog

Source: html

Introduction: Gary Gates writes the following in response to the discussion of my recent blog on the difficulty of using “gaydar” to estimate the frequencies of gays in a population: First, here’s a better (I think, anyway) method than using AIDS deaths from the NY Times (yikes!) to estimate the % of the military that is gay or lesbian. Gates estimates 2.2%, with, unsurprisingly, a higher rate among women than men. He continues: Here’s a tale of the false positive problem affecting who gets counted as same-sex couples in the Census and attached is a working paper that updates those analyses (with better methods, I think) using ACS data. In this paper, Gates (along with Dan Black, Seth Sanders, and Lowell Taylor) finds: Our work indicates that over 40 percent of same-sex “unmarried partner” couples in the 2000 U.S. Decennial Census are likely misclassified different-sex couples. 40% misclassification. Wow.
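
To see why a rare category is so vulnerable to false positives, here is a minimal sketch of the arithmetic. The counts and the miscoding rate are illustrative assumptions, not figures from the Gates, Black, Sanders, and Taylor paper, and the function name `misclassified_share` is made up for this example.

```python
def misclassified_share(n_different_sex, n_same_sex, miscoding_rate):
    """Share of recorded same-sex couples that are really different-sex couples.

    Assumes (hypothetically) that a fraction `miscoding_rate` of different-sex
    couples is recorded as same-sex and that no same-sex couple is miscoded.
    """
    false_positives = n_different_sex * miscoding_rate
    recorded_same_sex = n_same_sex + false_positives
    return false_positives / recorded_same_sex


# Illustrative numbers only: 1,000,000 different-sex couples, 6,000 same-sex
# couples, and a 0.4% miscoding rate among the different-sex couples.
share = misclassified_share(1_000_000, 6_000, 0.004)
print(f"{share:.0%} of recorded same-sex couples are misclassified")
```

With these made-up inputs, an error rate of under half a percent in the large group is enough for about 40% of the small group's recorded members to be false positives.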


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Gary Gates writes the following in response to the discussion of my recent blog on the difficulty of using “gaydar” to estimate the frequencies of gays in a population: First, here’s a better (I think, anyway) method than using AIDS deaths from the NY Times (yikes! [sent-1, score-1.063]

2 ) to estimate the % of the military that is gay or lesbian. [sent-2, score-0.335]

3 2%, with, unsurprisingly, a higher rate among women than men. [sent-4, score-0.301]

4 He continues: Here’s a tale of the false positive problem affecting who gets counted as same-sex couples in the Census and attached is a working paper that updates those analyses (with better methods, I think) using ACS data. [sent-5, score-1.568]

5 In this paper, Gates (along with Dan Black, Seth Sanders, and Lowell Taylor) finds: Our work indicates that over 40 percent of same-sex “unmarried partner” couples in the 2000 U. [sent-6, score-0.485]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('gates', 0.411), ('couples', 0.288), ('census', 0.226), ('acs', 0.186), ('decennial', 0.186), ('gaydar', 0.176), ('unmarried', 0.168), ('sanders', 0.162), ('aids', 0.157), ('tale', 0.157), ('unsurprisingly', 0.153), ('partner', 0.15), ('frequencies', 0.147), ('affecting', 0.147), ('gays', 0.144), ('counted', 0.141), ('taylor', 0.139), ('updates', 0.139), ('ny', 0.135), ('seth', 0.13), ('deaths', 0.123), ('wow', 0.121), ('finds', 0.119), ('using', 0.116), ('estimate', 0.116), ('indicates', 0.112), ('gay', 0.111), ('gary', 0.11), ('attached', 0.109), ('military', 0.108), ('black', 0.1), ('dan', 0.099), ('continues', 0.091), ('women', 0.091), ('difficulty', 0.089), ('false', 0.086), ('percent', 0.085), ('better', 0.081), ('analyses', 0.081), ('paper', 0.078), ('anyway', 0.076), ('rate', 0.073), ('positive', 0.073), ('higher', 0.073), ('gets', 0.072), ('population', 0.07), ('method', 0.068), ('estimates', 0.064), ('among', 0.064), ('response', 0.063)]
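
The page does not say how the simValue scores in the list that follows are computed. A common choice for comparing tf-idf vectors is cosine similarity, so the sketch below assumes that. Only the three weights in `this_post` come from the word list above; the weights in `other_post` are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {word: weight}."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Top three tf-idf weights reported above for this post (a truncated vector).
this_post = {"gates": 0.411, "couples": 0.288, "census": 0.226}
# Hypothetical weights for another post, invented for this example.
other_post = {"census": 0.5, "population": 0.3}

print(round(cosine_similarity(this_post, this_post), 6))
print(round(cosine_similarity(this_post, other_post), 3))
```

Under this assumption a post compared with itself scores 1 up to floating-point rounding, which would be consistent with the value shown for the same-blog entry.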

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 150 andrew gelman stats-2010-07-16-Gaydar update: Additional research on estimating small fractions of the population

2 0.19119811 1320 andrew gelman stats-2012-05-14-Question 4 of my final exam for Design and Analysis of Sample Surveys

Introduction: 4. Researchers have found that survey respondents overreport church attendance. Thus, naive estimates from surveys overstate the percentage of Americans who attend church regularly. Does this have a large impact on estimates of time trends in religious attendance? Solution to question 3 From yesterday: 3. We discussed in class the best currently available method for estimating the proportion of military servicemembers who are gay. What is that method? (Recall the problems with the direct approach: there is no simple way to survey servicemembers at random, nor is it likely that they would answer such a question honestly.) Solution: I was talking about the work of Gary Gates, combining an estimate of the percentage of gays in the population with an estimate of the probability that someone is in the military, given that he or she is gay.

3 0.12340862 944 andrew gelman stats-2011-10-05-How accurate is your gaydar?

Introduction: Sanjay Srivastava reports: In a typical study, half of the targets are gay/lesbian and half are straight, so a purely random guesser (i.e., someone with no gaydar) would be around 50%. The reported accuracy rates in the articles . . . say that people guess correctly about 65% of the time. . . . Let’s assume that the 65% accuracy rate is symmetric — that guessers are just as good at correctly identifying gays/lesbians as they are in identifying straight people. Let’s also assume that 5% of people are actually gay/lesbian. From those numbers, a quick calculation tells us that for a randomly-selected member of the population, if your gaydar says “GAY” there is a 9% chance that you are right. Eerily accurate? Not so much. If you rely too much on your gaydar, you are going to make a lot of dumb mistakes. . . . It’s the classic problem of combining direct evidence with base rates.

4 0.12135904 2313 andrew gelman stats-2014-04-30-Seth Roberts

Introduction: I met Seth back in the early 1990s when we were both professors at the University of California. He sometimes came to the statistics department seminar and we got to talking about various things; in particular we shared an interest in statistical graphics. Much of my work in this direction eventually went toward the use of graphical displays to understand fitted models. Seth went in another direction and got interested in the role of exploratory data analysis in science, the idea that we could use graphs not just to test or even understand a model but also as the source of new hypotheses. We continued to discuss these issues over the years; see here, for example. At some point when we were at Berkeley the administration was encouraging the faculty to teach freshman seminars, and I had the idea of teaching a course on left-handedness. I’d just read the book by Stanley Coren and thought it would be fun to go through it with a class, chapter by chapter. But my knowledge of psych

5 0.11528709 730 andrew gelman stats-2011-05-25-Rechecking the census

Introduction: Sam Roberts writes: The Census Bureau [reported] that though New York City’s population reached a record high of 8,175,133 in 2010, the gain of 2 percent, or 166,855 people, since 2000 fell about 200,000 short of what the bureau itself had estimated. Public officials were incredulous that a city that lures tens of thousands of immigrants each year and where a forest of new buildings has sprouted could really have recorded such a puny increase. How, they wondered, could Queens have grown by only one-tenth of 1 percent since 2000? How, even with a surge in foreclosures, could the number of vacant apartments have soared by nearly 60 percent in Queens and by 66 percent in Brooklyn? That does seem a bit suspicious. So the newspaper did its own survey: Now, a house-to-house New York Times survey of three representative square blocks where the Census Bureau said vacancies had increased and the population had declined since 2000 suggests that the city’s outrage is somewhat ju

6 0.10408081 1404 andrew gelman stats-2012-07-03-Counting gays

7 0.10180934 2154 andrew gelman stats-2013-12-30-Bill Gates’s favorite graph of the year

8 0.10151067 405 andrew gelman stats-2010-11-10-Estimation from an out-of-date census

9 0.099697091 105 andrew gelman stats-2010-06-23-More on those divorce prediction statistics, including a discussion of the innumeracy of (some) mathematicians

10 0.097530469 455 andrew gelman stats-2010-12-07-Some ideas on communicating risks to the general public

11 0.094035015 142 andrew gelman stats-2010-07-12-God, Guns, and Gaydar: The Laws of Probability Push You to Overestimate Small Groups

12 0.087660514 1295 andrew gelman stats-2012-05-02-Selection bias, or, How you can think the experts don’t check their models, if you simply don’t look at what the experts actually are doing

13 0.080053851 688 andrew gelman stats-2011-04-30-Why it’s so relaxing to think about social issues

14 0.078150965 1978 andrew gelman stats-2013-08-12-Fixing the race, ethnicity, and national origin questions on the U.S. Census

15 0.077317633 446 andrew gelman stats-2010-12-03-Is 0.05 too strict as a p-value threshold?

16 0.072674505 370 andrew gelman stats-2010-10-25-Who gets wedding announcements in the Times?

17 0.071660191 381 andrew gelman stats-2010-10-30-Sorry, Senator DeMint: Most Americans Don’t Want to Ban Gays from the Classroom

18 0.070455767 853 andrew gelman stats-2011-08-14-Preferential admissions for children of elite colleges

19 0.067880079 69 andrew gelman stats-2010-06-04-A Wikipedia whitewash

20 0.066812813 1654 andrew gelman stats-2013-01-04-“Don’t think of it as duplication. Think of it as a single paper in a superposition of two quantum journals.”
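
The base-rate calculation quoted in entry 3 of the list above ("How accurate is your gaydar?") can be checked in a few lines. This is a minimal sketch using the two numbers given in that excerpt (65% symmetric accuracy, 5% base rate); the function name `prob_correct_given_flagged` is made up for this example.

```python
def prob_correct_given_flagged(accuracy, base_rate):
    """P(gay | gaydar says "GAY") by Bayes' rule, assuming symmetric accuracy."""
    true_positives = accuracy * base_rate
    false_positives = (1.0 - accuracy) * (1.0 - base_rate)
    return true_positives / (true_positives + false_positives)


# Numbers quoted from Srivastava in entry 3: 65% accuracy, 5% base rate.
posterior = prob_correct_given_flagged(accuracy=0.65, base_rate=0.05)
print(f"{posterior:.1%}")
```

The result is 0.0325 / 0.365, just under 9%, which matches the "9% chance that you are right" in the excerpt.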


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.099), (1, -0.0), (2, 0.059), (3, -0.043), (4, 0.019), (5, -0.011), (6, -0.006), (7, 0.004), (8, -0.003), (9, -0.026), (10, 0.001), (11, -0.016), (12, -0.011), (13, 0.052), (14, -0.003), (15, 0.034), (16, 0.005), (17, 0.017), (18, -0.028), (19, -0.005), (20, -0.005), (21, 0.01), (22, -0.035), (23, 0.011), (24, 0.027), (25, -0.022), (26, -0.056), (27, 0.014), (28, 0.057), (29, -0.005), (30, 0.021), (31, 0.004), (32, 0.007), (33, -0.038), (34, -0.021), (35, -0.012), (36, 0.017), (37, -0.005), (38, 0.01), (39, -0.009), (40, -0.02), (41, -0.01), (42, 0.008), (43, -0.018), (44, -0.023), (45, -0.006), (46, -0.016), (47, 0.052), (48, -0.014), (49, -0.0)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9411943 150 andrew gelman stats-2010-07-16-Gaydar update: Additional research on estimating small fractions of the population

2 0.72823411 849 andrew gelman stats-2011-08-11-The Reliability of Cluster Surveys of Conflict Mortality: Violent Deaths and Non-Violent Deaths

Introduction: Mike Spagat sends in an interesting explanation for the noted problems with conflict mortality studies (a topic we’ve discussed on occasion on this blog). Spagat writes: This analysis is based on the fact that conflict violence does not spread out at all uniformly across a map but, rather, tends to concentrate in a few areas. This means that small, headline-grabbing violence surveys are extremely unreliable. There is a second point, based on the work of David Hemenway which you’ve also cited on your blog. Even within exceptionally violent environments most households will still not have a violent death. So a very small false positive rate in a household survey will cause substantial upward bias in violence estimates.

3 0.71543002 142 andrew gelman stats-2010-07-12-God, Guns, and Gaydar: The Laws of Probability Push You to Overestimate Small Groups

Introduction: Earlier today, Nate criticized a U.S. military survey that asks troops the question, “Do you currently serve with a male or female Service member you believe to be homosexual.” [emphasis added] As Nate points out, by asking this question in such a speculative way, “it would seem that you’ll be picking up a tremendous number of false positives–soldiers who are believed to be gay, but aren’t–and that these false positives will swamp any instances in which soldiers (in spite of DADT) are actually somewhat open about their same-sex attractions.” This is a general problem in survey research. In an article in Chance magazine in 1997, “The myth of millions of annual self-defense gun uses: a case study of survey overestimates of rare events” [see here for related references], David Hemenway uses the false-positive, false-negative reasoning to explain this bias in terms of probability theory. Misclassifications that induce seemingly minor biases in estimates of certain small probab

4 0.70643532 12 andrew gelman stats-2010-04-30-More on problems with surveys estimating deaths in war zones

Introduction: Andrew Mack writes: There was a brief commentary from the Benetech folk on the Human Security Report Project’s, “The Shrinking Costs of War” report on your blog in January. But the report has since generated a lot of public controversy. Since the report–like the current discussion in your blog on Mike Spagat’s new paper on Iraq–deals with controversies generated by survey-based excess death estimates, we thought your readers might be interested. Our responses to the debate were posted on our website last week. “Shrinking Costs” had discussed the dramatic decline in death tolls from wartime violence since the end of World War II –and its causes. We also argued that deaths from war-exacerbated disease and malnutrition had declined. (The exec. summary is here.) One of the most striking findings was that mortality rates (we used under-five mortality data) decline during most wars. Indeed our latest research indicates that of the total number of years that countries w

5 0.70236915 730 andrew gelman stats-2011-05-25-Rechecking the census

Introduction: Sam Roberts writes: The Census Bureau [reported] that though New York City’s population reached a record high of 8,175,133 in 2010, the gain of 2 percent, or 166,855 people, since 2000 fell about 200,000 short of what the bureau itself had estimated. Public officials were incredulous that a city that lures tens of thousands of immigrants each year and where a forest of new buildings has sprouted could really have recorded such a puny increase. How, they wondered, could Queens have grown by only one-tenth of 1 percent since 2000? How, even with a surge in foreclosures, could the number of vacant apartments have soared by nearly 60 percent in Queens and by 66 percent in Brooklyn? That does seem a bit suspicious. So the newspaper did its own survey: Now, a house-to-house New York Times survey of three representative square blocks where the Census Bureau said vacancies had increased and the population had declined since 2000 suggests that the city’s outrage is somewhat ju

6 0.6411081 405 andrew gelman stats-2010-11-10-Estimation from an out-of-date census

7 0.63776612 947 andrew gelman stats-2011-10-08-GiveWell sez: Cost-effectiveness of de-worming was overstated by a factor of 100 (!) due to a series of sloppy calculations

8 0.62365985 1404 andrew gelman stats-2012-07-03-Counting gays

9 0.61652595 5 andrew gelman stats-2010-04-27-Ethical and data-integrity problems in a study of mortality in Iraq

10 0.61604857 1500 andrew gelman stats-2012-09-17-“2% per degree Celsius . . . the magic number for how worker productivity responds to warm-hot temperatures”

11 0.61337107 108 andrew gelman stats-2010-06-24-Sometimes the raw numbers are better than a percentage

12 0.6129508 1558 andrew gelman stats-2012-11-02-Not so fast on levees and seawalls for NY harbor?

13 0.60974604 1679 andrew gelman stats-2013-01-18-Is it really true that only 8% of people who buy Herbalife products are Herbalife distributors?

14 0.6073854 2056 andrew gelman stats-2013-10-09-Mister P: What’s its secret sauce?

15 0.60176516 1312 andrew gelman stats-2012-05-11-Are our referencing errors undermining our scholarship and credibility? The case of expatriate failure rates

16 0.60090536 2328 andrew gelman stats-2014-05-10-What property is important in a risk prediction model? Discrimination or calibration?

17 0.58886343 1397 andrew gelman stats-2012-06-27-Stand Your Ground laws and homicides

18 0.58807546 2061 andrew gelman stats-2013-10-14-More on Mister P and how it does what it does

19 0.58434314 1295 andrew gelman stats-2012-05-02-Selection bias, or, How you can think the experts don’t check their models, if you simply don’t look at what the experts actually are doing

20 0.58046472 925 andrew gelman stats-2011-09-26-Ethnicity and Population Structure in Personal Naming Networks
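
Entries 2 and 3 in the list above both note that a small false-positive rate can badly inflate survey estimates of rare events. A minimal sketch of that reasoning follows. The rates are illustrative assumptions, not values from Spagat or Hemenway, and `expected_reported_rate` is a name made up here.

```python
def expected_reported_rate(true_rate, false_positive_rate, false_negative_rate=0.0):
    """Expected share of respondents reporting the event, given response errors."""
    true_reports = true_rate * (1.0 - false_negative_rate)
    false_reports = (1.0 - true_rate) * false_positive_rate
    return true_reports + false_reports


# Hold the false-positive rate at an assumed 1% and shrink the true rate.
for true_rate in (0.20, 0.05, 0.01):
    reported = expected_reported_rate(true_rate, false_positive_rate=0.01)
    print(f"true {true_rate:.0%} -> reported {reported:.2%} "
          f"({reported / true_rate:.2f}x)")
```

Under these assumptions the same 1% error barely matters when the true rate is 20%, but it nearly doubles the estimate when the true rate is 1%.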


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(6, 0.244), (9, 0.02), (12, 0.019), (15, 0.029), (24, 0.078), (32, 0.038), (36, 0.02), (45, 0.038), (53, 0.036), (55, 0.015), (61, 0.015), (93, 0.021), (95, 0.055), (99, 0.265)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.9226979 650 andrew gelman stats-2011-04-05-Monitor the efficiency of your Markov chain sampler using expected squared jumped distance!

Introduction: Marc Tanguay writes in with a specific question that has a very general answer. First, the question: I [Tanguay] am currently running a MCMC for which I have 3 parameters that are restricted to a specific space. 2 are bounded between 0 and 1 while the third is binary and updated by a Beta-Binomial. Since my priors are also bounded, I notice that, conditional on All the rest (which covers both data and other parameters), the density was not varying a lot within the space of the parameters. As a result, the acceptance rate is high, about 85%, and this despite the fact that all the parameter’s space is explored. Since in your book, the optimal acceptance rates prescribed are lower than 50% (in case of multiple parameters), do you think I should worry about getting 85%. Or is this normal given the restrictions on the parameters? First off: Yes, my guess is that you should be taking bigger jumps. 85% seems like too high an acceptance rate for Metropolis jumping. More generally, t

same-blog 2 0.90985966 150 andrew gelman stats-2010-07-16-Gaydar update: Additional research on estimating small fractions of the population

3 0.88964772 221 andrew gelman stats-2010-08-21-Busted!

Introduction: I’m just glad that universities don’t sanction professors for publishing false theorems. If the guy really is nailed by the feds for fraud, I hope they don’t throw him in prison. In general, prison time seems like a brutal, expensive, and inefficient way to punish people. I’d prefer if the government just took 95% of his salary for several years, made him do community service (cleaning equipment at the local sewage treatment plant, perhaps; a lab scientist should be good at this sort of thing, no?), etc. If restriction of this dude’s personal freedom is judged to be part of the sentence, he could be given some sort of electronic tag that would send a message to the police if he were ever more than 3 miles from his home. But no need to bill the taxpayers for the cost of keeping him in prison.

4 0.88794053 1710 andrew gelman stats-2013-02-06-The new Stan 1.1.1, featuring Gaussian processes!

Introduction: We just released Stan 1.1.1 and RStan 1.1.1 As usual, you can find download and install instructions at: http://mc-stan.org/ This is a patch release and is fully backward compatible with Stan and RStan 1.1.0. The main thing you should notice is that the multivariate models should be much faster and all the bugs reported for 1.1.0 have been fixed. We’ve also added a bit more functionality. The substantial changes are listed in the following release notes. v1.1.1 (5 February 2012) ====================================================================== Bug Fixes ———————————- * fixed bug in comparison operators, which swapped operator< with operator<= and swapped operator> with operator>= semantics * auto-initialize all variables to prevent segfaults * atan2 gradient propagation fixed * fixed off-by-one in NUTS treedepth bound so NUTS goes at most to specified tree depth rather than specified depth + 1 * various compiler compatibility and minor consistency issues * f

5 0.88256437 1638 andrew gelman stats-2012-12-25-Diving chess

Introduction: Knowing of my interest in Turing run-around-the-house chess, David Lockhart points me to this: Diving Chess is a chess variant, which is played in a swimming pool. Instead of using chess clocks, each player must submerge themselves underwater during their turn, only to resurface when they are ready to make a move. Players must make a move within 5 seconds of resurfacing (they will receive a warning if not, and three warnings will result in a forfeit). Diving Chess was invented by American Chess Master Etan Ilfeld; the very first exhibition game took place between Ilfeld and former British Chess Champion William Hartston at the Thirdspace gym in Soho on August 2nd, 2011. Hartston won the match which lasted almost two hours such that each player was underwater for an entire hour.

6 0.85887563 819 andrew gelman stats-2011-07-24-Don’t idealize “risk aversion”

7 0.8530407 1906 andrew gelman stats-2013-06-19-“Behind a cancer-treatment firm’s rosy survival claims”

8 0.82134986 618 andrew gelman stats-2011-03-18-Prior information . . . about the likelihood

9 0.80607569 2332 andrew gelman stats-2014-05-12-“The results (not shown) . . .”

10 0.80183309 2098 andrew gelman stats-2013-11-12-Plaig!

11 0.80170351 1625 andrew gelman stats-2012-12-15-“I coach the jumpers here at Boise State . . .”

12 0.79972672 2316 andrew gelman stats-2014-05-03-“The graph clearly shows that mammography adds virtually nothing to survival and if anything, decreases survival (and increases cost and provides unnecessary treatment)”

13 0.79672396 1924 andrew gelman stats-2013-07-03-Kuhn, 1-f noise, and the fractal nature of scientific revolutions

14 0.79602295 1148 andrew gelman stats-2012-01-31-“the forces of native stupidity reinforced by that blind hostility to criticism, reform, new ideas and superior ability which is human as well as academic nature”

15 0.79158688 1409 andrew gelman stats-2012-07-08-Is linear regression unethical in that it gives more weight to cases that are far from the average?

16 0.78427386 2165 andrew gelman stats-2014-01-09-San Fernando Valley cityscapes: An example of the benefits of fractal devastation?

17 0.7836777 1489 andrew gelman stats-2012-09-09-Commercial Bayesian inference software is popping up all over

18 0.78111583 851 andrew gelman stats-2011-08-12-year + (1|year)

19 0.77000105 563 andrew gelman stats-2011-02-07-Evaluating predictions of political events

20 0.76430333 2293 andrew gelman stats-2014-04-16-Looking for Bayesian expertise in India, for the purpose of analysis of sarcoma trials