1679 andrew gelman stats-2013-01-18-Is it really true that only 8% of people who buy Herbalife products are Herbalife distributors?


meta infos for this blog

Source: html

Introduction: A reporter emailed me the other day with a question about a case I’d never heard of before, a company called Herbalife that is being accused of being a pyramid scheme. The reporter pointed me to this document which describes a survey conducted by “a third party firm called Lieberman Research”: Two independent studies took place using real time (aka “river”) sampling, in which respondents were intercepted across a wide array of websites Sample size of 2,000 adults 18+ matched to U.S. census on age, gender, income, region and ethnicity “River sampling” in this case appears to mean, according to the reporter, that “people were invited into it through online ads.” The survey found that 5% of U.S. households had purchased Herbalife products during the past three months (with a “0.8% margin of error,” ha ha ha). They then did a multiplication and a division to estimate that only 8% of households who bought these products were Herbalife distributors: 480,000 active distributor
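
The 8% figure is a ratio whose denominator rests entirely on the survey's 5% purchase rate. A minimal Python sketch of that arithmetic follows. The household count is not stated in the excerpt above, so it is back-solved here from the three quoted numbers (480,000, 5%, 8%) and should be read as an assumption. The alternative purchase rates are hypothetical values chosen only to show how sensitive the ratio is.

```python
# Sketch of the ratio behind the "8%" claim. Only the three constants
# below come from the post; everything else is an illustrative assumption.
ACTIVE_DISTRIBUTORSHIPS = 480_000  # quoted in the post
PURCHASE_RATE = 0.05               # survey estimate quoted in the post
CLAIMED_SHARE = 0.08               # quoted in the post


def distributor_share(distributorships, purchase_rate, households):
    """Distributorships divided by the estimated number of purchasing households."""
    return distributorships / (purchase_rate * households)


# Assumption: the household count implied by the three quoted numbers,
# not a figure taken from the survey document itself.
implied_households = ACTIVE_DISTRIBUTORSHIPS / (CLAIMED_SHARE * PURCHASE_RATE)

print(f"implied households: {implied_households:,.0f}")

# Hypothetical lower purchase rates, to show how the share moves.
for rate in (0.05, 0.03, 0.02, 0.01):
    share = distributor_share(ACTIVE_DISTRIBUTORSHIPS, rate, implied_households)
    print(f"purchase rate {rate:.0%} -> distributor share {share:.1%}")
```

Because the share is inversely proportional to the purchase rate, any overstatement of that rate by survey respondents lowers the computed distributor share by the same factor.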


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 A reporter emailed me the other day with a question about a case I’d never heard of before, a company called Herbalife that is being accused of being a pyramid scheme. [sent-1, score-0.487]

2 households had purchased Herbalife products during the past three months (with a “0. [sent-7, score-0.695]

3 They then did a multiplication and a division to estimate that only 8% of households who bought these products were Herbalife distributors: 480,000 active distributorships / (0. [sent-9, score-0.606]

4 The reporter asked what I thought about that 8%. [sent-14, score-0.153]

5 Here are some of my reactions (again, recall that I had not heard about this case before, and these reactions are based on the information provided to me): I think there are some serious problems here. [sent-15, score-0.196]

6 Generalizing from 2000 to the whole population is not a problem—as long as the sample is a representative sample, and as long as the survey responses are correct. [sent-17, score-0.175]

7 I did not look at the details of the survey so I don’t know about the issue of representativeness. [sent-18, score-0.111]

8 I wonder, though, exactly how they “fished” for respondents in their river sampling. [sent-19, score-0.414]

9 It could be that they were more likely to get the sort of person who was interested in dietary supplements. [sent-20, score-0.063]

10 However the sampling was done, I worry about the survey responses. [sent-23, score-0.193]

11 For example (from an article by David Hemenway from 1997): “The National Rifle Association reports 3 million dues-paying members, or about 1. [sent-25, score-0.113]

12 In national random telephone surveys, however, 4-10% of respondents claim that they are dues-paying NRA members. [sent-27, score-0.363]

13 Similarly, although Sports Illustrated reports that fewer than 3% of American households purchase the magazine, in national surveys 15% of respondents claim that they are current subscribers. [sent-28, score-0.723]

14 ” The mathematics is that if there is a small rate of error, it can show up as a large error in the estimate of a small population. [sent-29, score-0.117]

15 So, just because 5% of respondents say they used Herbalife products in the previous three months, that doesn’t mean that 5% of respondents actually used Herbalife products in the previous three months. [sent-30, score-1.194]

16 I assume that “distributors” buy a lot more Herbalife products than non-distributors. [sent-33, score-0.281]

17 Hence, even if 8% of the Herbalife consumers are distributors, I’d guess that a lot more than 8% of Herbalife products are bought by distributors. [sent-34, score-0.416]

18 ” But that’s exactly what you’d expect of a still-active pyramid scheme, no? [sent-39, score-0.239]

19 Existing members want new people below them on the pyramid. [sent-40, score-0.066]

20 I’m not saying this means it is a pyramid scheme, but it doesn’t seem like evidence against the hypothesis! [sent-41, score-0.19]
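
Sentences 14 and 15 above argue that a small rate of misreporting can produce a large error when the group being estimated is small. A rough Python sketch of that reasoning is given here. The function names and the false-positive and false-negative rates are hypothetical, chosen for illustration; only the 5% reported rate comes from the post.

```python
def reported_rate(true_rate, false_positive_rate, false_negative_rate=0.0):
    """Expected share of respondents answering 'yes' when some misreport."""
    true_yes = true_rate * (1.0 - false_negative_rate)      # real buyers who say yes
    false_yes = (1.0 - true_rate) * false_positive_rate     # non-buyers who say yes
    return true_yes + false_yes


def implied_true_rate(reported, false_positive_rate, false_negative_rate=0.0):
    """Invert reported_rate to recover the true rate, clipped to [0, 1]."""
    denom = 1.0 - false_positive_rate - false_negative_rate
    if denom <= 0:
        raise ValueError("error rates too large to identify the true rate")
    estimate = (reported - false_positive_rate) / denom
    return min(1.0, max(0.0, estimate))


REPORTED = 0.05  # share of respondents who said they bought, from the post

# Hypothetical false-positive rates; none of these are measured values.
for fp in (0.0, 0.01, 0.02, 0.03):
    print(f"false-positive rate {fp:.0%} -> implied true rate "
          f"{implied_true_rate(REPORTED, fp):.1%}")
```

Under these assumed rates, a few percent of non-buyers mistakenly answering "yes" would account for a large part of the reported 5%, which is the same pattern as the NRA and Sports Illustrated examples quoted above.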


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('herbalife', 0.629), ('products', 0.281), ('distributors', 0.254), ('households', 0.237), ('respondents', 0.199), ('pyramid', 0.19), ('ha', 0.19), ('river', 0.166), ('reporter', 0.153), ('survey', 0.111), ('bought', 0.088), ('scheme', 0.083), ('sampling', 0.082), ('national', 0.077), ('reactions', 0.071), ('error', 0.067), ('members', 0.066), ('sample', 0.064), ('surveys', 0.064), ('nra', 0.063), ('rifle', 0.063), ('dietary', 0.063), ('fished', 0.063), ('three', 0.061), ('million', 0.061), ('lieberman', 0.06), ('hemenway', 0.06), ('purchased', 0.06), ('months', 0.056), ('previous', 0.056), ('websites', 0.055), ('amusingly', 0.055), ('heard', 0.054), ('illustrated', 0.052), ('reports', 0.052), ('array', 0.051), ('rate', 0.05), ('purchase', 0.05), ('exactly', 0.049), ('aka', 0.049), ('generalizing', 0.048), ('doesn', 0.048), ('consumers', 0.047), ('american', 0.046), ('accused', 0.045), ('called', 0.045), ('matched', 0.045), ('claim', 0.044), ('telephone', 0.043), ('margin', 0.043)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 1679 andrew gelman stats-2013-01-18-Is it really true that only 8% of people who buy Herbalife products are Herbalife distributors?

2 0.17999171 142 andrew gelman stats-2010-07-12-God, Guns, and Gaydar: The Laws of Probability Push You to Overestimate Small Groups

Introduction: Earlier today, Nate criticized a U.S. military survey that asks troops the question, “Do you currently serve with a male or female Service member you believe to be homosexual.” [emphasis added] As Nate points out, by asking this question in such a speculative way, “it would seem that you’ll be picking up a tremendous number of false positives–soldiers who are believed to be gay, but aren’t–and that these false positives will swamp any instances in which soldiers (in spite of DADT) are actually somewhat open about their same-sex attractions.” This is a general problem in survey research. In an article in Chance magazine in 1997, “The myth of millions of annual self-defense gun uses: a case study of survey overestimates of rare events” [see here for related references], David Hemenway uses the false-positive, false-negative reasoning to explain this bias in terms of probability theory. Misclassifications that induce seemingly minor biases in estimates of certain small probab

3 0.16027263 1017 andrew gelman stats-2011-11-18-Lack of complete overlap

Introduction: Evens Salies writes: I have a question regarding a randomizing constraint in my current funded electricity experiment. After elimination of missing data we have 110 voluntary households from a larger population (resource constraints do not allow us to have more households!). I randomly assign them to treated and non-treated where the treatment variable is some ICT that allows the treated to track their electricity consumption in real time. The ICT is made of two devices, one that is plugged on the household’s modem and the other on the electric meter. A necessary condition for being treated is that the distance between the box and the meter be below some threshold (d), the value of which is 20 meters approximately. 50 ICTs can be installed. 60 households will be in the control group. But, I can only assign 6 households in the control group for whom d is less than 20. Therefore, I have only 6 households in the control group who have a counterfactual in the group of treated.

4 0.15888372 2130 andrew gelman stats-2013-12-11-Multilevel marketing as a way of liquidating participants’ social networks

Introduction: Here I’m using the term “liquidate” in the economics sense (conversion of an asset into cash) rather than the Rocky-and-Bullwinkle sense of the word. Here’s the story: Katherine Chen writes: An executive summary version of Ackman and Dineen’s Powerpoint analysis underscores the potential impact of DSOs [direct selling organizations] upon distributors’ networks: Recruiting family members, friends, work and church acquaintances and others in their communities into a rigged game, one that is highly likely to exact financial and emotional harm on those loved and trusted by them, has an impact that cannot be repaired or recompensed with dollars alone. In class discussions over the years, students have made similar conclusions, with some sharing experiences about how they no longer can socialize with relatives and friends who are members of DSOs because of the relentless pressure to buy and join. Others continue to do part-time work as DSO members who were recruited by family.

5 0.13649902 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys

Introduction: This is it, the last question on the exam! 28. A telephone survey was conducted several years ago, asking people how often they were polled in the past year. I can’t recall the responses, but suppose that 40% of the respondents said they participated in zero surveys in the previous year, 30% said they participated in one survey, 15% said two surveys, 10% said three, and 5% said four. From this it is easy to estimate an average, but there is a worry that this survey will itself overrepresent survey participants and thus overestimate the rate at which the average person is surveyed. Come up with a procedure to use these data to get an improved estimate of the average number of surveys that a randomly-sampled American is polled in a year. Solution to question 27 From yesterday : 27. Which of the following problems were identified with the Burnham et al. survey of Iraq mortality? (Indicate all that apply.) (a) The survey used cluster sampling, which is inappropriate for estim

6 0.10845268 1356 andrew gelman stats-2012-05-31-Question 21 of my final exam for Design and Analysis of Sample Surveys

7 0.1063882 107 andrew gelman stats-2010-06-24-PPS in Georgia

8 0.10578607 1708 andrew gelman stats-2013-02-05-Wouldn’t it be cool if Glenn Hubbard were consulting for Herbalife and I were on the other side?

9 0.10371438 627 andrew gelman stats-2011-03-24-How few respondents are reasonable to use when calculating the average by county?

10 0.098095216 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

11 0.09709204 1437 andrew gelman stats-2012-07-31-Paying survey respondents

12 0.096537776 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys

13 0.096125796 2236 andrew gelman stats-2014-03-07-Selection bias in the reporting of shaky research

14 0.08793278 2148 andrew gelman stats-2013-12-25-Spam!

15 0.084750786 1315 andrew gelman stats-2012-05-12-Question 2 of my final exam for Design and Analysis of Sample Surveys

16 0.081903607 761 andrew gelman stats-2011-06-13-A survey’s not a survey if they don’t tell you how they did it

17 0.080862835 849 andrew gelman stats-2011-08-11-The Reliability of Cluster Surveys of Conflict Mortality: Violent Deaths and Non-Violent Deaths

18 0.074353173 5 andrew gelman stats-2010-04-27-Ethical and data-integrity problems in a study of mortality in Iraq

19 0.071479946 1317 andrew gelman stats-2012-05-13-Question 3 of my final exam for Design and Analysis of Sample Surveys

20 0.071346745 972 andrew gelman stats-2011-10-25-How do you interpret standard errors from a regression fit to the entire population?


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.12), (1, -0.021), (2, 0.091), (3, -0.056), (4, 0.029), (5, 0.046), (6, -0.002), (7, 0.009), (8, -0.007), (9, -0.068), (10, 0.01), (11, -0.091), (12, -0.005), (13, 0.085), (14, -0.04), (15, -0.016), (16, 0.02), (17, 0.004), (18, 0.031), (19, 0.007), (20, -0.024), (21, -0.013), (22, -0.013), (23, 0.016), (24, -0.047), (25, 0.004), (26, -0.01), (27, 0.009), (28, 0.039), (29, 0.023), (30, -0.014), (31, -0.012), (32, -0.016), (33, 0.012), (34, -0.016), (35, -0.002), (36, -0.006), (37, -0.012), (38, -0.039), (39, 0.018), (40, -0.044), (41, 0.01), (42, 0.023), (43, -0.008), (44, -0.023), (45, 0.008), (46, -0.024), (47, -0.002), (48, 0.0), (49, -0.009)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9779979 1679 andrew gelman stats-2013-01-18-Is it really true that only 8% of people who buy Herbalife products are Herbalife distributors?

2 0.8807922 1371 andrew gelman stats-2012-06-07-Question 28 of my final exam for Design and Analysis of Sample Surveys

Introduction: This is it, the last question on the exam! 28. A telephone survey was conducted several years ago, asking people how often they were polled in the past year. I can’t recall the responses, but suppose that 40% of the respondents said they participated in zero surveys in the previous year, 30% said they participated in one survey, 15% said two surveys, 10% said three, and 5% said four. From this it is easy to estimate an average, but there is a worry that this survey will itself overrepresent survey participants and thus overestimate the rate at which the average person is surveyed. Come up with a procedure to use these data to get an improved estimate of the average number of surveys that a randomly-sampled American is polled in a year. Solution to question 27 From yesterday : 27. Which of the following problems were identified with the Burnham et al. survey of Iraq mortality? (Indicate all that apply.) (a) The survey used cluster sampling, which is inappropriate for estim

3 0.84974682 142 andrew gelman stats-2010-07-12-God, Guns, and Gaydar: The Laws of Probability Push You to Overestimate Small Groups

Introduction: Earlier today, Nate criticized a U.S. military survey that asks troops the question, “Do you currently serve with a male or female Service member you believe to be homosexual.” [emphasis added] As Nate points out, by asking this question in such a speculative way, “it would seem that you’ll be picking up a tremendous number of false positives–soldiers who are believed to be gay, but aren’t–and that these false positives will swamp any instances in which soldiers (in spite of DADT) are actually somewhat open about their same-sex attractions.” This is a general problem in survey research. In an article in Chance magazine in 1997, “The myth of millions of annual self-defense gun uses: a case study of survey overestimates of rare events” [see here for related references], David Hemenway uses the false-positive, false-negative reasoning to explain this bias in terms of probability theory. Misclassifications that induce seemingly minor biases in estimates of certain small probab

4 0.81287944 1320 andrew gelman stats-2012-05-14-Question 4 of my final exam for Design and Analysis of Sample Surveys

Introduction: 4. Researchers have found that survey respondents overreport church attendance. Thus, naive estimates from surveys overstate the percentage of Americans who attend church regularly. Does this have a large impact on estimates of time trends in religious attendance? Solution to question 3 From yesterday : 3. We discussed in class the best currently available method for estimating the proportion of military servicemembers who are gay. What is that method? (Recall the problems with the direct approach: there is no simple way to survey servicemembers at random, nor is it likely that they would answer such a question honestly.) Solution: I was talking about the work of Gary Gates, combining an estimate of the percentage of gays in the population with an estimate of the probability that someone is in the military, given that he or she is gay.

5 0.80931956 405 andrew gelman stats-2010-11-10-Estimation from an out-of-date census

Introduction: Suguru Mizunoya writes: When we estimate the number of people from a national sampling survey (such as labor force survey) using sampling weights, don’t we obtain underestimated number of people, if the country’s population is growing and the sampling frame is based on an old census data? In countries with increasing populations, the probability of inclusion changes over time, but the weights can’t be adjusted frequently because census takes place only once every five or ten years. I am currently working for UNICEF for a project on estimating number of out-of-school children in developing countries. The project leader is comfortable to use estimates of number of people from DHS and other surveys. But, I am concerned that we may need to adjust the estimated number of people by the population projection, otherwise the estimates will be underestimated. I googled around on this issue, but I could not find a right article or paper on this. My reply: I don’t know if there’s a pa

6 0.80893701 730 andrew gelman stats-2011-05-25-Rechecking the census

7 0.79810476 385 andrew gelman stats-2010-10-31-Wacky surveys where they don’t tell you the questions they asked

8 0.79307061 1455 andrew gelman stats-2012-08-12-Probabilistic screening to get an approximate self-weighted sample

9 0.78968573 1356 andrew gelman stats-2012-05-31-Question 21 of my final exam for Design and Analysis of Sample Surveys

10 0.78884435 5 andrew gelman stats-2010-04-27-Ethical and data-integrity problems in a study of mortality in Iraq

11 0.78687018 784 andrew gelman stats-2011-07-01-Weighting and prediction in sample surveys

12 0.76306641 107 andrew gelman stats-2010-06-24-PPS in Georgia

13 0.76060933 761 andrew gelman stats-2011-06-13-A survey’s not a survey if they don’t tell you how they did it

14 0.7591297 1345 andrew gelman stats-2012-05-26-Question 16 of my final exam for Design and Analysis of Sample Surveys

15 0.75781363 1358 andrew gelman stats-2012-06-01-Question 22 of my final exam for Design and Analysis of Sample Surveys

16 0.75770772 1288 andrew gelman stats-2012-04-29-Clueless Americans think they’ll never get sick

17 0.74735355 1437 andrew gelman stats-2012-07-31-Paying survey respondents

18 0.74705976 1940 andrew gelman stats-2013-07-16-A poll that throws away data???

19 0.73485923 1313 andrew gelman stats-2012-05-11-Question 1 of my final exam for Design and Analysis of Sample Surveys

20 0.73468369 1430 andrew gelman stats-2012-07-26-Some thoughts on survey weighting


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.048), (5, 0.011), (9, 0.028), (15, 0.02), (16, 0.062), (21, 0.033), (24, 0.131), (40, 0.194), (53, 0.012), (64, 0.02), (82, 0.01), (86, 0.028), (96, 0.011), (97, 0.044), (99, 0.234)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.95242333 149 andrew gelman stats-2010-07-16-Demographics: what variable best predicts a financial crisis?

Introduction: A few weeks ago I wrote about the importance of demographics in political trends. Today I’d like to show you how demographics help predict financial crises. Here are a few examples of countries with major crises. The working-age population in Japan peaked in the 1995 census. The 1995 Financial Crisis in Japan The working-age USA population growth slows down to unprecedented levels in 2008 (see figure below) Financial crisis of 2007-2010. (Also, notice previous dips in 2001, 1991 and 1981, and consider the list of recessions.) China’s working-age population, age 15 to 64, has grown continuously. The labor pool will peak in 2015 and then decline. There are more charts in Demography and Growth report by the Reserve Bank of Australia: Wikipedia surveys the causes of the financial crisis, such as “liquidity shortfall in the United States banking system caused by the overvaluation of assets”. Oh my! Slightly better than the usu

2 0.93970394 1505 andrew gelman stats-2012-09-20-“Joseph Anton”

Introduction: I only read the review, not the book. What puzzled me was not any lack of self-awareness but rather this bit: The title of Mr. Rushdie’s new memoir . . . comes from the alias he assumed when British police told him back in 1989 that he needed a pseudonym: the Joseph comes from Joseph Conrad, the Anton from Anton Chekhov. The protection officers issued to him by the British government soon took to calling him “Joe,” an abbreviation he says he detested. The thing that I don’t understand is why he detested the nickname. If I were in a comparable situation, I think I’d appreciate if my security detail gave me a friendly nickname. Then again, with the stress that Rushdie’s been under, I can imagine all sorts of personality transformations.

3 0.93657076 243 andrew gelman stats-2010-08-30-Computer models of the oil spill

Introduction: Chris Wilson points me to this visualization of three physical models of the oil spill in the Gulf of Mexico. Cool (and scary) stuff. Wilson writes: One of the major advantages is that the models are 3D and show the plumes and tails beneath the surface. One of the major disadvantages is that they’re still just models.

same-blog 4 0.92667365 1679 andrew gelman stats-2013-01-18-Is it really true that only 8% of people who buy Herbalife products are Herbalife distributors?

5 0.89834392 1945 andrew gelman stats-2013-07-18-“How big is your chance of dying in an ordinary play?”

Introduction: At first glance, that’s what I thought Tyler Cowen was asking. I assumed he was asking about the characters, not the audience, as watching a play seems like a pretty safe activity (A. Lincoln excepted). Characters in plays die all the time. I wonder what the chance is? Something between 5% and 10%, I’d guess. I’d guess your chance of dying (as a character) in a movie would be higher. On the other hand, movies have lots of extras who just show up and leave; if you count them maybe the risk isn’t so high. Perhaps the right way to do this is to weight people by screen time? P.S. The Mezzanine aside, works of art and literature tend to focus on the dramatic moments of lives, so it makes sense that death will be overrepresented.

6 0.89281535 1198 andrew gelman stats-2012-03-05-A cloud with a silver lining

7 0.87560022 1245 andrew gelman stats-2012-04-03-Redundancy and efficiency: In praise of Penn Station

8 0.87006176 1671 andrew gelman stats-2013-01-13-Preregistration of Studies and Mock Reports

9 0.8652432 1581 andrew gelman stats-2012-11-17-Horrible but harmless?

10 0.85834432 1277 andrew gelman stats-2012-04-23-Infographic of the year

11 0.84555984 1796 andrew gelman stats-2013-04-09-The guy behind me on line for the train . . .

12 0.845204 1803 andrew gelman stats-2013-04-14-Why girls do better in school

13 0.83784848 962 andrew gelman stats-2011-10-17-Death!

14 0.83652997 1153 andrew gelman stats-2012-02-04-More on the economic benefits of universities

15 0.83630967 2212 andrew gelman stats-2014-02-15-Mary, Mary, why ya buggin

16 0.82306629 56 andrew gelman stats-2010-05-28-Another argument in favor of expressing conditional probability statements using the population distribution

17 0.81922704 1445 andrew gelman stats-2012-08-06-Slow progress

18 0.81864667 2119 andrew gelman stats-2013-12-01-Separated by a common blah blah blah

19 0.81818032 81 andrew gelman stats-2010-06-12-Reputational Capital and Incentives in Organizations

20 0.81647944 548 andrew gelman stats-2011-02-01-What goes around . . .