
1172 andrew gelman stats-2012-02-17-Rare name analysis and wealth convergence


meta info for this blog post

Source: html

Introduction: Steve Hsu summarizes the research of economic historian Greg Clark and Neil Cummins: “Using rare surnames we track the socio-economic status of descendants of a sample of English rich and poor in 1800, until 2011. We measure social status through wealth, education, occupation, and age at death. Our method allows unbiased estimates of mobility rates. Paradoxically, we find two things. Mobility rates are lower than conventionally estimated. There is considerable persistence of status, even after 200 years. But there is convergence with each generation. The 1800 underclass has already attained mediocrity. And the 1800 upper class will eventually dissolve into the mass of society, though perhaps not for another 300 years, or longer.” Read more at Steven’s blog. The idea of using rare names for this analysis is interesting – and has recently been applied to the study of nepotism in Italy. I haven’t looked into the details of the methodology, but rare events


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Steve Hsu summarizes the research of economic historian Greg Clark and Neil Cummins: Using rare surnames we track the socio-economic status of descendants of a sample of English rich and poor in 1800, until 2011. [sent-1, score-1.102]

2 We measure social status through wealth, education, occupation, and age at death. [sent-2, score-0.217]

3 There is considerable persistence of status, even after 200 years. [sent-6, score-0.207]

4 And the 1800 upper class will eventually dissolve into the mass of society, though perhaps not for another 300 years, or longer. [sent-9, score-0.362]

5 The idea of rare names to perform this analysis is interesting – and has been recently applied to the study of nepotism in Italy. [sent-11, score-0.694]

6 I haven’t looked into the details of the methodology, but rare events have their own distributional characteristics, and could benefit from Bayesian modeling in sparse data conditions. [sent-12, score-0.627]

7 Moreover, there seems to be an underlying assumption that rare names are somehow uniformly represented in the population. [sent-13, score-0.822]

8 A hypothetical situation: in feudal days, rare names were good at predicting who’s rich and who’s not – wealth was passed through family by name. [sent-15, score-1.404]

9 But then industrialization perturbed the old feudal order stratified by name into one that’s stratified by skill and no longer identifiable by name. [sent-16, score-1.013]
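Sentences 3 and 4 above describe slow convergence: status persists for centuries, yet the gap shrinks with each generation. One common way to formalize this is a first-order model in which the expected status gap shrinks by a constant factor b per generation, so that the gap after n generations is b**n times the initial gap. This is an assumption made here for illustration, not a description of Clark and Cummins's actual estimator, and the persistence values used below are hypothetical.

```python
import math

def remaining_gap(b, generations):
    """Fraction of the initial status gap expected to remain after
    `generations` steps, assuming gap[t+1] = b * gap[t] on average."""
    if not 0.0 < b < 1.0:
        raise ValueError("persistence b must lie strictly between 0 and 1")
    return b ** generations

def generations_until(b, threshold):
    """Smallest whole number of generations n with b**n <= threshold."""
    if not 0.0 < b < 1.0:
        raise ValueError("persistence b must lie strictly between 0 and 1")
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must lie in (0, 1]")
    # b**n <= threshold  <=>  n >= log(threshold) / log(b), since log(b) < 0.
    return math.ceil(math.log(threshold) / math.log(b))

# Hypothetical persistence rates, NOT estimates from the study:
# a lower "conventional" value and a higher "surname-based" value.
for b in (0.4, 0.75):
    print(f"b={b}: share of gap left after 7 generations = {remaining_gap(b, 7):.4f}")
```

Under these made-up values, the higher persistence rate leaves a visible gap after seven generations (roughly 200 years at 30 years per generation) while the lower one does not, which is the qualitative pattern of persistence plus convergence that the excerpt describes.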


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('rare', 0.357), ('feudal', 0.297), ('stratified', 0.255), ('mobility', 0.228), ('status', 0.217), ('names', 0.21), ('wealth', 0.179), ('methodology', 0.155), ('dissolve', 0.135), ('scrutinize', 0.135), ('nepotism', 0.127), ('surnames', 0.127), ('rich', 0.126), ('occupation', 0.118), ('italy', 0.114), ('conventionally', 0.114), ('persistence', 0.114), ('jakulin', 0.111), ('identifiable', 0.109), ('summarizes', 0.109), ('distributional', 0.106), ('clark', 0.106), ('hsu', 0.099), ('uniformly', 0.098), ('neil', 0.098), ('paradoxically', 0.097), ('skill', 0.097), ('considerable', 0.093), ('sparse', 0.093), ('historian', 0.093), ('greg', 0.092), ('moreover', 0.091), ('convergence', 0.09), ('unbiased', 0.088), ('represented', 0.086), ('passed', 0.085), ('characteristics', 0.083), ('aleks', 0.083), ('hypothetical', 0.079), ('upper', 0.077), ('steve', 0.077), ('english', 0.076), ('mass', 0.076), ('eventually', 0.074), ('track', 0.073), ('allows', 0.073), ('events', 0.071), ('somehow', 0.071), ('steven', 0.071), ('predicting', 0.071)]
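The word weights above look like entries of an L2-normalized TF-IDF vector, and the similarity values in the list below look like cosine similarities, but the page does not say which library, tokenizer, or IDF formula produced them. The following is a minimal sketch under those assumptions: the smoothed IDF formula, the function names, and the toy corpus are all hypothetical and stand in for the real pipeline.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one L2-normalized {word: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumption; the source does not state its formula).
        raw = {w: count * (math.log((1 + n_docs) / (1 + df[w])) + 1.0)
               for w, count in tf.items()}
        norm = math.sqrt(sum(v * v for v in raw.values()))
        vectors.append({w: v / norm for w, v in raw.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity of two L2-normalized sparse vectors."""
    return sum(weight * v.get(word, 0.0) for word, weight in u.items())

def top_words(vector, n=3):
    """The n highest-weighted (word, weight) pairs, like the topN list above."""
    return sorted(vector.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Hypothetical toy corpus, not the blog's actual text.
docs = [
    "rare names track wealth and status across generations".split(),
    "baby names rise and fall in popularity".split(),
    "gaussian process model for birthday time series".split(),
]
vecs = tfidf_vectors(docs)
print(top_words(vecs[0]))
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

A document compared with itself scores 1 up to rounding, which would explain why the same-blog row in the list below sits just under 1.0, and documents that share no words score 0.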

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999982 1172 andrew gelman stats-2012-02-17-Rare name analysis and wealth convergence


2 0.13773426 2211 andrew gelman stats-2014-02-14-The popularity of certain baby names is falling off the clifffffffffffff

Introduction: Ubs writes: I was looking at baby name data last night and I stumbled upon something curious. I follow the baby names blog occasionally but not regularly, so I’m not sure if it’s been noticed before. Let me present it like this: Take the statement… Of the top 100 boys and top 100 girls names, only ___% contain the letter __. I’m using the SSA baby names page, so that’s U.S. births, and I’m looking at the decade of 2000-2009 (so kids currently aged 4 to 13). Which letters would you expect to have the lowest rate of occurrence? As expected, the lowest score is for Q, which appears zero times. (Jacqueline ranks #104 for girls.) It’s the second lowest that surprised me. (… You can pause and try to guess now. Spoilers to follow.) Of the other big-point Scrabble letters, Z appears in four names (Elizabeth, Zachary, Mackenzie, Zoe) and X in six, of which five are closely related (Alexis, Alexander, Alexandra, Alexa, Alex, Xavier). J is heavily overrepresented, especial

3 0.12797934 281 andrew gelman stats-2010-09-16-NSF crowdsourcing

Introduction: I have no idea what this and this are, but Aleks passed these on, and maybe some of you will find them interesting.

4 0.11018041 2212 andrew gelman stats-2014-02-15-Mary, Mary, why ya buggin

Introduction: In our Cliff thread from yesterday, sociologist Philip Cohen pointed to his discussions in the decline in the popularity of the name Mary. One thing that came up was the traditional trendiness of girls’ names. So I thought I’d share my thoughts from a couple of years ago, as reported by David Leonhardt: Andrew Gelman, a statistics professor at Columbia and an amateur name-ologist, argues that many parents want their boys to seem mature and so pick classic names. William, David, Joseph and James, all longtime stalwarts, remain in the Top 20. With girls, Gelman says, parents are attracted to names that convey youth even into adulthood and choose names that seem to be on the upswing. By the 1990s, of course, not many girls from the 1880s were still around, and that era’s names could seem fresh again. This search for youthfulness makes girls’ names more volatile — and increasingly so, as more statistics about names become available and parents grow more willing to experiment

5 0.10160537 1919 andrew gelman stats-2013-06-29-R sucks

Introduction: I was trying to make some new graphs using 5-year-old R code and I got all these problems because I was reading in files with variable names such as “co.fipsid” and now R is automatically changing them to “co_fipsid”. Or maybe the names had underbars all along, and the old R had changed them into dots. Whatever. I understand that backward compatibility can be hard to maintain, but this is just annoying.

6 0.09639816 1849 andrew gelman stats-2013-05-09-Same old same old

7 0.084196284 666 andrew gelman stats-2011-04-18-American Beliefs about Economic Opportunity and Income Inequality

8 0.078606784 1159 andrew gelman stats-2012-02-08-Charles Murray [perhaps] does a Tucker Carlson, provoking me to unleash the usual torrent of graphs

9 0.07784193 2071 andrew gelman stats-2013-10-21-Most Popular Girl Names by State over Time

10 0.076295406 1341 andrew gelman stats-2012-05-24-Question 14 of my final exam for Design and Analysis of Sample Surveys

11 0.071085095 1007 andrew gelman stats-2011-11-13-At last, treated with the disrespect that I deserve

12 0.070943303 925 andrew gelman stats-2011-09-26-Ethnicity and Population Structure in Personal Naming Networks

13 0.068797991 227 andrew gelman stats-2010-08-23-Visualization magazine

14 0.067934781 79 andrew gelman stats-2010-06-10-What happens when the Democrats are “fighting Wall Street with one hand, unions with the other,” while the Republicans are fighting unions with two hands?

15 0.067903541 1344 andrew gelman stats-2012-05-25-Question 15 of my final exam for Design and Analysis of Sample Surveys

16 0.066927791 743 andrew gelman stats-2011-06-03-An argument that can’t possibly make sense

17 0.066584423 1079 andrew gelman stats-2011-12-23-Surveys show Americans are populist class warriors, except when they aren’t

18 0.066410065 1063 andrew gelman stats-2011-12-16-Suspicious histogram bars

19 0.064488605 1763 andrew gelman stats-2013-03-14-Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses

20 0.064320214 2236 andrew gelman stats-2014-03-07-Selection bias in the reporting of shaky research


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.111), (1, -0.013), (2, 0.026), (3, 0.002), (4, 0.004), (5, 0.032), (6, -0.006), (7, 0.021), (8, -0.006), (9, 0.03), (10, -0.015), (11, -0.017), (12, 0.009), (13, 0.055), (14, 0.023), (15, 0.069), (16, 0.034), (17, 0.011), (18, -0.065), (19, -0.071), (20, -0.001), (21, -0.008), (22, 0.018), (23, 0.018), (24, 0.012), (25, -0.026), (26, -0.025), (27, 0.017), (28, 0.026), (29, -0.023), (30, 0.008), (31, 0.01), (32, -0.013), (33, 0.036), (34, 0.014), (35, -0.001), (36, 0.03), (37, 0.011), (38, -0.061), (39, -0.002), (40, -0.03), (41, 0.012), (42, 0.028), (43, -0.028), (44, 0.023), (45, -0.01), (46, -0.004), (47, -0.005), (48, 0.026), (49, 0.019)]
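Unlike TF-IDF weights, LSI topic weights can be negative, as several entries in the list above are, so posts are usually compared by the cosine of the angle between their dense topic vectors. The page does not name its similarity measure, so cosine similarity is an assumption here. In this sketch `post_a` reuses the first five weights from the list above, while `post_b` and the function name are hypothetical.

```python
import math

def cosine_dense(u, v):
    """Cosine similarity between two equal-length dense vectors.
    Returns 0.0 when either vector has zero length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of topics")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# First five LSI weights of this post, copied from the list above.
post_a = [0.111, -0.013, 0.026, 0.002, 0.004]
# Hypothetical weights for some other post (illustration only).
post_b = [0.090, 0.020, 0.015, -0.010, 0.000]

print(cosine_dense(post_a, post_b))
```

Because components can have opposite signs, the result ranges over [-1, 1] rather than [0, 1] as it does for non-negative TF-IDF vectors.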

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95649201 1172 andrew gelman stats-2012-02-17-Rare name analysis and wealth convergence


2 0.62426394 925 andrew gelman stats-2011-09-26-Ethnicity and Population Structure in Personal Naming Networks

Introduction: Aleks pointed me to this recent article by Pablo Mateos, Paul Longley, and David O’Sullivan on one of my favorite topics. The authors produced a potentially cool naming network of the city of Auckland, New Zealand. I say “potentially cool” because I have such difficulty reading the article–I speak English, statistics, and a bit of political science and economics, but this one is written in heavy sociologese–that I can’t quite be sure what they’re doing. However, despite my (perhaps unfair) disdain for the particulars of their method, it’s probably good that they’re jumping in with this analysis. Others can take their data (and similar datasets from elsewhere) and do better. Ya gotta start somewhere, and the basic idea (to cluster first names that are associated with the same last names, and to cluster last names that are associated with the same first names) seems good. I have to admit, though, that I was amused by the following line, which, amazingly, led off the paper:

3 0.6146158 428 andrew gelman stats-2010-11-24-Flawed visualization of U.S. voting maybe has some good features

Introduction: Aleks points me to this attractive visualization by David Sparks of U.S. voting. On the plus side, the pictures and associated movie (showing an oddly horizontally-stretched-out United States) are pretty and seem to have gotten a bit of attention–the maps have received 31 comments, which is more than we get on almost all our blog entries here. On the minus side, the movie is misleading. In many years it shows the whole U.S. as a single color, even when candidates from both parties won some votes. The text has errors too, for example the false claim that the South favored a Democratic candidate in 1980. The southern states that Jimmy Carter carried in 1980 were Georgia and . . . that’s it. But, as Aleks says, once this tool is out there, maybe people can use it to do better. It’s in that spirit that I’m linking. Ya gotta start somewhere. Also, this is a good example of a general principle: When you make a graph, look at it carefully to see if it makes sense!

4 0.58819717 1583 andrew gelman stats-2012-11-19-I can’t read this interview with me

Introduction: From Alexandr Grigoryev: “Америка: «красная», «синяя» и «пурпурная».” Apparently my name is Эндрю Гелман. I had no idea that the Voice of America even existed anymore!

5 0.58565658 845 andrew gelman stats-2011-08-08-How adoption speed affects the abandonment of cultural tastes

Introduction: Interesting article by Jonah Berger and Gael Le Mens: Products, styles, and social movements often catch on and become popular, but little is known about why such identity-relevant cultural tastes and practices die out. We demonstrate that the velocity of adoption may affect abandonment: Analysis of over 100 years of data on first-name adoption in both France and the United States illustrates that cultural tastes that have been adopted quickly die faster (i.e., are less likely to persist). Mirroring this aggregate pattern, at the individual level, expecting parents are more hesitant to adopt names that recently experienced sharper increases in adoption. Further analysis indicate that these effects are driven by concerns about symbolic value: Fads are perceived negatively, so people avoid identity-relevant items with sharply increasing popularity because they believe that they will be short lived. Ancillary analyses also indicate that, in contrast to conventional wisdom, identity-r

6 0.5831393 2211 andrew gelman stats-2014-02-14-The popularity of certain baby names is falling off the clifffffffffffff

7 0.57925075 1305 andrew gelman stats-2012-05-07-Happy news on happiness; what can we believe?

8 0.56364918 624 andrew gelman stats-2011-03-22-A question about the economic benefits of universities

9 0.56052476 281 andrew gelman stats-2010-09-16-NSF crowdsourcing

10 0.55812687 289 andrew gelman stats-2010-09-21-“How segregated is your city?”: A story of why every graph, no matter how clear it seems to be, needs a caption to anchor the reader in some numbers

11 0.55512023 1942 andrew gelman stats-2013-07-17-“Stop and frisk” statistics

12 0.55497062 179 andrew gelman stats-2010-08-03-An Olympic size swimming pool full of lithium water

13 0.55263835 2212 andrew gelman stats-2014-02-15-Mary, Mary, why ya buggin

14 0.55042052 1302 andrew gelman stats-2012-05-06-Fun with google autocomplete

15 0.54970145 1548 andrew gelman stats-2012-10-25-Health disparities are associated with low life expectancy

16 0.54705745 1587 andrew gelman stats-2012-11-21-Red state blue state, or, states and counties are not persons

17 0.5347954 98 andrew gelman stats-2010-06-19-Further thoughts on happiness and life satisfaction research

18 0.53235477 1958 andrew gelman stats-2013-07-27-Teaching is hard

19 0.53071505 1114 andrew gelman stats-2012-01-12-Controversy about average personality differences between men and women

20 0.5293023 2308 andrew gelman stats-2014-04-27-White stripes and dead armadillos


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.011), (15, 0.033), (16, 0.074), (21, 0.012), (22, 0.011), (24, 0.161), (31, 0.011), (35, 0.011), (45, 0.04), (53, 0.013), (79, 0.195), (83, 0.011), (84, 0.017), (86, 0.051), (99, 0.258)]
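The LDA listing is sparse: only 15 topic ids appear, whereas the LSI listing gives a weight for every one of its 50 topics. A plausible reading is that topics with weights below some cutoff were dropped, but the page does not document this, so treat it as an assumption. The sketch below uses the pairs exactly as listed to pick out the dominant topics and to total the listed weight; the helper names are hypothetical.

```python
# (topicId, topicWeight) pairs copied from the list above.
lda_weights = [
    (2, 0.011), (15, 0.033), (16, 0.074), (21, 0.012), (22, 0.011),
    (24, 0.161), (31, 0.011), (35, 0.011), (45, 0.04), (53, 0.013),
    (79, 0.195), (83, 0.011), (84, 0.017), (86, 0.051), (99, 0.258),
]

def dominant_topics(pairs, n=3):
    """The n (topicId, weight) pairs with the largest weights."""
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:n]

def listed_mass(pairs):
    """Total weight carried by the topics that appear in the listing."""
    return sum(weight for _, weight in pairs)

print(dominant_topics(lda_weights))
print(round(listed_mass(lda_weights), 3))
```

If the weights form a probability distribution over topics, whatever is missing from the listed total would be spread across the topics that are not shown.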

similar blogs list:

simIndex simValue blogId blogTitle

1 0.95756519 469 andrew gelman stats-2010-12-16-2500 people living in a park in Chicago?

Introduction: Frank Hansen writes: Columbus Park is on Chicago’s west side, in the Austin neighborhood. The park is a big green area which includes a golf course. Here is the google satellite view. Here is the nytimes page. Go to Chicago, and zoom over to the census tract 2521, which is just north of the horizontal gray line (Eisenhower Expressway, aka I290) and just east of Oak Park. The park is labeled on the nytimes map. The census data have around 50 dots (they say 50 people per dot) in the park which has no residential buildings. Congressional district is Danny Davis, IL7. Here’s a map of the district. So, how do we explain the map showing ~50 dots worth of people living in the park? What’s up with the algorithm to place the dots? I dunno. I leave this one to you, the readers.

2 0.95746315 1515 andrew gelman stats-2012-09-29-Jost Haidt

Introduction: Research psychologist John Jost reviews the recent book, “The Righteous Mind,” by research psychologist Jonathan Haidt. Some of my thoughts on Haidt’s book are here. And here’s some of Jost’s review: Haidt’s book is creative, interesting, and provocative. . . . The book shines a new light on moral psychology and presents a bold, confrontational message. From a scientific perspective, however, I worry that his theory raises more questions than it answers. Why do some individuals feel that it is morally good (or necessary) to obey authority, favor the ingroup, and maintain purity, whereas others are skeptical? (Perhaps parenting style is relevant after all.) Why do some people think that it is morally acceptable to judge or even mistreat others such as gay or lesbian couples or, only a generation ago, interracial couples because they dislike or feel disgusted by them, whereas others do not? Why does the present generation “care about violence toward many more classes of victims

3 0.95505321 845 andrew gelman stats-2011-08-08-How adoption speed affects the abandonment of cultural tastes

Introduction: Interesting article by Jonah Berger and Gael Le Mens: Products, styles, and social movements often catch on and become popular, but little is known about why such identity-relevant cultural tastes and practices die out. We demonstrate that the velocity of adoption may affect abandonment: Analysis of over 100 years of data on first-name adoption in both France and the United States illustrates that cultural tastes that have been adopted quickly die faster (i.e., are less likely to persist). Mirroring this aggregate pattern, at the individual level, expecting parents are more hesitant to adopt names that recently experienced sharper increases in adoption. Further analysis indicate that these effects are driven by concerns about symbolic value: Fads are perceived negatively, so people avoid identity-relevant items with sharply increasing popularity because they believe that they will be short lived. Ancillary analyses also indicate that, in contrast to conventional wisdom, identity-r

4 0.95326853 1379 andrew gelman stats-2012-06-14-Cool-ass signal processing using Gaussian processes (birthdays again)

Introduction: Aki writes: Here’s my version of the birthday frequency graph. I used Gaussian process with two slowly varying components and periodic component with decay, so that periodic form can change in time. I used Student’s t-distribution as observation model to allow exceptional dates to be outliers. I guess that periodic component due to week effect is still in the data because there is data only from twenty years. Naturally it would be better to model the whole timeseries, but it was easier to just use the csv by Mulligan. All I can say is . . . wow. Bayes wins again. Maybe Aki can supply the R or Matlab code? P.S. And let’s not forget how great the simple and clear time series plots are, compared to various fancy visualizations that people might try. P.P.S. More here.

5 0.9525516 1538 andrew gelman stats-2012-10-17-Rust

Introduction: I happened to be referring to the path sampling paper today and took a look at Appendix A.2: I’m sure I could reconstruct all of this if I had to, but I certainly can’t read this sort of thing cold anymore.

same-blog 6 0.92712587 1172 andrew gelman stats-2012-02-17-Rare name analysis and wealth convergence

7 0.92444581 1126 andrew gelman stats-2012-01-18-Bob on Stan

8 0.91512287 1825 andrew gelman stats-2013-04-25-It’s binless! A program for computing normalizing functions

9 0.91103387 863 andrew gelman stats-2011-08-21-Bad graph

10 0.90696287 1786 andrew gelman stats-2013-04-03-Hierarchical array priors for ANOVA decompositions

11 0.90459776 939 andrew gelman stats-2011-10-03-DBQQ rounding for labeling charts and communicating tolerances

12 0.89634115 1884 andrew gelman stats-2013-06-05-A story of fake-data checking being used to shoot down a flawed analysis at the Farm Credit Agency

13 0.89107764 399 andrew gelman stats-2010-11-07-Challenges of experimental design; also another rant on the practice of mentioning the publication of an article but not naming its author

14 0.87532043 1041 andrew gelman stats-2011-12-04-David MacKay and Occam’s Razor

15 0.87043226 1384 andrew gelman stats-2012-06-19-Slick time series decomposition of the birthdays data

16 0.86680806 2145 andrew gelman stats-2013-12-24-Estimating and summarizing inference for hierarchical variance parameters when the number of groups is small

17 0.86353141 1048 andrew gelman stats-2011-12-09-Maze generation algorithms!

18 0.85634279 888 andrew gelman stats-2011-09-03-A psychology researcher asks: Is Anova dead?

19 0.85055381 639 andrew gelman stats-2011-03-31-Bayes: radical, liberal, or conservative?

20 0.84889126 1089 andrew gelman stats-2011-12-28-Path sampling for models of varying dimension