andrew_gelman_stats andrew_gelman_stats-2013 andrew_gelman_stats-2013-1976 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: A friend with a baby who was born a couple weeks late commented that he would’ve liked a website that gave an estimated due date that was something more accurate than the usual “last menstrual period + 40 weeks.” I did a quick google and found this and this. Based on their descriptions of what information they use, the first site looks like it might be good, and the second site looks iffy. But I don’t really know.
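For reference, the usual rule mentioned in the post is plain date arithmetic. Here is a minimal sketch of it in Python, assuming the input is the first day of the last menstrual period; the function names and the example dates are made up for illustration, and this is not a description of how either of the two websites computes its estimate.

```python
from datetime import date, timedelta

# The conventional rule quoted in the post: last menstrual period + 40 weeks.
GESTATION = timedelta(weeks=40)


def naive_due_date(lmp: date) -> date:
    """Textbook estimate: first day of the last menstrual period plus 40 weeks."""
    return lmp + GESTATION


def days_late(lmp: date, born: date) -> int:
    """Days between the naive estimate and the actual birth (negative if early)."""
    return (born - naive_due_date(lmp)).days


if __name__ == "__main__":
    lmp = date(2013, 1, 1)            # hypothetical date, for illustration only
    print(naive_due_date(lmp))        # 2013-10-08
    print(days_late(lmp, date(2013, 10, 22)))  # 14, i.e. two weeks late
```

A more accurate estimator would presumably condition on additional information, which is what the post is asking about; the sketch only shows the baseline it would have to beat.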
sentIndex sentText sentNum sentScore
1 A friend with a baby who was born a couple weeks late commented that he would’ve liked a website that gave an estimated due date that was something more accurate than the usual “last menstrual period + 40 weeks.” [sent-1, score-3.218]
2 I did a quick google and found this and this. [sent-2, score-0.397]
3 Based on their descriptions of what information they use, the first site looks like it might be good, and the second site looks iffy. [sent-3, score-1.873]
wordName wordTfidf (topN-words)
[('site', 0.365), ('menstrual', 0.276), ('descriptions', 0.262), ('looks', 0.256), ('commented', 0.232), ('born', 0.22), ('baby', 0.218), ('date', 0.215), ('liked', 0.194), ('late', 0.184), ('friend', 0.182), ('weeks', 0.18), ('accurate', 0.179), ('period', 0.177), ('website', 0.168), ('due', 0.164), ('google', 0.158), ('estimated', 0.154), ('gave', 0.146), ('quick', 0.143), ('usual', 0.141), ('couple', 0.12), ('second', 0.109), ('last', 0.1), ('found', 0.096), ('information', 0.092), ('based', 0.087), ('first', 0.068), ('something', 0.068), ('use', 0.068), ('might', 0.062), ('ve', 0.059), ('really', 0.059), ('good', 0.058), ('know', 0.056), ('would', 0.039), ('like', 0.038)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 1976 andrew gelman stats-2013-08-10-The birthday problem
Introduction: A friend with a baby who was born a couple weeks late commented that he would’ve liked a website that gave an estimated due date that was something more accurate than the usual “last menstrual period + 40 weeks.” I did a quick google and found this and this. Based on their descriptions of what information they use, the first site looks like it might be good, and the second site looks iffy. But I don’t really know.
2 0.1553265 1006 andrew gelman stats-2011-11-12-Val’s Number Scroll: Helping kids visualize math
Introduction: This looks cool.
3 0.13932371 919 andrew gelman stats-2011-09-21-Least surprising headline of the year
Introduction: “Poker Web Site Cheated Users, U.S. Suit Says.” Shocking. Who’d have thought the developers of an online poker site would cheat??
4 0.13903598 1061 andrew gelman stats-2011-12-16-CrossValidated: A place to post your statistics questions
Introduction: Seth Rogers writes: I [Rogers] am a member of an online community of statisticians where I burn a great deal of time (and a recovering cog sci researcher). Our community website is a peer-reviewed Q and A spanning stats topics ranging from applications to mathematical theory. Our online community consists of mostly university faculty, grad students and technical consultants. The answer quality is very strong and the web design is intuitive. I think you and your readers are like-minded and would be really interested in some of the topics on the site, CrossValidated (you may know the sister site: stackoverflow.com ). The philosophy is purely to further knowledge for the sake of knowledge and take pride in learning. I took a quick look and the site seemed like it could be useful to people. The only thing I didn’t understand is, why doesn’t it have a search function? (Or maybe it was there somewhere and I couldn’t find it.) P.S. to all the commenters who wrote replies such
5 0.12490521 2086 andrew gelman stats-2013-11-03-How best to compare effects measured in two different time periods?
Introduction: I received the following email from someone who wishes to remain anonymous: My colleague and I are trying to understand the best way to approach a problem involving measuring a group of individuals’ abilities across time, and are hoping you can offer some guidance. We are trying to analyze the combined effect of two distinct groups of people (A and B, with no overlap between A and B) who collaborate to produce a binary outcome, using a mixed logistic regression along the lines of the following. Outcome ~ (1 | A) + (1 | B) + Other variables What we’re interested in testing was whether the observed A random effects in period 1 are predictive of the A random effects in the following period 2. Our idea being create two models, each using a different period’s worth of data, to create two sets of A coefficients, then observe the relationship between the two. If the A’s have a persistent ability across periods, the coefficients should be correlated or show a linear-ish relationshi
6 0.12263158 1871 andrew gelman stats-2013-05-27-Annals of spam
7 0.12049949 894 andrew gelman stats-2011-09-07-Hipmunk FAIL: Graphics without content is not enough
8 0.11645364 1782 andrew gelman stats-2013-03-30-“Statistical Modeling: A Fresh Approach”
9 0.10999922 2304 andrew gelman stats-2014-04-24-An open site for researchers to post and share papers
10 0.10517196 1249 andrew gelman stats-2012-04-06-Thinking seriously about social science research
11 0.1046581 995 andrew gelman stats-2011-11-06-Statistical models and actual models
12 0.10437962 207 andrew gelman stats-2010-08-14-Pourquoi Google search est devenu plus raisonnable?
13 0.10330817 1513 andrew gelman stats-2012-09-27-Estimating seasonality with a data set that’s just 52 weeks long
14 0.099198572 2333 andrew gelman stats-2014-05-13-Personally, I’d rather go with Teragram
15 0.099031195 1012 andrew gelman stats-2011-11-16-Blog bribes!
16 0.095561013 2071 andrew gelman stats-2013-10-21-Most Popular Girl Names by State over Time
17 0.094689354 911 andrew gelman stats-2011-09-15-More data tools worth using from Google
18 0.093495272 199 andrew gelman stats-2010-08-11-Note to semi-spammers
19 0.092269301 153 andrew gelman stats-2010-07-17-Tenure-track position at U. North Carolina in survey methods and social statistics
20 0.090635322 1698 andrew gelman stats-2013-01-30-The spam just gets weirder and weirder
topicId topicWeight
[(0, 0.105), (1, -0.032), (2, -0.018), (3, 0.03), (4, 0.058), (5, -0.006), (6, 0.035), (7, -0.023), (8, -0.008), (9, -0.041), (10, 0.01), (11, -0.029), (12, 0.055), (13, 0.001), (14, -0.017), (15, 0.065), (16, 0.034), (17, -0.023), (18, 0.011), (19, 0.038), (20, -0.02), (21, 0.011), (22, 0.023), (23, -0.051), (24, 0.037), (25, -0.021), (26, -0.022), (27, -0.013), (28, 0.002), (29, 0.009), (30, 0.034), (31, -0.051), (32, 0.069), (33, -0.054), (34, -0.116), (35, 0.005), (36, -0.004), (37, -0.013), (38, -0.027), (39, -0.03), (40, -0.004), (41, -0.026), (42, 0.058), (43, -0.004), (44, -0.021), (45, 0.039), (46, 0.012), (47, -0.043), (48, 0.012), (49, -0.02)]
simIndex simValue blogId blogTitle
same-blog 1 0.95747101 1976 andrew gelman stats-2013-08-10-The birthday problem
Introduction: A friend with a baby who was born a couple weeks late commented that he would’ve liked a website that gave an estimated due date that was something more accurate than the usual “last menstrual period + 40 weeks.” I did a quick google and found this and this. Based on their descriptions of what information they use, the first site looks like it might be good, and the second site looks iffy. But I don’t really know.
2 0.79573327 919 andrew gelman stats-2011-09-21-Least surprising headline of the year
Introduction: “Poker Web Site Cheated Users, U.S. Suit Says.” Shocking. Who’d have thought the developers of an online poker site would cheat??
3 0.75021541 1698 andrew gelman stats-2013-01-30-The spam just gets weirder and weirder
Introduction: In the inbox today, under the header, “Hidden Costs behind Milk & Dairy Consumption (video)”: Hey Professor Gelman, Our site’s production team recently released a short video uncovering the local and global impact that milk has on our lives. After spending some time on your posts, I noticed you talked about dairy products and milk so I thought I’d email you. Are you the correct person to contact in regards to the content on the site? If so, let me know if you’re interested in checking out the video. Thanks, Emily S. Hmmm . . . I guess I do talk a lot about dairy products and milk on this site!
4 0.72445959 894 andrew gelman stats-2011-09-07-Hipmunk FAIL: Graphics without content is not enough
Introduction: I love a good GUI but not if it doesn’t give me the information I need. I again tried Hipmunk and it again failed (this time for a trip to Baltimore where it gave only a useless subset of the available Amtrak trains). I don’t know anything about the internet biz. What I’m guessing is that they set up this cool website that is pretty much functional, with the goal of selling it for a few million dollars to Travelocity or Expedia or Kayak. What I’m wondering is, why haven’t they made the deal already? Hipmunk’s GUI is great. The site is useless because it’s missing so many flights, but if you put it in an actual travel site such as Expedia, it would be great. It’s enough to make me want to hit someone with an i-phone . . .
5 0.69799834 1061 andrew gelman stats-2011-12-16-CrossValidated: A place to post your statistics questions
Introduction: Seth Rogers writes: I [Rogers] am a member of an online community of statisticians where I burn a great deal of time (and a recovering cog sci researcher). Our community website is a peer-reviewed Q and A spanning stats topics ranging from applications to mathematical theory. Our online community consists of mostly university faculty, grad students and technical consultants. The answer quality is very strong and the web design is intuitive. I think you and your readers are like-minded and would be really interested in some of the topics on the site, CrossValidated (you may know the sister site: stackoverflow.com ). The philosophy is purely to further knowledge for the sake of knowledge and take pride in learning. I took a quick look and the site seemed like it could be useful to people. The only thing I didn’t understand is, why doesn’t it have a search function? (Or maybe it was there somewhere and I couldn’t find it.) P.S. to all the commenters who wrote replies such
6 0.65278351 1193 andrew gelman stats-2012-03-03-“Do you guys pay your bills?”
7 0.62651551 1211 andrew gelman stats-2012-03-13-A personal bit of spam, just for me!
9 0.60760725 497 andrew gelman stats-2011-01-02-Hipmunk update
10 0.60343665 2304 andrew gelman stats-2014-04-24-An open site for researchers to post and share papers
12 0.60065347 1871 andrew gelman stats-2013-05-27-Annals of spam
13 0.60004473 1012 andrew gelman stats-2011-11-16-Blog bribes!
14 0.59592181 530 andrew gelman stats-2011-01-22-MS-Bayes?
15 0.58425415 748 andrew gelman stats-2011-06-06-Why your Klout score is meaningless
16 0.57939726 1421 andrew gelman stats-2012-07-19-Alexa, Maricel, and Marty: Three cellular automata who got on my nerves
17 0.57900625 2238 andrew gelman stats-2014-03-09-Hipmunk worked
18 0.57787484 1810 andrew gelman stats-2013-04-17-Subway series
19 0.57305396 135 andrew gelman stats-2010-07-09-Rasmussen sez: “108% of Respondents Say . . .”
20 0.57015681 129 andrew gelman stats-2010-07-05-Unrelated to all else
topicId topicWeight
[(7, 0.112), (9, 0.042), (16, 0.071), (21, 0.056), (23, 0.035), (24, 0.186), (48, 0.03), (77, 0.064), (82, 0.045), (99, 0.217)]
simIndex simValue blogId blogTitle
same-blog 1 0.95115423 1976 andrew gelman stats-2013-08-10-The birthday problem
Introduction: A friend with a baby who was born a couple weeks late commented that he would’ve liked a website that gave an estimated due date that was something more accurate than the usual “last menstrual period + 40 weeks.” I did a quick google and found this and this. Based on their descriptions of what information they use, the first site looks like it might be good, and the second site looks iffy. But I don’t really know.
2 0.92712182 1607 andrew gelman stats-2012-12-05-The p-value is not . . .
Introduction: From a recent email exchange: I agree that you should never compare p-values directly. The p-value is a strange nonlinear transformation of data that is only interpretable under the null hypothesis. Once you abandon the null (as we do when we observe something with a very low p-value), the p-value itself becomes irrelevant. To put it another way, the p-value is a measure of evidence, it is not an estimate of effect size (as it is often treated, with the idea that a p=.001 effect is larger than a p=.01 effect, etc). Even conditional on sample size, the p-value is not a measure of effect size.
Introduction: Aleks points me to this article showing some pretty maps by Eric Fisher showing where people of different ethnicity live within several metro areas within the U.S. The idea is simple but effective; in the words of Cliff Kuang: Fisher used a straight forward method borrowed from Rankin: Using U.S. Census data from 2000, he created a map where one dot equals 25 people. The dots are then color-coded based on race: White is pink; Black is blue; Hispanic is orange, and Asian is green. The results for various cities are fascinating: Just like every city is different, every city is integrated (or segregated) in different ways. New York is shown below. No, San Francisco is not “very, very white” But I worry that these maps are difficult for non-experts to read. For example, Kuang writes the following: San Francisco proper is very, very white. This is an understandable mistake coming from someone who, I assume, has never lived in the Bay Area. But what’s amazing i
4 0.91117239 975 andrew gelman stats-2011-10-27-Caffeine keeps your Mac awake
Introduction: Sometimes my computer goes blank when I’m giving a presentation and I haven’t clicked on anything for awhile. I mentioned this to Malecki and he installed Caffeine on my computer; problem solved.
5 0.8928827 401 andrew gelman stats-2010-11-08-Silly old chi-square!
Introduction: Brian Mulford writes: I [Mulford] ran across this blog post and found myself questioning the relevance of the test used. I’d think Chi-Square would be inappropriate for trying to measure significance of choice in the manner presented here; irrespective of the cute hamster. Since this is a common test for marketers and website developers – I’d be interested in which techniques you might suggest? For tests of this nature, I typically measure a variety of variables (image placement, size, type, page speed, “page feel” as expressed in a factor, etc) and use LOGIT, Cluster and possibly a simple Bayesian model to determine which variables were most significant (chosen). Pearson Chi-squared may be used to express relationships between variables and outcome but I’ve typically not used it to simply judge a 0/1 choice as statistically significant or not. My reply: I like the decision-theoretic way that the blogger (Jason Cohen, according to the webpage) starts: If you wait too
6 0.89170611 1525 andrew gelman stats-2012-10-08-Ethical standards in different data communities
7 0.89161992 2029 andrew gelman stats-2013-09-18-Understanding posterior p-values
8 0.8905741 1757 andrew gelman stats-2013-03-11-My problem with the Lindley paradox
9 0.89010084 1584 andrew gelman stats-2012-11-19-Tradeoffs in information graphics
10 0.88902211 1080 andrew gelman stats-2011-12-24-Latest in blog advertising
11 0.88723409 1792 andrew gelman stats-2013-04-07-X on JLP
12 0.88529128 1604 andrew gelman stats-2012-12-04-An epithet I can live with
13 0.88437569 721 andrew gelman stats-2011-05-20-Non-statistical thinking in the US foreign policy establishment
14 0.88261425 574 andrew gelman stats-2011-02-14-“The best data visualizations should stand on their own”? I don’t think so.
15 0.88203442 488 andrew gelman stats-2010-12-27-Graph of the year
16 0.88171971 502 andrew gelman stats-2011-01-04-Cash in, cash out graph
17 0.88124549 562 andrew gelman stats-2011-02-06-Statistician cracks Toronto lottery
18 0.88018048 1176 andrew gelman stats-2012-02-19-Standardized writing styles and standardized graphing styles
19 0.88011205 1438 andrew gelman stats-2012-07-31-What is a Bayesian?
20 0.87924659 1219 andrew gelman stats-2012-03-18-Tips on “great design” from . . . Microsoft!