andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1250 knowledge-graph by maker-knowledge-mining

1250 andrew gelman stats-2012-04-07-Hangman tips


meta infos for this blog

Source: html

Introduction: Jeff pointed me to this article by Nick Berry. It’s kind of fun but of course if you know your opponent will be following this strategy you can figure out how to outwit it. Also, Berry writes that ETAOIN SHRDLU CMFWYP VBGKQJ XZ is the “ordering of letter frequency in English language.” Indeed this is the conventional ordering but nobody thinks it’s right anymore. See here (with further discussion here). I wonder what corpus he’s using. P.S. Klutz was my personal standby.
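
The question of which corpus was used matters because a letter-frequency ordering is just a ranking of counts, and different texts give different rankings. Here is a minimal sketch of that computation in Python. The function name `letter_order` and the tie-breaking rule (alphabetical) are assumptions made for illustration; nothing here is taken from Berry's article.

```python
from collections import Counter
import string


def letter_order(text: str) -> str:
    """Return the letters A-Z that occur in `text`, most frequent first."""
    # Count only the 26 ASCII letters, ignoring case, digits and punctuation.
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    # Sort by descending count; break ties alphabetically so the result
    # is deterministic (an arbitrary choice for this sketch).
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return "".join(ranked).upper()


print(letter_order("Hello, World!"))  # LODEHRW
```

Running the same function over a dictionary word list, a novel, and a newspaper archive would likely produce three different orderings, which is one reason to treat any single ordering with caution.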


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 It’s kind of fun but of course if you know your opponent will be following this strategy you can figure out how to outwit it. [sent-2, score-0.988]

2 Also, Berry writes that ETAOIN SHRDLU CMFWYP VBGKQJ XZ is the “ordering of letter frequency in English language.” [sent-3, score-0.416]

3 Indeed this is the conventional ordering but nobody thinks it’s right anymore. [sent-4, score-0.996]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('ordering', 0.453), ('etaoin', 0.312), ('shrdlu', 0.312), ('corpus', 0.264), ('berry', 0.246), ('opponent', 0.241), ('nick', 0.215), ('frequency', 0.196), ('english', 0.176), ('letter', 0.173), ('thinks', 0.168), ('conventional', 0.165), ('jeff', 0.161), ('strategy', 0.154), ('nobody', 0.139), ('fun', 0.135), ('personal', 0.133), ('pointed', 0.128), ('figure', 0.12), ('kind', 0.117), ('wonder', 0.116), ('indeed', 0.101), ('course', 0.089), ('following', 0.082), ('discussion', 0.076), ('right', 0.071), ('article', 0.061), ('know', 0.05), ('writes', 0.047), ('see', 0.041), ('also', 0.041)]
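
This page does not say how the weights above were computed. As a rough illustration only, the sketch below uses one common tf-idf variant: raw term count times a smoothed log inverse document frequency, with the resulting vector scaled to unit length. The function name `tfidf_weights`, the whitespace tokenizer, and the toy documents are all assumptions.

```python
import math
from collections import Counter


def tfidf_weights(docs, index):
    """Return (word, weight) pairs for docs[index], highest weight first."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    term_freq = Counter(tokenized[index])
    # Smoothed idf (assumed variant): log((1 + N) / (1 + df)) + 1.
    raw = {
        word: count * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1)
        for word, count in term_freq.items()
    }
    norm = math.sqrt(sum(value * value for value in raw.values()))
    if norm == 0:
        return []
    # Scale to unit length, then sort by weight (ties alphabetical).
    pairs = [(word, round(value / norm, 3)) for word, value in raw.items()]
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


toy_docs = ["ordering ordering corpus", "corpus"]
print(tfidf_weights(toy_docs, 0))
```

In the toy example, "ordering" outranks "corpus" because it occurs twice in the first document and in no other document, which is the same intuition behind the ranking in the list above.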

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 1250 andrew gelman stats-2012-04-07-Hangman tips

2 0.12042627 2211 andrew gelman stats-2014-02-14-The popularity of certain baby names is falling off the clifffffffffffff

Introduction: Ubs writes: I was looking at baby name data last night and I stumbled upon something curious. I follow the baby names blog occasionally but not regularly, so I’m not sure if it’s been noticed before. Let me present it like this: Take the statement… Of the top 100 boys and top 100 girls names, only ___% contain the letter __. I’m using the SSA baby names page, so that’s U.S. births, and I’m looking at the decade of 2000-2009 (so kids currently aged 4 to 13). Which letters would you expect to have the lowest rate of occurrence? As expected, the lowest score is for Q, which appears zero times. (Jacqueline ranks #104 for girls.) It’s the second lowest that surprised me. (… You can pause and try to guess now. Spoilers to follow.) Of the other big-point Scrabble letters, Z appears in four names (Elizabeth, Zachary, Mackenzie, Zoe) and X in six, of which five are closely related (Alexis, Alexander, Alexandra, Alexa, Alex, Xavier). J is heavily overrepresented, especial

3 0.10101911 429 andrew gelman stats-2010-11-24-“But you and I don’t learn in isolation either”

Introduction: Indeed.

4 0.10101911 887 andrew gelman stats-2011-09-02-“It’s like marveling over a plastic flower when there’s a huge garden blooming outside”

Introduction: Indeed.

5 0.092461534 61 andrew gelman stats-2010-05-31-A data visualization manifesto

Introduction: Details matter (at least, they do for me), but we don’t yet have a systematic way of going back and forth between the structure of a graph, its details, and the underlying questions that motivate our visualizations. (Cleveland, Wilkinson, and others have written a bit on how to formalize these connections, and I’ve thought about it too, but we have a ways to go.) I was thinking about this difficulty after reading an article on graphics by some computer scientists that was well-written but to me lacked a feeling for the linkages between substantive/statistical goals and graphical details. I have problems with these issues too, and my point here is not to criticize but to move the discussion forward. When thinking about visualization, how important are the details? Aleks pointed me to this article by Jeffrey Heer, Michael Bostock, and Vadim Ogievetsky, “A Tour through the Visualization Zoo: A survey of powerful visualization techniques, from the obvious to the obscure.” Th

6 0.081692278 2119 andrew gelman stats-2013-12-01-Separated by a common blah blah blah

7 0.080677986 2177 andrew gelman stats-2014-01-19-“The British amateur who debunked the mathematics of happiness”

8 0.076939374 1263 andrew gelman stats-2012-04-13-Question of the week: Will the authors of a controversial new study apologize to busy statistician Don Berry for wasting his time reading and responding to their flawed article?

9 0.076277532 2356 andrew gelman stats-2014-06-02-On deck this week

10 0.072213233 87 andrew gelman stats-2010-06-15-Statistical analysis and visualization of the drug war in Mexico

11 0.071241923 227 andrew gelman stats-2010-08-23-Visualization magazine

12 0.061859142 2353 andrew gelman stats-2014-05-30-I posted this as a comment on a sociology blog

13 0.058948409 1318 andrew gelman stats-2012-05-13-Stolen jokes

14 0.057451472 688 andrew gelman stats-2011-04-30-Why it’s so relaxing to think about social issues

15 0.057184421 670 andrew gelman stats-2011-04-20-Attractive but hard-to-read graph could be made much much better

16 0.057070848 1610 andrew gelman stats-2012-12-06-Yes, checking calibration of probability forecasts is part of Bayesian statistics

17 0.056990139 538 andrew gelman stats-2011-01-25-Postdoc Position #2: Hierarchical Modeling and Statistical Graphics

18 0.056616247 1473 andrew gelman stats-2012-08-28-Turing chess run update

19 0.055962451 280 andrew gelman stats-2010-09-16-Meet Hipmunk, a really cool flight-finder that doesn’t actually work

20 0.055900272 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly
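
The page does not state which similarity measure produces simValue. A value of 1.0 for the post compared with itself would be consistent with cosine similarity between tf-idf vectors, so the sketch below assumes that measure. The function name `cosine` and the sparse dictionary representation are illustrative choices, and the second example vector is made up.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors given as {term: weight}."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


hangman = {"ordering": 0.453, "etaoin": 0.312, "letter": 0.173}
other = {"letter": 0.4, "names": 0.6}  # hypothetical weights
print(cosine(hangman, hangman))
print(cosine(hangman, other))
```

Under this assumption, two posts score above zero only if they share at least one weighted term, which would explain why short posts with a single common word (such as "Indeed") can appear high in the list.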


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.064), (1, -0.027), (2, -0.021), (3, 0.01), (4, 0.006), (5, -0.019), (6, 0.015), (7, 0.001), (8, 0.009), (9, -0.004), (10, 0.014), (11, 0.001), (12, 0.03), (13, 0.024), (14, 0.007), (15, -0.001), (16, 0.016), (17, 0.008), (18, -0.029), (19, -0.034), (20, 0.001), (21, 0.024), (22, -0.003), (23, -0.008), (24, -0.02), (25, 0.017), (26, -0.018), (27, -0.011), (28, 0.004), (29, -0.013), (30, 0.025), (31, -0.017), (32, -0.019), (33, -0.029), (34, -0.001), (35, -0.006), (36, -0.017), (37, -0.028), (38, 0.011), (39, 0.012), (40, 0.009), (41, 0.006), (42, 0.02), (43, -0.054), (44, -0.016), (45, -0.013), (46, -0.01), (47, 0.007), (48, 0.001), (49, -0.012)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95303524 1250 andrew gelman stats-2012-04-07-Hangman tips

2 0.63697809 841 andrew gelman stats-2011-08-06-Twitteo killed the bloggio star . . . Not!

Introduction: Alex Braunstein writes: Thanks for the post. You drove >800 pageviews to my site. That’s >90% of what Robert Scoble’s tweet generated with 184k followers, which I find incredibly impressive. 800 doesn’t sound like so much to me, but I suppose if it’s the right 800 . . .

3 0.60939455 2068 andrew gelman stats-2013-10-18-G+ hangout for Bayesian Data Analysis course now! (actually, in 5 minutes)

Introduction: Here’s the link. When you’re on the hangout, please mute your own microphone! I’ll have the computer point at the blackboard. You can follow along with the slides: for the first hour for the second hour P.S. Apparently there is some limit on number of hangout participants (see comments). I didn’t know about that! Maybe next time will try “on air” hangout, I will have to learn more about this. Next week the teaching asst will do the course so no hangout, then in two weeks there is no class because it’s the day after Halloween and that’s a holiday around here. So we’ll resume this on Fri 8 Nov. See you then! P.P.S. Those of you who were able to join the hangout: Could you please let me know how the visual and sound quality were? Thanks.

4 0.60556775 1798 andrew gelman stats-2013-04-11-Continuing conflict over conflict statistics

Introduction: Mike Spagat sends along a serious presentation with an ironic title: 18.7 MILLION ANNIHILATED SAYS LEADING EXPERT IN PEER–REVIEWED JOURNAL: AN APPROVED, AUTHORITATIVE, SCIENTIFIC PRESENTATION MADE BY AN EXPERT He’ll be speaking on it at tomorrow’s meeting of the Catastrophes and Conflict Forum of the Royal Society of Medicine in London. All I can say is, it’s a long time since I’ve seen a slide presentation in portrait form. It brings me back to the days of transparency sheets.

5 0.59349108 1660 andrew gelman stats-2013-01-08-Bayesian, Permutable Symmetries

Introduction: Mike Betancourt sends along this paper. Could be interesting, no? Note the heavy tail on the CDF in Figure 3, exhibiting weakened median time since 1999. And, as you can see from the bibliography, the work draws on a variety of sources:

6 0.58815408 2237 andrew gelman stats-2014-03-08-Disagreeing to disagree

7 0.57447916 260 andrew gelman stats-2010-09-07-QB2

8 0.57073921 1676 andrew gelman stats-2013-01-16-Detecting cheating in chess

9 0.56756997 2082 andrew gelman stats-2013-10-30-Berri Gladwell Loken football update

10 0.56521666 1982 andrew gelman stats-2013-08-15-Blaming scientific fraud on the Kuhnians

11 0.56408358 263 andrew gelman stats-2010-09-08-The China Study: fact or fallacy?

12 0.56386989 915 andrew gelman stats-2011-09-17-(Worst) graph of the year

13 0.56166071 1290 andrew gelman stats-2012-04-30-I suppose it’s too late to add Turing’s run-around-the-house-chess to the 2012 London Olympics?

14 0.56166059 1573 andrew gelman stats-2012-11-11-Incredibly strange spam

15 0.55968314 514 andrew gelman stats-2011-01-13-News coverage of statistical issues…how did I do?

16 0.5573414 1503 andrew gelman stats-2012-09-19-“Poor Smokers in New York State Spend 25% of Income on Cigarettes, Study Finds”

17 0.5572226 2203 andrew gelman stats-2014-02-08-“Guys who do more housework get less sex”

18 0.5564006 2112 andrew gelman stats-2013-11-25-An interesting but flawed attempt to apply general forecasting principles to contextualize attitudes toward risks of global warming

19 0.55602914 685 andrew gelman stats-2011-04-29-Data mining and allergies

20 0.5541811 1283 andrew gelman stats-2012-04-26-Let’s play “Guess the smoother”!


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.437), (16, 0.055), (24, 0.117), (60, 0.032), (63, 0.025), (99, 0.171)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.93772864 224 andrew gelman stats-2010-08-22-Mister P gets married

Introduction: Jeff, Justin, and I write: Gay marriage is not going away as a highly emotional, contested issue. Proposition 8, the California ballot measure that bans same-sex marriage, has seen to that, as it winds its way through the federal courts. But perhaps the public has reached a turning point. And check out the (mildly) dynamic graphics. The picture below is ok but for the full effect you have to click through and play the movie.

2 0.89251649 228 andrew gelman stats-2010-08-24-A new efficient lossless compression algorithm

Introduction: Frank Wood and Nick Bartlett write: Deplump works the same as all probabilistic lossless compressors. A datastream is fed one observation at a time into a predictor which emits both the data stream and predictions about what the next observation in the stream should be for every observation. An encoder takes this output and produces a compressed stream which can be piped over a network or to a file. A receiver then takes this stream and decompresses it by doing everything in reverse. In order to ensure that the decoder has the same information available to it that the encoder had when compressing the stream, the decoded datastream is both emitted and directed to another predictor. This second predictor’s job is to produce exactly the same predictions as the initial predictor so that the decoder has the same information at every step of the process as the encoder did. The difference between probabilistic lossless compressors is in the prediction engine, encoding and decoding bein

3 0.88222206 422 andrew gelman stats-2010-11-20-A Gapminder-like data visualization package

Introduction: Ossama Hamed writes in with a new dynamic graphing software: I have the pleasure to brief you on our Data Visualization software “Trend Compass”. TC is a new concept in viewing statistics and trends in an animated way by displaying in one chart 5 axis (X, Y, Time, Bubble size & Bubble color) instead of just the traditional X and Y axis. . . .

same-blog 4 0.84926939 1250 andrew gelman stats-2012-04-07-Hangman tips

5 0.84426326 665 andrew gelman stats-2011-04-17-Yes, your wish shall be granted (in 25 years)

Introduction: This one was so beautiful I just had to repost it: From the New York Times, 9 Sept 1981: IF I COULD CHANGE PARK SLOPE If I could change Park Slope I would turn it into a palace with queens and kings and princesses to dance the night away at the ball. The trees would look like garden stalks. The lights would look like silver pearls and the dresses would look like soft silver silk. You should see the ball. It looks so luxurious to me. The Park Slope ball is great. Can you guess what street it’s on? “Yes. My street. That’s Carroll Street.” – Jennifer Chatmon, second grade, P.S. 321 This was a few years before my sister told me that she felt safer having a crack house down the block because the cops were surveilling it all the time.

6 0.80317891 2005 andrew gelman stats-2013-09-02-“Il y a beaucoup de candidats démocrates, et leurs idéologies ne sont pas très différentes. Et la participation est imprévisible.”

7 0.78980017 87 andrew gelman stats-2010-06-15-Statistical analysis and visualization of the drug war in Mexico

8 0.78152806 513 andrew gelman stats-2011-01-12-“Tied for Warmest Year On Record”

9 0.76392257 164 andrew gelman stats-2010-07-26-A very short story

10 0.73233098 1606 andrew gelman stats-2012-12-05-The Grinch Comes Back

11 0.71008265 1512 andrew gelman stats-2012-09-27-A Non-random Walk Down Campaign Street

12 0.70760238 1286 andrew gelman stats-2012-04-28-Agreement Groups in US Senate and Dynamic Clustering

13 0.70483983 1103 andrew gelman stats-2012-01-06-Unconvincing defense of the recent Russian elections, and a problem when an official organ of an academic society has low standards for publication

14 0.68734187 1841 andrew gelman stats-2013-05-04-The Folk Theorem of Statistical Computing

15 0.66889435 2194 andrew gelman stats-2014-02-01-Recently in the sister blog

16 0.65402901 123 andrew gelman stats-2010-07-01-Truth in headlines

17 0.65298307 364 andrew gelman stats-2010-10-22-Politics is not a random walk: Momentum and mean reversion in polling

18 0.64646697 1052 andrew gelman stats-2011-12-11-Rational Turbulence

19 0.61668986 764 andrew gelman stats-2011-06-14-Examining US Legislative process with “Many Bills”

20 0.613846 951 andrew gelman stats-2011-10-11-Data mining efforts for Obama’s campaign