
640 andrew gelman stats-2011-03-31-Why Edit Wikipedia?


meta info for this blog

Source: html

Introduction: Zoe Corbyn’s article for The Guardian (UK), titled Wikipedia wants more contributions from academics, and the followup discussion on Slashdot got me thinking about my own Wikipedia edits. The article quotes Dario Taraborelli, a research analyst for the Wikimedia Foundation, as saying “Academics are trapped in this paradox of using Wikipedia but not contributing,” Huh? I’m really wondering what man-in-the-street wrote all the great stats stuff out there. And what’s the paradox? I use lots of things without contributing to them. Taraborelli is further quoted as saying “The Wikimedia Foundation is looking at how it might capture expert conversation about Wikipedia content happening on other websites and feed it back to the community as a way of providing pointers for improvement.” This struck home. I recently went through the entry for latent Dirichlet allocation and found a bug in their derivation. I wrote up a revised derivation and posted it on my own blog.


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Zoe Corbyn’s article for The Guardian (UK), titled Wikipedia wants more contributions from academics, and the followup discussion on Slashdot got me thinking about my own Wikipedia edits. [sent-1, score-0.292]

2 The article quotes Dario Taraborelli, a research analyst for the Wikimedia Foundation, as saying “Academics are trapped in this paradox of using Wikipedia but not contributing,” Huh? [sent-2, score-0.36]

3 I use lots of things without contributing to them. [sent-5, score-0.153]

4 Taraborelli is further quoted as saying “The Wikimedia Foundation is looking at how it might capture expert conversation about Wikipedia content happening on other websites and feed it back to the community as a way of providing pointers for improvement.” [sent-6, score-0.405]

5 I recently went through the entry for latent Dirichlet allocation and found a bug in their derivation. [sent-8, score-0.175]

6 I wrote up a revised derivation and posted it on my own blog. [sent-9, score-0.176]

7 Second, as Corbyn’s article points out, I was afraid I’d put in lots of work and my changes would be backed out. [sent-12, score-0.148]

8 I wasn’t worried that Wikipedia would erase whole pages, but apparently it’s an issue for some these days. [sent-13, score-0.262]

9 A real issue is that most of the articles are pretty good, and while they’re not necessarily written the way I’d write them, they’re good enough that I don’t think it’s worth rewriting the whole thing (also, see point 2). [sent-14, score-0.373]

10 If you’re status conscious in a traditional way, you don’t blog either. [sent-15, score-0.095]

11 It’s not what “counts” when it comes time for tenure and promotion. [sent-16, score-0.15]

12 Well, encyclopedia articles and such never counted for much on your CV. [sent-18, score-0.479]

13 I did a few handbook type things and then started turning them down, mainly because I’m not a big fan of the handbook format. [sent-19, score-0.492]

14 I was told many times on tenure track that I shouldn’t be “wasting” so much time teaching. [sent-21, score-0.239]

15 I was even told by a dean at a major midwestern university that they barely even counted teaching. [sent-22, score-0.512]

16 So is it any surprise we don’t want to focus on teaching or writing encyclopedia articles? [sent-23, score-0.191]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('wikipedia', 0.414), ('corbyn', 0.241), ('taraborelli', 0.241), ('wikimedia', 0.241), ('encyclopedia', 0.191), ('handbook', 0.173), ('counted', 0.166), ('contributing', 0.153), ('tenure', 0.15), ('academics', 0.132), ('paradox', 0.13), ('foundation', 0.125), ('articles', 0.122), ('erase', 0.11), ('guardian', 0.11), ('slashdot', 0.11), ('zoe', 0.11), ('derivation', 0.103), ('midwestern', 0.103), ('allocation', 0.099), ('rewriting', 0.099), ('conscious', 0.095), ('websites', 0.095), ('pointers', 0.095), ('dirichlet', 0.09), ('told', 0.089), ('uk', 0.086), ('trapped', 0.086), ('followup', 0.085), ('feed', 0.081), ('whole', 0.08), ('backed', 0.079), ('barely', 0.079), ('editing', 0.078), ('analyst', 0.077), ('bug', 0.076), ('titled', 0.075), ('dean', 0.075), ('mainly', 0.074), ('revised', 0.073), ('wasting', 0.073), ('turning', 0.072), ('format', 0.072), ('issue', 0.072), ('attributed', 0.071), ('afraid', 0.069), ('struck', 0.069), ('counts', 0.068), ('capture', 0.067), ('saying', 0.067)]
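
The page does not say how sentScore is computed, but the numbers above appear consistent with summing the tfidf weights of the whitespace-separated tokens in a sentence that match a top word exactly (so capitalized words and words with attached punctuation contribute nothing). A minimal sketch under that assumption follows; the function name `sentence_score` and the reduced `WEIGHTS` table are illustrative, not part of the original pipeline.

```python
# Subset of the top-word weights listed above (copied from the tfidf list).
WEIGHTS = {
    "encyclopedia": 0.191,
    "articles": 0.122,
    "counted": 0.166,
    "handbook": 0.173,
    "mainly": 0.074,
    "turning": 0.072,
    "format": 0.072,
}


def sentence_score(sentence, weights):
    """Sum the weights of whitespace-separated tokens that exactly match a top word.

    Assumption: no lowercasing and no punctuation stripping, which is what the
    published scores seem to imply (e.g. "format." does not match "format").
    """
    return sum(weights.get(token, 0.0) for token in sentence.split())


s12 = "Well, encyclopedia articles and such never counted for much on your CV."
s13 = ("I did a few handbook type things and then started turning them down, "
       "mainly because I’m not a big fan of the handbook format.")

print(round(sentence_score(s12, WEIGHTS), 3))  # 0.479, matching sentence 12 above
print(round(sentence_score(s13, WEIGHTS), 3))  # 0.492, matching sentence 13 above
```

A scorer that lowercased tokens or stripped punctuation would give different totals from the ones listed, so treat the exact-match rule as an inference from this page rather than a documented behavior.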

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000002 640 andrew gelman stats-2011-03-31-Why Edit Wikipedia?


2 0.16973874 904 andrew gelman stats-2011-09-13-My wikipedia edit

Introduction: The other day someone mentioned my complaint about the Wikipedia article on “Bayesian inference” (see footnote 1 of this article) and he said I should fix the Wikipedia entry myself. And so I did. I didn’t have the energy to rewrite the whole article–in particular, all of its examples involve discrete parameters, whereas the Bayesian problems I work on generally have continuous parameters, and its “mathematical foundations” section focuses on “independent identically distributed observations x” rather than data y which can have different distributions. It’s just a wacky, unbalanced article. But I altered the first few paragraphs to get rid of the stuff about the posterior probability that a model is true. I much prefer the Scholarpedia article on Bayesian statistics by David Spiegelhalter and Kenneth Rice, but I couldn’t bring myself to simply delete the Wikipedia article and replace it with the Scholarpedia content. Just to be clear: I’m not at all trying to disparage

3 0.16690637 930 andrew gelman stats-2011-09-28-Wiley Wegman chutzpah update: Now you too can buy a selection of garbled Wikipedia articles, for a mere $1400-$2800 per year!

Introduction: Someone passed on to me a message from his university library announcing that the journal “Wiley Interdisciplinary Reviews: Computational Statistics” is no longer free. Librarians have to decide what to do, so I thought I’d offer the following consumer guide:

                                                Wiley Computational Statistics journal   Wikipedia
Frequency                                       6 issues per year                        Continuously updated
Includes articles from Wikipedia?               Yes                                      Yes
Cites the Wikipedia sources it uses?            No                                       Yes
Edited by recipient of ASA Founders Award?      Yes                                      No
Articles are subject to rigorous review?        No                                       Yes
Errors, when discovered, get fixed?             No                                       Yes
Number of vertices in n-dimensional hypercube?  2n                                       2^n
Easy access to Brady Bunch trivia?              No                                       Yes
Cost (North America)                            $1400-$2800                              $0
Cost (UK)                                       £986-£1972                               £0
Cost (Europe)                                   €1213-€2426                              €0

The choice seems pretty clear to me! It’s funny for the Wiley journal to start charging now

4 0.1488113 1026 andrew gelman stats-2011-11-25-Bayes wikipedia update

Introduction: I checked and somebody went in and screwed up my fixes to the wikipedia page on Bayesian inference. I give up.

5 0.10914806 2070 andrew gelman stats-2013-10-20-The institution of tenure

Introduction: Rohin Dhar writes: The Priceonomics blog is doing a feature where we ask a few economists what they think of the institution of tenure. If you’d be interested in participating, I’d love to get your response. As an economist, what do you think of tenure? Should it be abolished / kept / modified? My reply: Just to be clear, I’m assuming that when you say “tenure,” you’re talking about lifetime employment for college professors such as myself. I’m actually a political scientist, not an economist. So rather than giving my opinion, I’ll say what I think an economist might say. I think an economist could say one of two things: Economist as anthropologist would say: Tenure is decided by independent institutions acting freely. If they choose to offer tenure, they will have good reasons, and it is not part of an economist’s job to second-guess individual decisions. Economist as McKinsey consultant would say: Tenure can be evaluated based on a cost-benefit analysis. How

6 0.096928611 69 andrew gelman stats-2010-06-04-A Wikipedia whitewash

7 0.090452805 258 andrew gelman stats-2010-09-05-A review of a review of a review of a decade

8 0.090275355 2098 andrew gelman stats-2013-11-12-Plaig!

9 0.089452401 571 andrew gelman stats-2011-02-13-A departmental wiki page?

10 0.088545859 2319 andrew gelman stats-2014-05-05-Can we make better graphs of global temperature history?

11 0.085844204 1502 andrew gelman stats-2012-09-19-Scalability in education

12 0.083299659 2245 andrew gelman stats-2014-03-12-More on publishing in journals

13 0.08115343 120 andrew gelman stats-2010-06-30-You can’t put Pandora back in the box

14 0.079560705 865 andrew gelman stats-2011-08-22-Blogging is “destroying the business model for quality”?

15 0.078817062 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics

16 0.073860474 831 andrew gelman stats-2011-07-30-A Wikipedia riddle!

17 0.073645279 2235 andrew gelman stats-2014-03-06-How much time (if any) should we spend criticizing research that’s fraudulent, crappy, or just plain pointless?

18 0.073126301 1832 andrew gelman stats-2013-04-29-The blogroll

19 0.072845727 1338 andrew gelman stats-2012-05-23-Advice on writing research articles

20 0.072442599 2172 andrew gelman stats-2014-01-14-Advice on writing research articles
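
The page does not document how simValue is computed. A common choice for lists like this one is cosine similarity between per-post weight vectors, and the self-similarity of 1.0000002 above would be consistent with that plus floating-point rounding. The sketch below assumes cosine similarity over sparse (id, weight) pairs; `cosine_similarity` and the second post's weights are hypothetical and only illustrate the calculation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors given as (id, weight) pairs."""
    da, db = dict(a), dict(b)
    # Only ids present in both vectors contribute to the dot product.
    dot = sum(w * db.get(k, 0.0) for k, w in da.items())
    norm_a = math.sqrt(sum(w * w for w in da.values()))
    norm_b = math.sqrt(sum(w * w for w in db.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# First few entries of this post's tfidf vector, copied from the list above.
this_post = [("wikipedia", 0.414), ("corbyn", 0.241), ("tenure", 0.15)]
# Made-up weights for a second post, for illustration only.
other_post = [("wikipedia", 0.3), ("bayesian", 0.5)]

print(cosine_similarity(this_post, this_post))   # close to 1.0 for identical vectors
print(cosine_similarity(this_post, other_post))  # between 0 and 1 for partial overlap
```

The same function works for the (topicId, topicWeight) lists in the lsi and lda sections, although the same-blog values there are below 1, so those lists were evidently not produced by comparing a vector with an exact copy of itself.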


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.139), (1, -0.061), (2, -0.053), (3, 0.024), (4, -0.007), (5, -0.001), (6, 0.055), (7, -0.016), (8, 0.02), (9, -0.019), (10, 0.013), (11, -0.012), (12, -0.006), (13, -0.0), (14, 0.004), (15, 0.002), (16, -0.006), (17, 0.018), (18, 0.0), (19, 0.022), (20, 0.003), (21, 0.009), (22, 0.002), (23, 0.021), (24, 0.031), (25, -0.053), (26, -0.022), (27, 0.031), (28, -0.013), (29, 0.016), (30, 0.025), (31, 0.027), (32, -0.002), (33, -0.004), (34, 0.005), (35, -0.027), (36, 0.021), (37, -0.005), (38, -0.017), (39, 0.034), (40, -0.001), (41, -0.026), (42, -0.007), (43, 0.015), (44, 0.022), (45, 0.006), (46, -0.005), (47, -0.001), (48, -0.016), (49, -0.004)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95763016 640 andrew gelman stats-2011-03-31-Why Edit Wikipedia?


2 0.82282257 204 andrew gelman stats-2010-08-12-Sloppily-written slam on moderately celebrated writers is amusing nonetheless

Introduction: Via J. Robert Lennon, I discovered this amusing blog by Anis Shivani on “The 15 Most Overrated Contemporary American Writers.” Lennon found it so annoying that he refused to even link to it, but I actually enjoyed Shivani’s bit of performance art. The literary criticism I see is so focused on individual books that it’s refreshing to see someone take on an entire author’s career in a single paragraph. I agree with Lennon that Shivani’s blog doesn’t have much content–it’s full of terms such as “vacuity” and “pap,” compared to which “trendy” and “fashionable” are precision instruments–but Shivani covers a lot of ground and it’s fun to see this all in one place. My main complaint with Shivani, beyond his sloppy writing (but, hey, it’s just a blog; I’m sure he saves the good stuff for his paid gigs) is his implicit assumption that everyone should agree with him. I’m as big a Kazin fan as anyone, but I still think he completely undervalued Marquand. The other thing I noticed

3 0.81309062 865 andrew gelman stats-2011-08-22-Blogging is “destroying the business model for quality”?

Introduction: Journalist Jonathan Rauch writes that the internet is Sturgeon squared: This is the blogosphere. I’m not getting paid to be here. I’m here to get incredibly famous (in my case, even more incredibly famous) so that I can get paid somewhere else. . . . The average quality of newspapers and (published) novels is far, far better than the average quality of blog posts (and–ugh!–comments). This is because people pay for newspapers and novels. What distinguishes newspapers and novels is how much does not get published in them, because people won’t pay for it. Payment is a filter, and a pretty good one. Imperfect, of course. But pointing out the defects of the old model is merely changing the subject if the new model is worse. . . . Yes, the new model is bringing a lot of new content into being. But most of it is bad. And it’s displacing a lot of better content, by destroying the business model for quality. Even in the information economy, there’s no free lunch. . . . Yes, there’s g

4 0.80394703 1597 andrew gelman stats-2012-11-29-What is expected of a consultant

Introduction: Robin Hanson writes on paid expert consulting (of the sort that I do sometime, and is common among economists and statisticians). Hanson agrees with Keith Yost, who says: Fellow consultants and associates . . . [said] fifty percent of the job is nodding your head at whatever’s being said, thirty percent of it is just sort of looking good, and the other twenty percent is raising an objection but then if you meet resistance, then dropping it. On the other side is Steven Levitt, who Hanson quotes as saying: My own experience has been that even though I know nothing about an industry, if you give me a week, and you get a bunch of really smart people to explain the industry to me, and to tell me what they do, a lot of times what I’ve learned in economics, what I’ve learned in other places can actually be really helpful in changing the way that they see the world. Perhaps unsurprisingly given my Bayesian attitudes and my preference for continuity, I’m inclined to split the d

5 0.79784739 458 andrew gelman stats-2010-12-08-Blogging: Is it “fair use”?

Introduction: Dave Kane writes: I [Kane] am involved in a dispute relating to whether or not a blog can be considered part of one’s academic writing. Williams College restricts the use of undergraduate theses as follows: Non-commercial, academic use within the scope of “Fair Use” standards is acceptable. Otherwise, you may not copy or distribute any content without the permission of the copyright holder. Seems obvious enough. Yet some folks think that my use of thesis material in a blog post fails this test because it is not “academic.” See this post for the gory details. Parenthetically, your readers might be interested in the substantive discovery here, the details of the Williams admissions process (which is probably very similar to Columbia’s). Williams places students into academic rating (AR) categories as follows:

        verbal    math      composite   SAT II    ACT     AP
AR 1:   770-800   750-800   1520-1600   750-800   35-36   mostly 5s
AR 2:   730-770   720-750   1450-1520   720-770   33-34   4s an

6 0.79480553 2306 andrew gelman stats-2014-04-26-Sleazy sock puppet can’t stop spamming our discussion of compressed sensing and promoting the work of Xiteng Liu

7 0.78317231 532 andrew gelman stats-2011-01-23-My Wall Street Journal story

8 0.78021455 1600 andrew gelman stats-2012-12-01-$241,364.83 – $13,000 = $228,364.83

9 0.77984053 430 andrew gelman stats-2010-11-25-The von Neumann paradox

10 0.77909499 1007 andrew gelman stats-2011-11-13-At last, treated with the disrespect that I deserve

11 0.7759245 1442 andrew gelman stats-2012-08-03-Double standard? Plagiarizing journos get slammed, plagiarizing profs just shrug it off

12 0.77068478 2025 andrew gelman stats-2013-09-15-The it-gets-me-so-angry-I-can’t-deal-with-it threshold

13 0.76994318 1561 andrew gelman stats-2012-11-04-Someone is wrong on the internet

14 0.76928854 489 andrew gelman stats-2010-12-28-Brow inflation

15 0.7656064 1003 andrew gelman stats-2011-11-11-$

16 0.76316422 2080 andrew gelman stats-2013-10-28-Writing for free

17 0.76252824 2015 andrew gelman stats-2013-09-10-The ethics of lying, cheating, and stealing with data: A case study

18 0.76202303 868 andrew gelman stats-2011-08-24-Blogs vs. real journalism

19 0.76180828 2197 andrew gelman stats-2014-02-04-Peabody here.

20 0.76172107 1796 andrew gelman stats-2013-04-09-The guy behind me on line for the train . . .


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(9, 0.238), (16, 0.057), (21, 0.019), (24, 0.104), (27, 0.049), (29, 0.011), (42, 0.016), (43, 0.013), (47, 0.01), (53, 0.015), (61, 0.015), (82, 0.033), (84, 0.015), (86, 0.021), (96, 0.01), (99, 0.247)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97339761 577 andrew gelman stats-2011-02-16-Annals of really really stupid spam

Introduction: This came in the inbox today: Dear Dr. Gelman, GenWay recently found your article titled “Multiple imputation for model checking: completed-data plots with missing and latent data.” (Biometrics. 2005 Mar;61(1):74-85.) and thought you might be interested in learning about our superior quality signaling proteins. GenWay prides itself on being a leader in customer service aiming to exceed your expectations with the quality and price of our products. With more than 60,000 reagents backed by our outstanding guarantee you are sure to find the products you have been searching for. Please feel free to visit the following resource pages: * Apoptosis Pathway (product list) * Adipocytokine (product list) * Cell Cycle Pathway (product list) * Jak STAT (product list) * GnRH (product list) * MAPK (product list) * mTOR (product list) * T Cell Receptor (product list) * TGF-beta (product list) * Wnt (product list) * View All Pathways

2 0.96753418 993 andrew gelman stats-2011-11-05-The sort of thing that gives technocratic reasoning a bad name

Introduction: 1. Freakonomics characterizes drunk driving as an example of “the human tendency to worry about rare problems that are unlikely to happen.” 2. The CDC reports, “Alcohol-impaired drivers are involved in about 1 in 3 crash deaths, resulting in nearly 11,000 deaths in 2009.” No offense to the tenured faculty at the University of Chicago, but I’m going with the CDC on this one. P.S. The Freakonomics blog deserves to be dinged another time, not just for claiming, based on implausible assumptions and making the all-else-equal fallacy that “drunk walking is 8 times more likely to result in your death than drunk driving” but for presenting this weak inference as a fact rather than as a speculation. When doing “Freakonomics,” you can be counterintuitive, or you can be sensible, but it’s hard to be both. I mean, sure, sometimes you can be. But there’s a tradeoff, and in this case, they’re choosing to push the envelope on counterintuitiveness.

3 0.96397471 1356 andrew gelman stats-2012-05-31-Question 21 of my final exam for Design and Analysis of Sample Surveys

Introduction: 21. A country is divided into three regions with populations of 2 million, 2 million, and 0.5 million, respectively. A survey is done asking about foreign policy opinions. Somebody proposes taking a sample of 50 people from each region. Give a reason why this non-proportional sample would not usually be done, and also a reason why it might actually be a good idea. Solution to question 20 From yesterday: 20. Explain in two sentences why we expect survey respondents to be honest about vote preferences but possibly dishonest about reporting unhealthy behaviors. Solution: Respondents tend to be sincere about vote preferences because this affects the outcome of the poll, and people are motivated to have their candidate poll well. This motivation is typically not present in reporting behaviors; you have no particular reason for wanting to affect the average survey response.

4 0.9555831 1532 andrew gelman stats-2012-10-13-A real-life dollar auction game!

Introduction: Actually, $100,000 auction. I learned about it after seeing the following email which was broadcast to a couple of mailing lists: Dear all, I am now writing about something completely different! I need your help “voting” for our project, and sending this e-mail to others so that they can also vote for our project. As you will see from the video, the project would fund *** Project: I am a finalist for a $100,000 prize from Brigham and Women’s Hospital. My project is to understand how ***. Ultimately, we want to develop a ***. We expect that this ** can be used to *** Here are the instructions: 1. Go to the web page: http://brighamandwomens.org/research/BFF/default.aspx 2. scroll to the bottom and follow the link to “Vote” 3. select project #** 4. FORWARD THIS E-MAIL TO AS MANY PEOPLE AS YOU CAN. Best regards, ** I love that step 4 is in ALL CAPS, just to give it that genuine chain-letter aura. Isn’t this weird? First, that this foundation would give ou

5 0.95440555 529 andrew gelman stats-2011-01-21-“City Opens Inquiry on Grading Practices at a Top-Scoring Bronx School”

Introduction: Sharon Otterman reports: When report card grades were released in the fall for the city’s 455 high schools, the highest score went to a small school in a down-and-out section of the Bronx . . . A stunning 94 percent of its seniors graduated, more than 30 points above the citywide average. . . . “When I interviewed for the school,” said Sam Buchbinder, a history teacher, “it was made very clear: this is a school that doesn’t believe in anyone failing.” That statement was not just an exhortation to excellence. It was school policy. By order of the principal, codified in the school’s teacher handbook, all teachers should grade their classes in the same way: 30 percent of students should earn a grade in the A range, 40 percent B’s, 25 percent C’s, and no more than 5 percent D’s. As long as they show up, they should not fail. Hey, that sounds like Harvard and Columbia^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H various selective northeastern colleges I’ve known. Of course, we^H^H

6 0.9501189 1291 andrew gelman stats-2012-04-30-Systematic review of publication bias in studies on publication bias

7 0.94832742 1424 andrew gelman stats-2012-07-22-Extreme events as evidence for differences in distributions

8 0.93984139 1332 andrew gelman stats-2012-05-20-Problemen met het boek

9 0.92632139 1664 andrew gelman stats-2013-01-10-Recently in the sister blog: Brussels sprouts, ugly graphs, and switched at birth

10 0.92049873 1961 andrew gelman stats-2013-07-29-Postdocs in probabilistic modeling! With David Blei! And Stan!

11 0.91679537 389 andrew gelman stats-2010-11-01-Why it can be rational to vote

12 0.91679513 1565 andrew gelman stats-2012-11-06-Why it can be rational to vote

13 0.90719223 1142 andrew gelman stats-2012-01-29-Difficulties with the 1-4-power transformation

14 0.90487474 29 andrew gelman stats-2010-05-12-Probability of successive wins in baseball

15 0.90316916 560 andrew gelman stats-2011-02-06-Education and Poverty

16 0.90267718 1110 andrew gelman stats-2012-01-10-Jobs in statistics research! In New Jersey!

17 0.89605623 1226 andrew gelman stats-2012-03-22-Story time meets the all-else-equal fallacy and the fallacy of measurement

same-blog 18 0.89222789 640 andrew gelman stats-2011-03-31-Why Edit Wikipedia?

19 0.88642108 1566 andrew gelman stats-2012-11-07-A question about voting systems—unrelated to U.S. elections!

20 0.87735403 1715 andrew gelman stats-2013-02-09-Thomas Hobbes would be spinning in his grave