
945 andrew gelman stats-2011-10-06-W’man < W’pedia, again


meta info for this blog

Source: html

Introduction: Blogger Deep Climate looks at another paper by the 2002 recipient of the American Statistical Association’s Founders award. This time it’s not funny, it’s just sad. Here’s Wikipedia on simulated annealing: By analogy with this physical process, each step of the SA algorithm replaces the current solution by a random “nearby” solution, chosen with a probability that depends on the difference between the corresponding function values and on a global parameter T (called the temperature), that is gradually decreased during the process. The dependency is such that the current solution changes almost randomly when T is large, but increasingly “downhill” as T goes to zero. The allowance for “uphill” moves saves the method from becoming stuck at local minima—which are the bane of greedier methods. And here’s Wegman: During each step of the algorithm, the variable that will eventually represent the minimum is replaced by a random solution that is chosen according to a temperature


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 The dependency is such that the current solution changes almost randomly when T is large, but increasingly “downhill” as T goes to zero. [sent-4, score-0.117]

2 The allowance for “uphill” moves saves the method from becoming stuck at local minima—which are the bane of greedier methods. [sent-5, score-0.307]

3 And here’s Wegman: During each step of the algorithm, the variable that will eventually represent the minimum is replaced by a random solution that is chosen according to a temperature parameter, T. [sent-6, score-0.726]

4 As the temperature of the system decreases, the probability of higher temperature values replacing the minimum decreases, but it is always non-zero. [sent-7, score-0.969]

5 The decrease in probability ensures a gradual decrease in the value of the minimum. [sent-8, score-0.431]

6 However, the non-zero stipulation allows for a higher value to replace the minimum. [sent-9, score-0.546]

7 Though this may sound like a flaw in the algorithm, it makes simulated annealing very useful because it allows for global minimums to be found rather than local ones. [sent-10, score-0.83]

8 It reads like a junior high school book report: “Though this may sound like a flaw in the algorithm, it makes simulated annealing very useful . [sent-14, score-0.554]

9 ” And how about this: “this non-zero probability stipulation will allow for the value of the minimum to back track in a sense and become unstuck from local minima. [sent-17, score-1.515]

10 Use font-preserving software so that “2^n” doesn’t become “2n”. [sent-30, score-0.105]

11 Hope that nobody actually reads your article—if they do, they might notice the mistakes and the plagiarism. [sent-32, score-0.082]

12 In reality, I think people should not plagiarize, should not pollute the scientific literature by writing about things they know nothing about, and should admit and apologize for their offenses. [sent-45, score-0.089]

13 First, I think it’s worse to copy others’ work without attribution than to copy one’s own work. [sent-50, score-0.198]

14 Arguably they were not as great as the journal reviewers thought—Frey seems to have been able to go pretty far based on good writing and novelty value of his topics and ideas—but they were real (if minor) contributions to the literature. [sent-53, score-0.401]

15 In contrast, Wegman’s papers discussed here are not contributions at all. [sent-54, score-0.137]

16 Breaking the rules is bad, but breaking the rules and not coming up with anything helpful to anybody, that’s much worse. [sent-56, score-0.208]

17 John Mashey points me to this : Your review will be published alongside other world-class contributions from leading researchers in the field. [sent-59, score-0.137]

18 All WIREs article topics and authors are selected by an internationally renowned Editorial Board, and all content is rigorously peer reviewed by experts. [sent-60, score-0.496]

19 I cannot imagine that the statement, “this non-zero probability stipulation will allow for the value of the minimum to back track in a sense and become unstuck from local minima” was rigorously peer reviewed by experts. [sent-62, score-1.825]

20 I’d loooove to see the record of who were the “experts” who reviewed for Wegman’s various contributions to that journal. [sent-63, score-0.318]
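
The Wikipedia description quoted above (sentences 1 and 2) can be made concrete with a short sketch. This is only an illustration, not code from the post or from either quoted source: it assumes a Metropolis-style acceptance rule exp(-delta / T) and a geometric cooling schedule, and the names simulated_annealing, acceptance_probability, bumpy and step are hypothetical. Note that it is the current solution that is replaced by a random nearby candidate; the best value seen so far is merely recorded.

```python
import math
import random


def acceptance_probability(delta, t):
    """Probability of moving to a candidate whose value differs by delta.

    Assumed Metropolis-style rule: downhill moves (delta <= 0) are always
    accepted; uphill moves are accepted with probability exp(-delta / t),
    which shrinks as t falls but is mathematically positive for any t > 0.
    """
    if delta <= 0:
        return 1.0
    return math.exp(-delta / t)


def simulated_annealing(f, x0, neighbor, t_start=1.0, t_end=1e-3,
                        cooling=0.99, seed=0):
    """Minimize f starting from x0; returns (best_x, best_value)."""
    rng = random.Random(seed)
    current, current_val = x0, f(x0)
    best, best_val = current, current_val
    t = t_start
    while t > t_end:
        # Propose a random "nearby" solution.
        candidate = neighbor(current, rng)
        candidate_val = f(candidate)
        delta = candidate_val - current_val
        # Large t: almost any move is accepted. Small t: mostly downhill.
        if rng.random() < acceptance_probability(delta, t):
            current, current_val = candidate, candidate_val
            if current_val < best_val:
                best, best_val = current, current_val
        t *= cooling  # assumed geometric cooling schedule
    return best, best_val


def bumpy(x):
    """Hypothetical one-dimensional test function with several local minima."""
    return x * x + 10.0 * math.sin(3.0 * x)


def step(x, rng):
    """Hypothetical neighbor function: a small uniform random step."""
    return x + rng.uniform(-1.0, 1.0)


if __name__ == "__main__":
    print(simulated_annealing(bumpy, 5.0, step, t_start=10.0))
```

The occasional uphill acceptance is what lets the current solution leave a local minimum; a greedy method corresponds to accepting only when delta <= 0.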


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('minimum', 0.296), ('stipulation', 0.258), ('wegman', 0.247), ('temperature', 0.237), ('unstuck', 0.212), ('local', 0.184), ('annealing', 0.182), ('algorithm', 0.17), ('plagiarize', 0.147), ('value', 0.142), ('contributions', 0.137), ('frey', 0.137), ('simulated', 0.129), ('minima', 0.129), ('probability', 0.123), ('reviewed', 0.12), ('solution', 0.117), ('rigorously', 0.116), ('become', 0.105), ('track', 0.105), ('global', 0.104), ('flaw', 0.095), ('decreases', 0.094), ('allow', 0.09), ('apologize', 0.089), ('breaking', 0.084), ('decrease', 0.083), ('reads', 0.082), ('higher', 0.076), ('chosen', 0.076), ('peer', 0.074), ('copy', 0.071), ('allows', 0.07), ('sound', 0.066), ('bane', 0.065), ('internationally', 0.065), ('misattribution', 0.065), ('uphill', 0.065), ('journal', 0.062), ('rules', 0.062), ('loooove', 0.061), ('renowned', 0.061), ('replaces', 0.061), ('yasmin', 0.061), ('topics', 0.06), ('intonation', 0.058), ('saves', 0.058), ('downhill', 0.058), ('recipient', 0.058), ('worse', 0.056)]
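
The page does not say how these word weights or the similarity values in the list below were computed. As a rough sketch only, assuming a plain term-frequency times log inverse-document-frequency weighting and cosine similarity (both are assumptions, and tfidf_vectors and cosine are hypothetical names), the calculation could look like this:

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists. Returns one {word: weight} dict per document."""
    n = len(docs)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed weighting: relative term frequency times log(n / df).
        vectors.append({w: (c / len(doc)) * math.log(n / df[w])
                        for w, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {word: weight} vectors."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Toy token lists, invented for illustration only.
    docs = [
        ["wegman", "plagiarism", "annealing"],
        ["wegman", "plagiarism", "journal"],
        ["regression", "causality"],
    ]
    vecs = tfidf_vectors(docs)
    print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Under this assumption a document compared with itself scores 1 (up to rounding), which is consistent with the same-blog entry in the list below.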

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 945 andrew gelman stats-2011-10-06-W’man < W’pedia, again


2 0.20246267 2340 andrew gelman stats-2014-05-20-Thermodynamic Monte Carlo: Michael Betancourt’s new method for simulating from difficult distributions and evaluating normalizing constants

Introduction: I hate to keep bumping our scheduled posts but this is just too important and too exciting to wait. So it’s time to jump the queue. The news is a paper from Michael Betancourt that presents a super-cool new way to compute normalizing constants: A common strategy for inference in complex models is the relaxation of a simple model into the more complex target model, for example the prior into the posterior in Bayesian inference. Existing approaches that attempt to generate such transformations, however, are sensitive to the pathologies of complex distributions and can be difficult to implement in practice. Leveraging the geometry of thermodynamic processes I introduce a principled and robust approach to deforming measures that presents a powerful new tool for inference. The idea is to generalize Hamiltonian Monte Carlo so that it moves through a family of distributions (that is, it transitions through an “inverse temperature” variable called beta that indexes the family) a

3 0.20189099 901 andrew gelman stats-2011-09-12-Some thoughts on academic cheating, inspired by Frey, Wegman, Fischer, Hauser, Stapel

Introduction: As regular readers of this blog are aware, I am fascinated by academic and scientific cheating and the excuses people give for it. Bruno Frey and colleagues published a single article (with only minor variants) in five different major journals, and these articles did not cite each other. And there have been several other cases of his self-plagiarism (see this review from Olaf Storbeck). I do not mind the general practice of repeating oneself for different audiences—in the social sciences, we call this Arrow’s Theorem —but in this case Frey seems to have gone a bit too far. Blogger Economic Logic has looked into this and concluded that this sort of common practice is standard in “the context of the German(-speaking) academic environment,” and what sets Frey apart is not his self-plagiarism or even his brazenness but rather his practice of doing it in high-visibility journals. Economic Logic writes that “[Frey's] contribution is pedagogical, he found a good and interesting

4 0.19007775 728 andrew gelman stats-2011-05-24-A (not quite) grand unified theory of plagiarism, as applied to the Wegman case

Introduction: A common reason for plagiarism is laziness: you want credit for doing something but you don’t really feel like doing it–maybe you’d rather go fishing, or bowling, or blogging, or whatever, so you just steal it, or you hire someone to steal it for you. Interestingly enough, we see that in many defenses of plagiarism allegations. A common response is: I was sloppy in dealing with my notes, or I let my research assistant (who, incidentally, wasn’t credited in the final version) copy things for me and the research assistant got sloppy. The common theme: The person wanted the credit without doing the work. As I wrote last year, I like to think that directness and openness is a virtue in scientific writing. For example, clearly citing the works we draw from, even when such citing of secondary sources might make us appear less erudite. But I can see how some scholars might feel a pressure to cover their traces. Wegman Which brings us to Ed Wegman, whose defense of plagiari

5 0.18310575 766 andrew gelman stats-2011-06-14-Last Wegman post (for now)

Introduction: John Mashey points me to a news article by Eli Kintisch with the following wonderful quote: Will Happer, a physicist at Princeton University who questions the consensus view on climate, thinks Mashey is a destructive force who uses “totalitarian tactics”–publishing damaging documents online, without peer review–to carry out personal vendettas. I’ve never thought of uploading files as “totalitarian” but maybe they do things differently at Princeton. I actually think of totalitarians as acting secretly–denunciations without evidence, midnight arrests, trials in undisclosed locations, and so forth. Mashey’s practice of putting everything out in the open seems to me the opposite of totalitarian. The article also reports that Edward Wegman’s lawyer said that Wegman “has never engaged in plagiarism.” If I were the lawyer, I’d be pretty mad at Wegman at this point. I can just imagine the conversation: Lawyer: You never told me about that 2005 paper where you stole from Bria

6 0.17701563 751 andrew gelman stats-2011-06-08-Another Wegman plagiarism

7 0.17082392 902 andrew gelman stats-2011-09-12-The importance of style in academic writing

8 0.1675754 1501 andrew gelman stats-2012-09-18-More studies on the economic effects of climate change

9 0.15096444 180 andrew gelman stats-2010-08-03-Climate Change News

10 0.13272932 883 andrew gelman stats-2011-09-01-Arrow’s theorem update

11 0.12587175 815 andrew gelman stats-2011-07-22-Statistical inference based on the minimum description length principle

12 0.111774 1201 andrew gelman stats-2012-03-07-Inference = data + model

13 0.10912478 2244 andrew gelman stats-2014-03-11-What if I were to stop publishing in journals?

14 0.10833697 1442 andrew gelman stats-2012-08-03-Double standard? Plagiarizing journos get slammed, plagiarizing profs just shrug it off

15 0.10443905 1440 andrew gelman stats-2012-08-02-“A Christmas Carol” as applied to plagiarism

16 0.10413464 2191 andrew gelman stats-2014-01-29-“Questioning The Lancet, PLOS, And Other Surveys On Iraqi Deaths, An Interview With Univ. of London Professor Michael Spagat”

17 0.10282256 1324 andrew gelman stats-2012-05-16-Wikipedia author confronts Ed Wegman

18 0.102103 2364 andrew gelman stats-2014-06-08-Regression and causality and variable ordering

19 0.10147611 1435 andrew gelman stats-2012-07-30-Retracted articles and unethical behavior in economics journals?

20 0.098258525 1295 andrew gelman stats-2012-05-02-Selection bias, or, How you can think the experts don’t check their models, if you simply don’t look at what the experts actually are doing


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.184), (1, -0.034), (2, -0.028), (3, -0.026), (4, -0.009), (5, -0.026), (6, 0.088), (7, -0.049), (8, 0.009), (9, -0.041), (10, 0.059), (11, -0.028), (12, -0.046), (13, 0.021), (14, -0.05), (15, -0.039), (16, 0.079), (17, 0.013), (18, 0.045), (19, -0.073), (20, 0.001), (21, 0.044), (22, 0.004), (23, 0.021), (24, 0.046), (25, -0.045), (26, -0.028), (27, 0.001), (28, 0.002), (29, 0.005), (30, 0.027), (31, 0.069), (32, -0.028), (33, 0.007), (34, -0.037), (35, -0.011), (36, -0.014), (37, -0.045), (38, 0.042), (39, 0.003), (40, -0.062), (41, -0.004), (42, -0.072), (43, -0.016), (44, -0.049), (45, -0.03), (46, -0.035), (47, -0.0), (48, -0.004), (49, 0.017)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94642413 945 andrew gelman stats-2011-10-06-W’man < W’pedia, again


2 0.7495572 766 andrew gelman stats-2011-06-14-Last Wegman post (for now)

Introduction: John Mashey points me to a news article by Eli Kintisch with the following wonderful quote: Will Happer, a physicist at Princeton University who questions the consensus view on climate, thinks Mashey is a destructive force who uses “totalitarian tactics”–publishing damaging documents online, without peer review–to carry out personal vendettas. I’ve never thought of uploading files as “totalitarian” but maybe they do things differently at Princeton. I actually think of totalitarians as acting secretly–denunciations without evidence, midnight arrests, trials in undisclosed locations, and so forth. Mashey’s practice of putting everything out in the open seems to me the opposite of totalitarian. The article also reports that Edward Wegman’s lawyer said that Wegman “has never engaged in plagiarism.” If I were the lawyer, I’d be pretty mad at Wegman at this point. I can just imagine the conversation: Lawyer: You never told me about that 2005 paper where you stole from Bria

3 0.7357316 1324 andrew gelman stats-2012-05-16-Wikipedia author confronts Ed Wegman

Introduction: Wegman: “It’s not reprinted 100 percent like you had it.” Wikipedia guy: “No, you added another paragraph at the end and you changed the headline. . . . You even copied the typos that I’ve corrected on my website. It was taken verbatim and reprinted in your paper.” The original author got a check for $500 but, unfortunately, no free subscription to “Wiley Interdisciplinary Reviews: Computational Statistics” (a $1400-$2800 value ). P.S. To those who think I’m being mean to Wegman: I haven’t yet heard that he’s apologized to the people whose work he copied without attribution, or to the people who spent their time tracking all this down, or to the U.S. Congress for misrepresenting his expertise in his official report. Everyone makes mistakes, and just about everyone has ethical lapses at times. But when you get caught you’re supposed to make apology and restitution.

4 0.73112077 728 andrew gelman stats-2011-05-24-A (not quite) grand unified theory of plagiarism, as applied to the Wegman case

Introduction: A common reason for plagiarism is laziness: you want credit for doing something but you don’t really feel like doing it–maybe you’d rather go fishing, or bowling, or blogging, or whatever, so you just steal it, or you hire someone to steal it for you. Interestingly enough, we see that in many defenses of plagiarism allegations. A common response is: I was sloppy in dealing with my notes, or I let my research assistant (who, incidentally, wasn’t credited in the final version) copy things for me and the research assistant got sloppy. The common theme: The person wanted the credit without doing the work. As I wrote last year, I like to think that directness and openness is a virtue in scientific writing. For example, clearly citing the works we draw from, even when such citing of secondary sources might make us appear less erudite. But I can see how some scholars might feel a pressure to cover their traces. Wegman Which brings us to Ed Wegman, whose defense of plagiari

5 0.70400584 901 andrew gelman stats-2011-09-12-Some thoughts on academic cheating, inspired by Frey, Wegman, Fischer, Hauser, Stapel

Introduction: As regular readers of this blog are aware, I am fascinated by academic and scientific cheating and the excuses people give for it. Bruno Frey and colleagues published a single article (with only minor variants) in five different major journals, and these articles did not cite each other. And there have been several other cases of his self-plagiarism (see this review from Olaf Storbeck). I do not mind the general practice of repeating oneself for different audiences—in the social sciences, we call this Arrow’s Theorem —but in this case Frey seems to have gone a bit too far. Blogger Economic Logic has looked into this and concluded that this sort of common practice is standard in “the context of the German(-speaking) academic environment,” and what sets Frey apart is not his self-plagiarism or even his brazenness but rather his practice of doing it in high-visibility journals. Economic Logic writes that “[Frey's] contribution is pedagogical, he found a good and interesting

6 0.69572628 1568 andrew gelman stats-2012-11-07-That last satisfaction at the end of the career

7 0.6912272 751 andrew gelman stats-2011-06-08-Another Wegman plagiarism

8 0.68975258 1588 andrew gelman stats-2012-11-23-No one knows what it’s like to be the bad man

9 0.68681645 400 andrew gelman stats-2010-11-08-Poli sci plagiarism update, and a note about the benefits of not caring

10 0.6848591 930 andrew gelman stats-2011-09-28-Wiley Wegman chutzpah update: Now you too can buy a selection of garbled Wikipedia articles, for a mere $1400-$2800 per year!

11 0.68251562 2334 andrew gelman stats-2014-05-14-“The subtle funk of just a little poultry offal”

12 0.66102082 722 andrew gelman stats-2011-05-20-Why no Wegmania?

13 0.6599856 1654 andrew gelman stats-2013-01-04-“Don’t think of it as duplication. Think of it as a single paper in a superposition of two quantum journals.”

14 0.65963328 1442 andrew gelman stats-2012-08-03-Double standard? Plagiarizing journos get slammed, plagiarizing profs just shrug it off

15 0.65505844 180 andrew gelman stats-2010-08-03-Climate Change News

16 0.65292031 1161 andrew gelman stats-2012-02-10-If an entire article in Computational Statistics and Data Analysis were put together from other, unacknowledged, sources, would that be a work of art?

17 0.64614266 1236 andrew gelman stats-2012-03-29-Resolution of Diederik Stapel case

18 0.64431167 902 andrew gelman stats-2011-09-12-The importance of style in academic writing

19 0.64409059 2191 andrew gelman stats-2014-01-29-“Questioning The Lancet, PLOS, And Other Surveys On Iraqi Deaths, An Interview With Univ. of London Professor Michael Spagat”

20 0.64105743 2234 andrew gelman stats-2014-03-05-Plagiarism, Arizona style


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(15, 0.291), (16, 0.055), (21, 0.017), (24, 0.15), (36, 0.011), (45, 0.013), (59, 0.026), (63, 0.02), (68, 0.016), (95, 0.033), (99, 0.231)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.96354342 439 andrew gelman stats-2010-11-30-Of psychology research and investment tips

Introduction: A few days after “ Dramatic study shows participants are affected by psychological phenomena from the future ,” (see here ) the British Psychological Society follows up with “ Can psychology help combat pseudoscience? .” Somehow I’m reminded of that bit of financial advice which says, if you want to save some money, your best investment is to pay off your credit card bills.

2 0.96088409 1081 andrew gelman stats-2011-12-24-Statistical ethics violation

Introduction: A colleague writes: When I was in NYC I went to this party by group of Japanese bio-scientists. There, one guy told me about how the biggest pharmaceutical company in Japan did their statistics. They ran 100 different tests and reported the most significant one. (This was in 2006 and he said they stopped doing this few years back so they were doing this until pretty recently…) I’m not sure if this was 100 multiple comparison or 100 different kinds of test but I’m sure they wouldn’t want to disclose their data… Ouch!

3 0.95547754 908 andrew gelman stats-2011-09-14-Type M errors in the lab

Introduction: Jeff points us to this news article by Asher Mullard: Bayer halts nearly two-thirds of its target-validation projects because in-house experimental findings fail to match up with published literature claims, finds a first-of-a-kind analysis on data irreproducibility. An unspoken industry rule alleges that at least 50% of published studies from academic laboratories cannot be repeated in an industrial setting, wrote venture capitalist Bruce Booth in a recent blog post. A first-of-a-kind analysis of Bayer’s internal efforts to validate ‘new drug target’ claims now not only supports this view but suggests that 50% may be an underestimate; the company’s in-house experimental data do not match literature claims in 65% of target-validation projects, leading to project discontinuation. . . . Khusru Asadullah, Head of Target Discovery at Bayer, and his colleagues looked back at 67 target-validation projects, covering the majority of Bayer’s work in oncology, women’s health and cardiov

4 0.94234371 834 andrew gelman stats-2011-08-01-I owe it all to the haters

Introduction: Sometimes when I submit an article to a journal it is accepted right away or with minor alterations. But many of my favorite articles were rejected or had to go through an exhausting series of revisions. For example, this influential article had a very hostile referee and we had to seriously push the journal editor to accept it. This one was rejected by one or two journals before finally appearing with discussion. This paper was rejected by the American Political Science Review with no chance of revision and we had to publish it in the British Journal of Political Science, which was a bit odd given that the article was 100% about American politics. And when I submitted this instant classic (actually at the invitation of the editor), the referees found it to be trivial, and the editor did me the favor of publishing it but only by officially labeling it as a discussion of another article that appeared in the same issue. Some of my most influential papers were accepted right

same-blog 5 0.93250829 945 andrew gelman stats-2011-10-06-W’man < W’pedia, again


6 0.93189847 1541 andrew gelman stats-2012-10-19-Statistical discrimination again

7 0.92073846 329 andrew gelman stats-2010-10-08-More on those dudes who will pay your professor $8000 to assign a book to your class, and related stories about small-time sleazoids

8 0.91455102 1800 andrew gelman stats-2013-04-12-Too tired to mock

9 0.91207492 133 andrew gelman stats-2010-07-08-Gratuitous use of “Bayesian Statistics,” a branding issue?

10 0.90522969 1624 andrew gelman stats-2012-12-15-New prize on causality in statstistics education

11 0.90440953 1394 andrew gelman stats-2012-06-27-99!

12 0.90225434 1794 andrew gelman stats-2013-04-09-My talks in DC and Baltimore this week

13 0.89307821 1908 andrew gelman stats-2013-06-21-Interpreting interactions in discrete-data regression

14 0.87703514 762 andrew gelman stats-2011-06-13-How should journals handle replication studies?

15 0.87673712 2278 andrew gelman stats-2014-04-01-Association for Psychological Science announces a new journal

16 0.85091656 2188 andrew gelman stats-2014-01-27-“Disappointed with your results? Boost your scientific paper”

17 0.85031891 576 andrew gelman stats-2011-02-15-With a bit of precognition, you’d have known I was going to post again on this topic, and with a lot of precognition, you’d have known I was going to post today

18 0.84637845 1833 andrew gelman stats-2013-04-30-“Tragedy of the science-communication commons”

19 0.8454842 1998 andrew gelman stats-2013-08-25-A new Bem theory

20 0.84267688 883 andrew gelman stats-2011-09-01-Arrow’s theorem update