1502 andrew gelman stats-2012-09-19-Scalability in education


meta info for this blog

Source: html

Introduction: This blog is an exercise in scalability. Instead of sending a long email to one person, I put the email in a blog where thousands can read it. Instead of devoting three hours to a referee report that will only be read by two people (the author and the journal editor), I do the equivalent here on the blog. When the American Statistical Association asked me to participate in a workshop to give writing advice for a select group of young researchers, I agreed to participate in this program, as long as the authors were willing to have their articles and my comments posted on the blog. I think my advice on writing research articles had much more effect being posted on the web than it would’ve had, if I’d kept it in that meeting. (On the other hand, my advice benefited from having those two student papers to push against. If I’d just tried to give general advice without the context, I don’t think it would’ve been so useful to anyone.) I’ve tried to be scalable for many years before


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Instead of sending a long email to one person, I put the email in a blog where thousands can read it. [sent-2, score-0.899]

2 Instead of devoting three hours to a referee report that will only be read by two people (the author and the journal editor), I do the equivalent here on the blog. [sent-3, score-0.591]

3 When the American Statistical Association asked me to participate in a workshop to give writing advice for a select group of young researchers, I agreed to participate in this program, as long as the authors were willing to have their articles and my comments posted on the blog. [sent-4, score-1.946]

4 I think my advice on writing research articles had much more effect being posted on the web than it would’ve had, if I’d kept it in that meeting. [sent-5, score-0.921]

5 (On the other hand, my advice benefited from having those two student papers to push against. [sent-6, score-0.564]

6 If I’d just tried to give general advice without the context, I don’t think it would’ve been so useful to anyone. [sent-7, score-0.573]

7 ) I’ve tried to be scalable for many years before I started blogging. [sent-8, score-0.391]

8 Rather than put a huge effort into preparing a new class, I’ll center the class around notes which become a book. [sent-9, score-0.558]

9 The readership of Bayesian Data Analysis is orders of magnitude more than all the students who’ve taken my courses on Bayesian statistics. [sent-10, score-0.519]

10 But now there are programs such as Coursera and institutions such as Wikipedia that can easily reach thousands. [sent-11, score-0.409]

11 So there’s a lot more to scalability than I’ve done so far. [sent-12, score-0.185]
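The sentence scores listed above appear consistent with a simple rule: split each sentence on whitespace and sum the tfidf weights of the tokens that exactly match a word in the top-N list for this post (case-sensitive, with punctuation left attached to the token). This is inferred from the numbers on this page, not from any documentation of the tool, so the sketch below is a hypothetical reconstruction; the names `TOP_WORDS` and `sentence_score` are illustrative.

```python
# Hypothetical reconstruction: the scoring rule is inferred from the listed
# scores, not taken from the tool that produced this page.

# A subset of the (word, tfidf) pairs listed for this post.
TOP_WORDS = {
    "scalability": 0.185,
    "coursera": 0.174,
    "tried": 0.159,
    "scalable": 0.152,
    "institutions": 0.118,
    "programs": 0.107,
    "reach": 0.101,
    "wikipedia": 0.1,
    "thousands": 0.098,
    "easily": 0.083,
    "started": 0.08,
}


def sentence_score(sentence, weights):
    """Sum the weights of whitespace-separated tokens that match a word exactly."""
    return sum(weights.get(token, 0.0) for token in sentence.split())


sentence_10 = (
    "But now there are programs such as Coursera and institutions "
    "such as Wikipedia that can easily reach thousands."
)
# "Coursera" and "Wikipedia" (capitalized) and "thousands." (trailing period)
# do not match any key, so only four tokens contribute.
print(round(sentence_score(sentence_10, TOP_WORDS), 3))  # 0.409
```

Under this assumed rule, a sentence scores higher mainly by containing more of the listed words, which may explain why the long third sentence ranks first.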


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('advice', 0.316), ('participate', 0.228), ('scalability', 0.185), ('devoting', 0.174), ('coursera', 0.174), ('posted', 0.161), ('orders', 0.161), ('tried', 0.159), ('ve', 0.157), ('scalable', 0.152), ('email', 0.152), ('preparing', 0.143), ('readership', 0.143), ('workshop', 0.14), ('articles', 0.137), ('class', 0.137), ('benefited', 0.136), ('referee', 0.132), ('instead', 0.12), ('institutions', 0.118), ('sending', 0.114), ('courses', 0.113), ('select', 0.113), ('push', 0.112), ('long', 0.112), ('writing', 0.111), ('exercise', 0.111), ('agreed', 0.109), ('programs', 0.107), ('kept', 0.107), ('magnitude', 0.102), ('young', 0.101), ('reach', 0.101), ('wikipedia', 0.1), ('thousands', 0.098), ('give', 0.098), ('editor', 0.098), ('notes', 0.097), ('bayesian', 0.096), ('read', 0.096), ('equivalent', 0.095), ('hours', 0.094), ('put', 0.093), ('willing', 0.092), ('association', 0.09), ('web', 0.089), ('center', 0.088), ('easily', 0.083), ('blog', 0.082), ('started', 0.08)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 1502 andrew gelman stats-2012-09-19-Scalability in education

2 0.17729177 2287 andrew gelman stats-2014-04-09-Advice: positive-sum, zero-sum, or negative-sum

Introduction: There’s a lot of free advice out there. I offer some of it myself! As I’ve written before (see this post from 2008 reacting to this advice from Dan Goldstein for business school students, and this post from 2010 reacting to some general advice from Nassim Taleb), what we see is typically presented as advice to individuals, but it’s also interesting to consider the possible total effects if the advice is taken. It’s time to play the game again. This time it’s advice from sociologist Fabio Rojas for Ph.D. students. I’ll copy his eight points of advice, then, for each, evaluate whether I think it is positive or negative sum: 1. Show up. Even if you feel horrible, show up. No matter what. Period. Unless someone died in your family, show up. 2. Do your job. Grade the papers. Do the lab work. Unless the work is extreme, take it in stride. 3. Be completely realistic about how you will be evaluated from day #1 – acquire a teaching record and a record of publication. Don’t h

3 0.1610423 278 andrew gelman stats-2010-09-15-Advice that might make sense for individuals but is negative-sum overall

Introduction: There’s a lot of free advice out there. As I wrote a couple years ago, it’s usually presented as advice to individuals, but it’s also interesting to consider the possible total effects if the advice is taken. For example, Nassim Taleb has a webpage that includes a bunch of one-line bits of advice (scroll to item 132 on the linked page). Here’s his final piece of advice: If you dislike someone, leave him alone or eliminate him; don’t attack him verbally. I’m a big Taleb fan (search this blog to see), but this seems like classic negative-sum advice. I can see how it can be a good individual strategy to keep your mouth shut, bide your time, and then sandbag your enemies. But it can’t be good if lots of people are doing this. Verbal attacks are great, as long as there’s a chance to respond. I’ve been in environments where people follow Taleb’s advice, saying nothing and occasionally trying to “eliminate” people, and it’s not pretty. I much prefer for people to be open

4 0.14952095 1520 andrew gelman stats-2012-10-03-Advice that’s so eminently sensible but so difficult to follow

Introduction: When we suggest a new method, we are duty-bound to not just demonstrate that it works better than existing approaches (or is superior in some other way such as simplicity or cost). We also need to explain why, if this new method is so great, people aren’t already using it. Various answers are possible, for example: - The new idea is technically advanced, requiring a level of mathematical or engineering complexity such that it could not easily have been discovered by accident. Hence its novelty can be explained as a product of some particular historical process. - The new idea is clever and unexpected, as with the mechanical device underlying Rubik’s Cube. - The new idea could only exist given recent technological developments (perhaps hardware developments such as a new composite material or an ultralight battery, or software developments such as a new MCMC algorithm). - The new idea usually isn’t so impressive but it shows its virtues in some previously hidden domain (fo

5 0.13949431 1428 andrew gelman stats-2012-07-25-The problem with realistic advice?

Introduction: In an article entitled 16 Weeks, Thomas Basbøll ruthlessly lays out the time constraints that limit what a student will be able to write during a semester and recommends that students follow a plan: Try to be realistic. If you need time for “free writing” or “thought writing” (writing to find out what you think) book that into your calendar as well, but the important part of the challenge is to find time to write down what you already know needs to be written. If you don’t yet know what you’re going to say this semester, then your challenge is, in part, to figure that out. But you should still find at least 30 minutes a day to write down something you know you want to say. Keep in mind that we are only talking about sixteen weeks in the very near future. . . . Assuming that you do have something to say, then, here’s the challenge: write always and only when (and what) your calendar tells you to. Don’t write when “inspired” to do so (unless this happens to coincide with your writing s

6 0.13931243 2172 andrew gelman stats-2014-01-14-Advice on writing research articles

7 0.13324916 2245 andrew gelman stats-2014-03-12-More on publishing in journals

8 0.12834014 2009 andrew gelman stats-2013-09-05-A locally organized online BDA course on G+ hangout?

9 0.12710451 503 andrew gelman stats-2011-01-04-Clarity on my email policy

10 0.12010624 18 andrew gelman stats-2010-05-06-$63,000 worth of abusive research . . . or just a really stupid waste of time?

11 0.11429226 1517 andrew gelman stats-2012-10-01-“On Inspiring Students and Being Human”

12 0.10905911 1008 andrew gelman stats-2011-11-13-Student project competition

13 0.10529619 834 andrew gelman stats-2011-08-01-I owe it all to the haters

14 0.10374578 1191 andrew gelman stats-2012-03-01-Hoe noem je?

15 0.10269668 1752 andrew gelman stats-2013-03-06-Online Education and Jazz

16 0.10243753 27 andrew gelman stats-2010-05-11-Update on the spam email study

17 0.1022526 1860 andrew gelman stats-2013-05-17-How can statisticians help psychologists do their research better?

18 0.10215293 1450 andrew gelman stats-2012-08-08-My upcoming talk for the data visualization meetup

19 0.10200443 605 andrew gelman stats-2011-03-09-Does it feel like cheating when I do this? Variation in ethical standards and expectations

20 0.10046151 2022 andrew gelman stats-2013-09-13-You heard it here first: Intense exercise can suppress appetite
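The simValue column is not defined on this page. A common way to compare tfidf vectors is cosine similarity, and a same-blog value of 1.0000001 would be consistent with that measure computed in single-precision floating point, but this is an assumption. The sketch below shows cosine similarity over sparse `{term: weight}` dictionaries; the function name and the second toy vector are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight}."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_a * norm_b)


# The first vector uses three weights from this post's tfidf list;
# the second is invented purely for illustration.
this_post = {"advice": 0.316, "participate": 0.228, "scalability": 0.185}
other_post = {"advice": 0.4, "writing": 0.2}

same = cosine_similarity(this_post, this_post)    # a post compared with itself
cross = cosine_similarity(this_post, other_post)  # overlap only on "advice"
print(same, cross)
```

Only shared terms contribute to the dot product, so two posts that both weight "advice" heavily score as similar even if they share little else, which matches the advice-themed titles near the top of the list.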


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.201), (1, -0.054), (2, -0.095), (3, -0.024), (4, 0.012), (5, 0.052), (6, 0.052), (7, 0.013), (8, -0.024), (9, -0.074), (10, 0.109), (11, 0.011), (12, 0.037), (13, 0.022), (14, 0.042), (15, -0.008), (16, 0.027), (17, -0.024), (18, -0.05), (19, 0.107), (20, 0.058), (21, 0.027), (22, 0.021), (23, -0.035), (24, -0.011), (25, -0.01), (26, 0.043), (27, -0.048), (28, 0.029), (29, 0.013), (30, -0.018), (31, 0.008), (32, -0.042), (33, 0.044), (34, 0.032), (35, -0.028), (36, -0.006), (37, 0.021), (38, 0.002), (39, -0.025), (40, 0.016), (41, 0.043), (42, -0.04), (43, -0.013), (44, 0.041), (45, -0.065), (46, -0.009), (47, -0.04), (48, -0.038), (49, -0.006)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97667617 1502 andrew gelman stats-2012-09-19-Scalability in education

2 0.73695987 1338 andrew gelman stats-2012-05-23-Advice on writing research articles

Introduction: From a few years ago: Both the papers sent to me appear to have strong research results. Now that the research has been done, I’d recommend rewriting both articles from scratch, using the following template: 1. Start with the conclusions. Write a couple pages on what you’ve found and what you recommend. In writing these conclusions, you should also be writing some of the introduction, in that you’ll need to give enough background so that general readers can understand what you’re talking about and why they should care. But you want to start with the conclusions, because that will determine what sort of background information you’ll need to give. 2. Now step back. What is the principal evidence for your conclusions? Make some graphs and pull out some key numbers that represent your research findings which back up your claims. 3. Back one more step, now. What are the methods and data you used to obtain your research findings. 4. Now go back and write the literature review

3 0.73677081 605 andrew gelman stats-2011-03-09-Does it feel like cheating when I do this? Variation in ethical standards and expectations

Introduction: John Sides points to this discussion (with over 200 comments!) by political scientist Charli Carpenter of her response to a student from another university who emailed with questions that look like they come from a homework assignment. Here’s the student’s original email: Hi Mr. Carpenter, I am a fourth year college student and I have the honor of reading one of your books and I just had a few questions… I am very fascinated by your work and I am just trying to understand everything. Can you please address some of my questions? I would greatly appreciate it. It certainly help me understand your wonderful article better. Thank you very much! :) 1. What is the fundamental purpose of your article? 2. What is your fundamental thesis? 3. What evidence do you use to support your thesis? 4. What is the overall conclusion? 5. Do you feel that you have a fair balance of opposing viewpoints? Sincerely, After a series of emails in which Carpenter explained why she thought

4 0.73423022 2172 andrew gelman stats-2014-01-14-Advice on writing research articles

Introduction: From a few years ago: General advice Both the papers sent to me appear to have strong research results. Now that the research has been done, I’d recommend rewriting both articles from scratch, using the following template: 1. Start with the conclusions. Write a couple pages on what you’ve found and what you recommend. In writing these conclusions, you should also be writing some of the introduction, in that you’ll need to give enough background so that general readers can understand what you’re talking about and why they should care. But you want to start with the conclusions, because that will determine what sort of background information you’ll need to give. 2. Now step back. What is the principal evidence for your conclusions? Make some graphs and pull out some key numbers that represent your research findings which back up your claims. 3. Back one more step, now. What are the methods and data you used to obtain your research findings. 4. Now go back and write the l

5 0.72787774 1611 andrew gelman stats-2012-12-07-Feedback on my Bayesian Data Analysis class at Columbia

Introduction: In one of the final Jitts, we asked the students how the course could be improved. Some of their suggestions would work, some would not. I’m putting all the suggestions below, interpolating my responses. (Overall, I think the course went well. Please remember that the remarks below are not course evaluations; they are answers to my specific question of how the course could be better. If we’d had a Jitt asking all the ways the course was good, you’d be seeing lots of positive remarks. But that wouldn’t be particularly useful or interesting.) The best thing about the course is that the kids worked hard each week on their homeworks. OK, here are the comments and my replies: Could have been better if we did less amount but more in detail. I don’t know if this would’ve been possible. I wanted to get to the harder stuff (HMC, VB, nonparametric models) which required a certain amount of preparation. And, even so, there was not time for everything. And also, needs solut

6 0.72773391 1428 andrew gelman stats-2012-07-25-The problem with realistic advice?

7 0.72610116 727 andrew gelman stats-2011-05-23-My new writing strategy

8 0.72028172 1254 andrew gelman stats-2012-04-09-In the future, everyone will publish everything.

9 0.69626433 2075 andrew gelman stats-2013-10-23-PubMed Commons: A system for commenting on articles in PubMed

10 0.6948719 2148 andrew gelman stats-2013-12-25-Spam!

11 0.69093615 1225 andrew gelman stats-2012-03-22-Procrastination as a positive productivity strategy

12 0.68781078 1658 andrew gelman stats-2013-01-07-Free advice from an academic writing coach!

13 0.68316567 2232 andrew gelman stats-2014-03-03-What is the appropriate time scale for blogging—the day or the week?

14 0.67144555 2302 andrew gelman stats-2014-04-23-A short questionnaire regarding the subjective assessment of evidence

15 0.66941816 1520 andrew gelman stats-2012-10-03-Advice that’s so eminently sensible but so difficult to follow

16 0.66846275 980 andrew gelman stats-2011-10-29-When people meet this guy, can they resist the temptation to ask him what he’s doing for breakfast??

17 0.66260576 515 andrew gelman stats-2011-01-13-The Road to a B

18 0.66232342 1917 andrew gelman stats-2013-06-28-Econ coauthorship update

19 0.66195005 2244 andrew gelman stats-2014-03-11-What if I were to stop publishing in journals?

20 0.66153049 579 andrew gelman stats-2011-02-18-What is this, a statistics class or a dentist’s office??


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(9, 0.039), (15, 0.042), (16, 0.032), (21, 0.03), (24, 0.221), (46, 0.096), (53, 0.013), (57, 0.012), (86, 0.05), (88, 0.016), (89, 0.025), (99, 0.334)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98016214 1502 andrew gelman stats-2012-09-19-Scalability in education

2 0.96219528 1939 andrew gelman stats-2013-07-15-Forward causal reasoning statements are about estimation; reverse causal questions are about model checking and hypothesis generation

Introduction: Consider two broad classes of inferential questions: 1. Forward causal inference. What might happen if we do X? What are the effects of smoking on health, the effects of schooling on knowledge, the effect of campaigns on election outcomes, and so forth? 2. Reverse causal inference. What causes Y? Why do more attractive people earn more money? Why do many poor people vote for Republicans and rich people vote for Democrats? Why did the economy collapse? When statisticians and econometricians write about causal inference, they focus on forward causal questions. Rubin always told us: Never ask Why? Only ask What if? And, from the econ perspective, causation is typically framed in terms of manipulations: if x had changed by 1, how much would y be expected to change, holding all else constant? But reverse causal questions are important too. They’re a natural way to think (consider the importance of the word “Why”) and are arguably more important than forward questions.

3 0.96218884 2035 andrew gelman stats-2013-09-23-Scalable Stan

Introduction: Bob writes: If you have papers that have used Stan, we’d love to hear about it. We finally got some submissions, so we’re going to start a list on the web site for 2.0 in earnest. You can either mail them to the list, to me directly, or just update the issue (at least until it’s closed or moved): https://github.com/stan-dev/stan/issues/187 For example, Henrik Mannerstrom fit a hierarchical model the other day with 360,000 data points and 120,000 variables. And it worked just fine in Stan. I’ve asked him to write this up so we can post it here. Here’s the famous graph Bob made showing the scalability of Stan for a series of hierarchical item-response models:

4 0.96160698 511 andrew gelman stats-2011-01-11-One more time on that ESP study: The problem of overestimates and the shrinkage solution

Introduction: Benedict Carey writes a follow-up article on ESP studies and Bayesian statistics. ( See here for my previous thoughts on the topic.) Everything Carey writes is fine, and he even uses an example I recommended: The statistical approach that has dominated the social sciences for almost a century is called significance testing. The idea is straightforward. A finding from any well-designed study — say, a correlation between a personality trait and the risk of depression — is considered “significant” if its probability of occurring by chance is less than 5 percent. This arbitrary cutoff makes sense when the effect being studied is a large one — for example, when measuring the so-called Stroop effect. This effect predicts that naming the color of a word is faster and more accurate when the word and color match (“red” in red letters) than when they do not (“red” in blue letters), and is very strong in almost everyone. “But if the true effect of what you are measuring is small,” sai

5 0.96025383 970 andrew gelman stats-2011-10-24-Bell Labs

Introduction: Sining Chen told me they’re hiring in the statistics group at Bell Labs . I’ll do my bit for economic stimulus by announcing this job (see below). I love Bell Labs. I worked there for three summers, in a physics lab in 1985-86 under the supervision of Loren Pfeiffer, and by myself in the statistics group in 1990. I learned a lot working for Loren. He was a really smart and driven guy. His lab was a small set of rooms—in Bell Labs, everything’s in a small room, as they value the positive externality of close physical proximity of different labs, which you get by making each lab compact—and it was Loren, his assistant (a guy named Ken West who kept everything running in the lab), and three summer students: me, Gowton Achaibar, and a girl whose name I’ve forgotten. Gowtan and I had a lot of fun chatting in the lab. One day I made a silly comment about Gowton’s accent—he was from Guyana and pronounced “three” as “tree”—and then I apologized and said: Hey, here I am making fun o

6 0.95968139 669 andrew gelman stats-2011-04-19-The mysterious Gamma (1.4, 0.4)

7 0.95962733 899 andrew gelman stats-2011-09-10-The statistical significance filter

8 0.9595502 1150 andrew gelman stats-2012-02-02-The inevitable problems with statistical significance and 95% intervals

9 0.95951819 494 andrew gelman stats-2010-12-31-Type S error rates for classical and Bayesian single and multiple comparison procedures

10 0.95831132 2109 andrew gelman stats-2013-11-21-Hidden dangers of noninformative priors

11 0.9579007 1941 andrew gelman stats-2013-07-16-Priors

12 0.95765817 1390 andrew gelman stats-2012-06-23-Traditionalist claims that modern art could just as well be replaced by a “paint-throwing chimp”

13 0.9573316 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

14 0.95730388 1465 andrew gelman stats-2012-08-21-D. Buggin

15 0.95725906 2129 andrew gelman stats-2013-12-10-Cross-validation and Bayesian estimation of tuning parameters

16 0.95688117 1474 andrew gelman stats-2012-08-29-More on scaled-inverse Wishart and prior independence

17 0.956707 2208 andrew gelman stats-2014-02-12-How to think about “identifiability” in Bayesian inference?

18 0.95669782 2340 andrew gelman stats-2014-05-20-Thermodynamic Monte Carlo: Michael Betancourt’s new method for simulating from difficult distributions and evaluating normalizing constants

19 0.95669258 1644 andrew gelman stats-2012-12-30-Fixed effects, followed by Bayes shrinkage?

20 0.95651305 2149 andrew gelman stats-2013-12-26-Statistical evidence for revised standards