1783 andrew gelman stats-2013-03-31-He’s getting ready to write a book


meta info for this blog

Source: html

Introduction: Eric Novik does some open-source planning : My co-author, Jacki Buros, and I [Novik] have just signed a contract with Apress to write a book tentatively entitled “Predictive Analytics with R”, which will cover programming best practices, data munging, data exploration, and single and multi-level models with case studies in social media, healthcare, politics, marketing, and the stock market. Why does the world need another R book? We think there is a shortage of books that deal with the complete and programmer centric analysis of real, dirty, and sometimes unstructured data. Our target audience are people who have some familiarity with statistics, but do not have much experience with programming. . . . The book is projected to be about 300 pages across 8 chapters. This is my first experience with writing a book and everything I heard about the process tells me that this is going to be a long and arduous endeavor lasting anywhere from 6 to 8 months. Novik emailed me and wrot


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 We think there is a shortage of books that deal with the complete and programmer centric analysis of real, dirty, and sometimes unstructured data. [sent-3, score-0.483]

2 Our target audience are people who have some familiarity with statistics, but do not have much experience with programming. [sent-4, score-0.409]

3 The book is projected to be about 300 pages across 8 chapters. [sent-8, score-0.39]

4 This is my first experience with writing a book and everything I heard about the process tells me that this is going to be a long and arduous endeavor lasting anywhere from 6 to 8 months. [sent-9, score-1.211]

5 Novik emailed me and wrote: The work seems overwhelming. [sent-10, score-0.101]

6 I always wondered how you manage to produce such high volume of high quality content. [sent-11, score-0.547]

7 The first secret is, I wouldn’t try to write a book in 6 to 8 months. [sent-13, score-0.545]

8 The first edition of Bayesian Data Analysis took several years. [sent-14, score-0.382]

9 So if “long and arduous” to you means “6 to 8 months,” I think your time management skills are already much better than mine! [sent-16, score-0.17]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('novik', 0.491), ('arduous', 0.298), ('book', 0.195), ('secret', 0.182), ('edition', 0.177), ('munging', 0.149), ('shortage', 0.149), ('tentatively', 0.141), ('unstructured', 0.135), ('dirty', 0.135), ('familiarity', 0.13), ('programmer', 0.126), ('lasting', 0.126), ('endeavor', 0.123), ('signed', 0.12), ('projected', 0.12), ('took', 0.118), ('experience', 0.117), ('analytics', 0.111), ('healthcare', 0.11), ('contract', 0.102), ('manage', 0.101), ('emailed', 0.101), ('wondered', 0.099), ('anywhere', 0.098), ('marketing', 0.095), ('stock', 0.094), ('entitled', 0.093), ('exploration', 0.092), ('long', 0.09), ('practices', 0.089), ('management', 0.089), ('volume', 0.089), ('high', 0.088), ('target', 0.088), ('first', 0.087), ('eric', 0.085), ('planning', 0.084), ('programming', 0.083), ('mine', 0.082), ('cover', 0.082), ('produce', 0.082), ('write', 0.081), ('skills', 0.081), ('tells', 0.077), ('pages', 0.075), ('media', 0.075), ('audience', 0.074), ('complete', 0.073), ('awhile', 0.072)]
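The sentence scores in the summary appear to be related to the word weights in this list. Adding up the listed weights of the words in sentence 4 (first, experience, book, tells, long, arduous, endeavor, lasting, anywhere) gives 1.211, which matches its listed score, and sentences 3 and 8 match in the same way. This is an observation from the numbers on this page, not a documented description of the tool. The sketch below assumes a simple lowercase word tokenizer; the names `sentence_score` and `WEIGHTS` are made up for illustration. Some other sentences on this page score lower than this rule would predict, so the original tokenizer presumably differs in its details.

```python
import re

# A subset of the (word, tfidf) pairs listed on this page.
WEIGHTS = {
    "book": 0.195, "projected": 0.12, "pages": 0.075,
    "first": 0.087, "experience": 0.117, "tells": 0.077,
    "long": 0.09, "arduous": 0.298, "endeavor": 0.123,
    "lasting": 0.126, "anywhere": 0.098,
    "edition": 0.177, "took": 0.118,
}


def sentence_score(sentence, weights):
    """Sum the weights of every known word in the sentence (repeats count).

    Assumption: tokens are maximal runs of ASCII letters, lowercased.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return round(sum(weights.get(tok, 0.0) for tok in tokens), 3)


SENT_3 = "The book is projected to be about 300 pages across 8 chapters."
SENT_4 = ("This is my first experience with writing a book and everything "
          "I heard about the process tells me that this is going to be a "
          "long and arduous endeavor lasting anywhere from 6 to 8 months.")
SENT_8 = "The first edition of Bayesian Data Analysis took several years."

for label, sent in [("3", SENT_3), ("4", SENT_4), ("8", SENT_8)]:
    print(label, sentence_score(sent, WEIGHTS))
```

Rounding to three decimals mirrors the precision shown in the score column.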

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 1783 andrew gelman stats-2013-03-31-He’s getting ready to write a book


2 0.12272792 1642 andrew gelman stats-2012-12-28-New book by Stef van Buuren on missing-data imputation looks really good!

Introduction: Ben points us to a new book, Flexible Imputation of Missing Data . It’s excellent and I highly recommend it. Definitely worth the $89.95. Van Buuren’s book is great even if you don’t end up using the algorithm described in the book (I actually like their approach but I do think there are some limitations with their particular implementation, which is one reason we’re developing our own package ); he supplies lots of intuition, examples, and graphs. P.S. Stef’s book features an introduction by Don Rubin, which gets me thinking: if Don can find the time to write an introduction to somebody else’s book, he surely should be willing to read and comment on the third edition of his own book, no?

3 0.11028251 316 andrew gelman stats-2010-10-03-Suggested reading for a prospective statistician?

Introduction: Sam Jessup writes: I am writing to ask you to recommend papers, books–anything that comes to mind that might give a prospective statistician some sense of what the future holds for statistics (and statisticians). I have a liberal arts background with an emphasis in mathematics. It seems like this is an exciting time to be a statistician, but that’s just from the outside looking in. I’m curious about your perspective on the future of the discipline. Any recommendations? My favorite is still the book, “Statistics: A Guide to the Unknown,” first edition. (I actually have a chapter in the latest (fourth) edition, but I think the first edition (from 1972, I believe) is still the best.

4 0.10482228 1948 andrew gelman stats-2013-07-21-Bayes related

Introduction: Dave Decker writes: I’ve seen some Bayes related things recently that might make for interesting fodder on your blog. There are two books, teaching Bayesian analysis from a programming perspective. And also a “web application for data analysis using powerful Bayesian statistical methods.” I took a look. The first book is Think Bayes: Bayesian Statistics Made Simple, by Allen B. Downey . It’s super readable and, amazingly, has approximately zero overlap with Bayesian Data Analysis. Downey discusses lots of little problems in a conversational way. In some ways it’s like an old-style math stat textbook (although with a programming rather than mathematical flavor) in that the examples are designed for simplicity rather than realism. I like it! Our book already exists; it’s good to have something else for people to read, coming from an entirely different perspective. The second book is Probabilistic Programming and Bayesian Methods for Hackers , by Cameron Davidson-P

5 0.10153541 1902 andrew gelman stats-2013-06-17-Job opening at new “big data” consulting firm!

Introduction: David Shor sends along a job announcement for Civis Analytics, which he describes as “basically Obama’s Analytics team reconstituted as a company”: Data Scientist Position Overview Data Scientists are responsible for providing the fundamental data science that powers our work – including predictive analytics, data mining, experimental design and ad-hoc statistical analysis. As a Data Scientist, you will join our Chicago-based data science team, working closely and collaboratively with analysts and engineers to identify, quantify and solve big, meaningful problems. Data Scientists will have the opportunity to dive deeply into big problems and work in a variety of areas. Civis Analytics has opportunities for applicants who are seasoned professionals, brilliant new comers, and anywhere in between. Qualifications · Master’s degree in statistics, machine learning, computer science with heavy quant focus, a related subject, or a Bachelor’s degree and significant work ex

6 0.10007755 8 andrew gelman stats-2010-04-28-Advice to help the rich get richer

7 0.096412338 2255 andrew gelman stats-2014-03-19-How Americans vote

8 0.093848966 231 andrew gelman stats-2010-08-24-Yet another Bayesian job opportunity

9 0.087581195 1990 andrew gelman stats-2013-08-20-Job opening at an organization that promotes reproducible research!

10 0.085887827 1436 andrew gelman stats-2012-07-31-A book on presenting numbers from spreadsheets

11 0.085137464 1984 andrew gelman stats-2013-08-16-BDA at 40% off!

12 0.079767637 611 andrew gelman stats-2011-03-14-As the saying goes, when they argue that you’re taking over, that’s when you know you’ve won

13 0.079366386 1782 andrew gelman stats-2013-03-30-“Statistical Modeling: A Fresh Approach”

14 0.078566849 1912 andrew gelman stats-2013-06-24-Bayesian quality control?

15 0.076976329 2182 andrew gelman stats-2014-01-22-Spell-checking example demonstrates key aspects of Bayesian data analysis

16 0.075828008 2106 andrew gelman stats-2013-11-19-More on “data science” and “statistics”

17 0.07486137 2245 andrew gelman stats-2014-03-12-More on publishing in journals

18 0.074756376 1634 andrew gelman stats-2012-12-21-Two reviews of Nate Silver’s new book, from Kaiser Fung and Cathy O’Neil

19 0.07446263 1021 andrew gelman stats-2011-11-21-Don’t judge a book by its title

20 0.073206224 1909 andrew gelman stats-2013-06-21-Job openings at conservative political analytics firm!
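The page does not explain how simValue is computed. A common way to compare tfidf vectors is cosine similarity, and the sketch below assumes that measure. The function name and the second toy vector are hypothetical and not taken from the original tool. The same-blog value of 0.99999994 would be consistent with a post compared against itself in single-precision arithmetic, but that is a guess.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight}."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_a * norm_b)


# Weights for this post are copied from its tfidf word list (a small subset).
this_post = {"novik": 0.491, "arduous": 0.298, "book": 0.195, "edition": 0.177}
# Hypothetical vector for some other post, invented for the example.
other_post = {"book": 0.4, "edition": 0.3, "imputation": 0.5}

print(cosine_similarity(this_post, this_post))
print(cosine_similarity(this_post, other_post))
```

Only the shared terms (here "book" and "edition") contribute to the dot product, which is why posts about books rank near the top of the list.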


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.135), (1, -0.018), (2, -0.067), (3, 0.05), (4, 0.001), (5, 0.057), (6, -0.001), (7, 0.008), (8, 0.04), (9, 0.032), (10, 0.02), (11, -0.044), (12, 0.028), (13, -0.026), (14, 0.079), (15, 0.003), (16, -0.015), (17, 0.007), (18, 0.058), (19, -0.047), (20, 0.011), (21, 0.035), (22, 0.012), (23, 0.024), (24, -0.015), (25, 0.018), (26, 0.031), (27, -0.008), (28, 0.054), (29, 0.008), (30, -0.086), (31, -0.045), (32, -0.025), (33, 0.021), (34, 0.024), (35, 0.067), (36, -0.039), (37, -0.007), (38, 0.014), (39, -0.041), (40, 0.002), (41, 0.012), (42, -0.027), (43, 0.039), (44, -0.007), (45, 0.005), (46, -0.01), (47, 0.022), (48, 0.003), (49, -0.015)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98044086 1783 andrew gelman stats-2013-03-31-He’s getting ready to write a book


2 0.86334652 1642 andrew gelman stats-2012-12-28-New book by Stef van Buuren on missing-data imputation looks really good!

Introduction: Ben points us to a new book, Flexible Imputation of Missing Data . It’s excellent and I highly recommend it. Definitely worth the $89.95. Van Buuren’s book is great even if you don’t end up using the algorithm described in the book (I actually like their approach but I do think there are some limitations with their particular implementation, which is one reason we’re developing our own package ); he supplies lots of intuition, examples, and graphs. P.S. Stef’s book features an introduction by Don Rubin, which gets me thinking: if Don can find the time to write an introduction to somebody else’s book, he surely should be willing to read and comment on the third edition of his own book, no?

3 0.80717117 1782 andrew gelman stats-2013-03-30-“Statistical Modeling: A Fresh Approach”

Introduction: Ben Hansen recommended to me this book and course by Daniel Kaplan. It looks pretty good. I’ve only looked at the website, not the book itself, and I’m sure I’d find lots of places to disagree with it on details, but the general flow seemed reasonable, also I liked that there’s lots of course materials to go with it. Does anyone have any experience with this book? Is it the way to go (for now)?

4 0.79587793 1188 andrew gelman stats-2012-02-28-Reference on longitudinal models?

Introduction: Antonio Ramos writes: The book with Hill has very little on longitudinal models. So do you recommended any reference to complement your book on covariance structures typical from these models, such as AR(1), Antedependence, Factor Analytic, etc? I am very much interest in BUGS code for these basic models as well as how to extend them to more complex situations. My reply: There is a book by Banerjee, Carlin, and Gelfand on Bayesian space-time models. Beyond that, I think there is good work in psychometrics on covariance structures but I don’t know the literature.

5 0.79473442 1179 andrew gelman stats-2012-02-21-“Readability” as freedom from the actual sensation of reading

Introduction: In her essay on Margaret Mitchell and Gone With the Wind, Claudia Roth Pierpoint writes: The much remarked “readability” of the book must have played a part in this smooth passage from the page to the screen, since “readability” has to do not only with freedom from obscurity but, paradoxically, with freedom from the actual sensation of reading [emphasis added]—of the tug and traction of words as they move thoughts into place in the mind. Requiring, in fact, the least reading, the most “readable” book allows its characters to slip easily through nets of words and into other forms. Popular art has been well defined by just this effortless movement from medium to medium, which is carried out, as Leslie Fiedler observed in relation to Uncle Tom’s Cabin, “without loss of intensity or alteration of meaning.” Isabel Archer rises from the page only in the hanging garments of Henry James’s prose, but Scarlett O’Hara is a free woman. Well put. I wish Pierpoint would come out with ano

6 0.79405224 2021 andrew gelman stats-2013-09-13-Swiss Jonah Lehrer

7 0.7894237 590 andrew gelman stats-2011-02-25-Good introductory book for statistical computation?

8 0.78873891 1634 andrew gelman stats-2012-12-21-Two reviews of Nate Silver’s new book, from Kaiser Fung and Cathy O’Neil

9 0.78472382 1895 andrew gelman stats-2013-06-12-Peter Thiel is writing another book!

10 0.78094357 1984 andrew gelman stats-2013-08-16-BDA at 40% off!

11 0.76745838 1948 andrew gelman stats-2013-07-21-Bayes related

12 0.76220596 2168 andrew gelman stats-2014-01-12-Things that I like that almost nobody else is interested in

13 0.75764835 517 andrew gelman stats-2011-01-14-Bayes in China update

14 0.75524324 1436 andrew gelman stats-2012-07-31-A book on presenting numbers from spreadsheets

15 0.74852151 1970 andrew gelman stats-2013-08-06-New words of 1917

16 0.74747252 1542 andrew gelman stats-2012-10-20-A statistical model for underdispersion

17 0.74355584 8 andrew gelman stats-2010-04-28-Advice to help the rich get richer

18 0.74236906 1843 andrew gelman stats-2013-05-05-The New York Times Book of Mathematics

19 0.74192387 16 andrew gelman stats-2010-05-04-Burgess on Kipling

20 0.73773277 1927 andrew gelman stats-2013-07-05-“Numbersense: How to use big data to your advantage”


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(16, 0.08), (21, 0.038), (24, 0.07), (28, 0.016), (47, 0.021), (53, 0.051), (63, 0.038), (73, 0.025), (86, 0.021), (89, 0.262), (99, 0.277)]
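The lda list is sparse: it shows only some topic ids, and the listed weights add up to 0.899 rather than 1, which suggests that low-weight topics were left out. That is an inference from the numbers, not something the page states. The sketch below sums the listed weights and expands them into a dense vector. The topic count of 100 is an assumption based on the largest id shown (99), and the helper name `to_dense` is made up for illustration.

```python
# (topicId, topicWeight) pairs copied from the lda list on this page.
LDA_TOPICS = [(16, 0.08), (21, 0.038), (24, 0.07), (28, 0.016), (47, 0.021),
              (53, 0.051), (63, 0.038), (73, 0.025), (86, 0.021),
              (89, 0.262), (99, 0.277)]

NUM_TOPICS = 100  # assumption: ids run from 0 to 99


def to_dense(pairs, size):
    """Expand sparse (index, weight) pairs into a list of length `size`."""
    dense = [0.0] * size
    for topic_id, weight in pairs:
        dense[topic_id] = weight
    return dense


listed_mass = round(sum(weight for _, weight in LDA_TOPICS), 3)
dense = to_dense(LDA_TOPICS, NUM_TOPICS)
dominant = max(range(NUM_TOPICS), key=dense.__getitem__)

print(listed_mass, dominant)
```

A dense vector of this kind can be compared with another post's vector using any standard similarity measure.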

similar blogs list:

simIndex simValue blogId blogTitle

1 0.96015865 1756 andrew gelman stats-2013-03-10-He said he was sorry

Introduction: Yes, it can be done : Hereby I contact you to clarify the situation that occurred with the publication of the article entitled *** which was published in Volume 11, Issue 3 of *** and I made the mistake of declaring as an author. This chapter is a plagiarism of . . . I wish to express and acknowledge that I am solely responsible for this . . . I recognize the gravity of the offense committed, since there is no justification for so doing. Therefore, and as a sign of shame and regret I feel in this situation, I will publish this letter, in order to set an example for other researchers do not engage in a similar error. No more, and to please accept my apologies, Sincerely, *** P.S. Since we’re on Retraction Watch already, I’ll point you to this unrelated story featuring a hilarious photo of a fraudster, who in this case was a grad student in psychology who faked his data and “has agreed to submit to a three-year supervisory period for any work involving funding from the

2 0.95425665 459 andrew gelman stats-2010-12-09-Solve mazes by starting at the exit

Introduction: It worked on this one . Good maze designers know this trick and are careful to design multiple branches in each direction. Back when I was in junior high, I used to make huge mazes, and the basic idea was to anticipate what the solver might try to do and to make the maze difficult by postponing the point at which he would realize a path was going nowhere. For example, you might have 6 branches: one dead end, two pairs that form loops going back to the start, and one that is the correct solution. You do this from both directions and add some twists and turns, and there you are. But the maze designer aiming for the naive solver–the sap who starts from the entrance and goes toward the exit–can simplify matters by just having 6 branches: five dead ends and one winner. This sort of thing is easy to solve in the reverse direction. I’m surprised the Times didn’t do better for their special puzzle issue.

3 0.94766611 1685 andrew gelman stats-2013-01-21-Class on computational social science this semester, Fridays, 1:00-3:40pm

Introduction: Sharad Goel, Jake Hofman, and Sergei Vassilvitskii are teaching this awesome class on computational social science this semester in the applied math department at Columbia. Here’s the course info . You should take this course. These guys are amazing.

4 0.94407046 1160 andrew gelman stats-2012-02-09-Familial Linkage between Neuropsychiatric Disorders and Intellectual Interests

Introduction: When I spoke at Princeton last year, I talked with neuroscientist Sam Wang, who told me about a project he did surveying incoming Princeton freshmen about mental illness in their families. He and his coauthor Benjamin Campbell found some interesting results, which they just published : A link between intellect and temperament has long been the subject of speculation. . . . Studies of the artistically inclined report linkage with familial depression, while among eminent and creative scientists, a lower incidence of affective disorders is found. In the case of developmental disorders, a heightened prevalence of autism spectrum disorders (ASDs) has been found in the families of mathematicians, physicists, and engineers. . . . We surveyed the incoming class of 2014 at Princeton University about their intended academic major, familial incidence of neuropsychiatric disorders, and demographic variables. . . . Consistent with prior findings, we noticed a relation between intended academ

5 0.94046146 1215 andrew gelman stats-2012-03-16-The “hot hand” and problems with hypothesis testing

Introduction: Gur Yaari writes : Anyone who has ever watched a sports competition is familiar with expressions like “on fire”, “in the zone”, “on a roll”, “momentum” and so on. But what do these expressions really mean? In 1985 when Thomas Gilovich, Robert Vallone and Amos Tversky studied this phenomenon for the first time, they defined it as: “. . . these phrases express a belief that the performance of a player during a particular period is significantly better than expected on the basis of the player’s overall record”. Their conclusion was that what people tend to perceive as a “hot hand” is essentially a cognitive illusion caused by a misperception of random sequences. Until recently there was little, if any, evidence to rule out their conclusion. Increased computing power and new data availability from various sports now provide surprising evidence of this phenomenon, thus reigniting the debate. Yaari goes on to some studies that have found time dependence in basketball, baseball, voll

6 0.92604756 833 andrew gelman stats-2011-07-31-Untunable Metropolis

7 0.92590714 2243 andrew gelman stats-2014-03-11-The myth of the myth of the myth of the hot hand

8 0.92180187 1855 andrew gelman stats-2013-05-13-Stan!

9 0.90808457 1708 andrew gelman stats-2013-02-05-Wouldn’t it be cool if Glenn Hubbard were consulting for Herbalife and I were on the other side?

10 0.89117825 1953 andrew gelman stats-2013-07-24-Recently in the sister blog

same-blog 11 0.88939387 1783 andrew gelman stats-2013-03-31-He’s getting ready to write a book

12 0.88558334 407 andrew gelman stats-2010-11-11-Data Visualization vs. Statistical Graphics

13 0.87988871 1477 andrew gelman stats-2012-08-30-Visualizing Distributions of Covariance Matrices

14 0.87628078 623 andrew gelman stats-2011-03-21-Baseball’s greatest fielders

15 0.86763984 566 andrew gelman stats-2011-02-09-The boxer, the wrestler, and the coin flip, again

16 0.85901558 1320 andrew gelman stats-2012-05-14-Question 4 of my final exam for Design and Analysis of Sample Surveys

17 0.8588295 1839 andrew gelman stats-2013-05-04-Jesus historian Niall Ferguson and the improving standards of public discourse

18 0.84609216 850 andrew gelman stats-2011-08-11-Understanding how estimates change when you move to a multilevel model

19 0.84432697 1991 andrew gelman stats-2013-08-21-BDA3 table of contents (also a new paper on visualization)

20 0.84424293 231 andrew gelman stats-2010-08-24-Yet another Bayesian job opportunity