andrew_gelman_stats andrew_gelman_stats-2012 andrew_gelman_stats-2012-1497 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Steve Cohen writes: Thank you for fulfilling another request of mine almost two years ago. I gave you a job description of a senior Bayesian statistical software developer that I was looking to hire. You kindly posted it on your site. THAT EVENING, I received a response from a fellow in Florida who had worked as a C++ programmer for 10+ years and — inexplicably — was now finishing his PhD in Bayesian econometrics. We hired him and he has done magical things with our software, making estimating hierarchical Bayesian models in “real time” on very large, complex datasets a reality. He’s a super guy, to boot! Together with others on our staff, we now have fully parallelized versions of most econometric models working on over 100 cores. (You may have read a paper by one of my partners, Prof John Liechty at Penn State, on parallel slicer samplers).
sentIndex sentText sentNum sentScore
1 Steve Cohen writes: Thank you for fulfilling another request of mine almost two years ago. [sent-1, score-0.641]
2 I gave you a job description of a senior Bayesian statistical software developer that I was looking to hire. [sent-2, score-0.824]
3 THAT EVENING, I received a response from a fellow in Florida who had worked as a C++ programmer for 10+ years and — inexplicably — was now finishing his PhD in Bayesian econometrics. [sent-4, score-0.871]
4 We hired him and he has done magical things with our software, making estimating hierarchical Bayesian models in “real time” on very large, complex datasets a reality. [sent-5, score-1.027]
5 Together with others on our staff, we now have fully parallelized versions of most econometric models working on over 100 cores. [sent-7, score-0.595]
6 (You may have read a paper by one of my partners, Prof John Liechty at Penn State, on parallel slicer samplers). [sent-8, score-0.144]
wordName wordTfidf (topN-words)
[('software', 0.215), ('samplers', 0.21), ('fulfilling', 0.201), ('kindly', 0.194), ('magical', 0.194), ('partners', 0.189), ('developer', 0.189), ('programmer', 0.189), ('finishing', 0.18), ('super', 0.18), ('bayesian', 0.174), ('evening', 0.173), ('econometric', 0.167), ('penn', 0.164), ('florida', 0.154), ('hired', 0.154), ('cohen', 0.154), ('prof', 0.154), ('staff', 0.146), ('fellow', 0.145), ('parallel', 0.144), ('request', 0.142), ('thank', 0.14), ('versions', 0.139), ('phd', 0.138), ('senior', 0.137), ('datasets', 0.131), ('steve', 0.126), ('mine', 0.123), ('models', 0.112), ('fully', 0.11), ('description', 0.108), ('complex', 0.104), ('estimating', 0.101), ('posted', 0.097), ('years', 0.097), ('hierarchical', 0.096), ('received', 0.094), ('gave', 0.093), ('worked', 0.091), ('guy', 0.09), ('together', 0.089), ('job', 0.082), ('john', 0.079), ('almost', 0.078), ('response', 0.075), ('state', 0.075), ('done', 0.068), ('making', 0.067), ('working', 0.067)]
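For readers wondering how a word-weight list like the one above might be produced, here is a minimal sketch. It assumes a generic textbook tf-idf with smoothed idf and L2 normalization; the page does not say how its weights were actually computed, so the function name `tfidf_top_words`, the whitespace tokenizer, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def tfidf_top_words(docs, doc_index, top_n=5):
    """Return the top_n (word, weight) pairs for docs[doc_index].

    Weights are tf * idf, L2-normalized within the document.
    Generic formulation for illustration, not the site's actual pipeline.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    # Term frequency inside the target document.
    tf = Counter(tokenized[doc_index])
    # Smoothed idf, so words found in every document keep a nonzero weight.
    raw = {
        word: count * (math.log((1 + n_docs) / (1 + df[word])) + 1.0)
        for word, count in tf.items()
    }
    norm = math.sqrt(sum(v * v for v in raw.values())) or 1.0
    weights = {word: v / norm for word, v in raw.items()}
    # Highest weight first; ties broken alphabetically for stable output.
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Toy corpus (made up for this sketch).
docs = [
    "bayesian software developer hired for bayesian models",
    "software for plotting data",
    "models for data analysis",
]
top = tfidf_top_words(docs, 0, top_n=3)
print(top)
```

In this toy corpus "bayesian" ranks first because it occurs twice in the target document and in no other document, which is the same reason rare, post-specific words such as "samplers" or "fulfilling" sit near the top of the list above.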
simIndex simValue blogId blogTitle
same-blog 1 1.0 1497 andrew gelman stats-2012-09-15-Our blog makes connections!
2 0.2238874 231 andrew gelman stats-2010-08-24-Yet another Bayesian job opportunity
Introduction: Steve Cohen writes: My [Cohen's] firm is looking for strong candidates to help us in developing software and analyzing data using Bayesian methods. We have been developing a suite of programs in C++ which allow us to do Bayesian hierarchical regression and logit/probit models on marketing data. These efforts have included the use of high performance computing tools like nVidia’s CUDA and the new OpenCL standard, which allow parallel processing of Bayesian models. Our software is very, very fast – even on databases that are ½ terabyte in size. The software still needs many additions and improvements and a person with the right skill set will have the chance to make a significant contribution. Here’s the job description he sent: Bayesian statistician and C++ programmer The company In4mation Insights is a marketing research, analytics, and consulting firm which operates on the leading-edge of our industry. Our clients are Fortune 500 companies and major management consul
3 0.15920579 1489 andrew gelman stats-2012-09-09-Commercial Bayesian inference software is popping up all over
Introduction: Steve Cohen writes: As someone who has been working with Bayesian statistical models for the past several years, I [Cohen] have been challenged recently to describe the difference between Bayesian Networks (as implemented in BayesiaLab software) and modeling and inference using MCMC methods. I hope you have the time to give me (or to write on your blog) a relatively simple explanation that an advanced layman could understand. My reply: I skimmed the above website but I couldn’t quite see what they do. My guess is that they use MCMC and also various parametric approximations such as variational Bayes. They also seem to have something set up for decision analysis. My guess is that, compared to a general-purpose tool such as Stan, this Bayesia software is more accessible to non-academics in particular application areas (in this case, it looks like business marketing). But I can’t be sure. I’ve also heard about another company that looks to be doing something similar: h
4 0.11786464 2216 andrew gelman stats-2014-02-18-Florida backlash
Introduction: In a post entitled, “A holiday message from the creative class to Richard Florida — screw you,” Mark Palko argues that Florida’s famous theories about the rise of the creative class have not held up over time: Florida paints a bright picture of these people and their future, with rapidly increasing numbers, influence and wealth. He goes so far as to say “Places that succeed in attracting and retaining creative class people prosper; those that fail don’t.” . . . But, Palko argues, Except for a few special cases, this may be the worst time to make a living in the arts since the emergence of modern newspapers and general interest magazines and other mass media a hundred and twenty years ago . . . Though we now have tools that make creating and disseminating art easier than ever, no one has come up with a viable business model that supports creation in today’s economy. . . . OK, fine, so individual creatives aren’t doing so well? But what about the larger urban economies? P
5 0.097684272 1514 andrew gelman stats-2012-09-28-AdviseStat 47% Campaign Ad
Introduction: Lee Wilkinson sends me this amusing ad for his new software, AdviseStat: The ad is a parody, but the software is real!
6 0.096968643 101 andrew gelman stats-2010-06-20-“People with an itch to scratch”
7 0.095976822 1948 andrew gelman stats-2013-07-21-Bayes related
8 0.0937628 1228 andrew gelman stats-2012-03-25-Continuous variables in Bayesian networks
9 0.092566669 1554 andrew gelman stats-2012-10-31-It not necessary that Bayesian methods conform to the likelihood principle
10 0.086832903 1205 andrew gelman stats-2012-03-09-Coming to agreement on philosophy of statistics
11 0.085524172 2254 andrew gelman stats-2014-03-18-Those wacky anti-Bayesians used to be intimidating, but now they’re just pathetic
12 0.085230507 181 andrew gelman stats-2010-08-03-MCMC in Python
13 0.084805213 1670 andrew gelman stats-2013-01-13-More Bell Labs happy talk
15 0.081962667 244 andrew gelman stats-2010-08-30-Useful models, model checking, and external validation: a mini-discussion
16 0.081786111 836 andrew gelman stats-2011-08-03-Another plagiarism mystery
17 0.081437737 1719 andrew gelman stats-2013-02-11-Why waste time philosophizing?
18 0.080859967 1469 andrew gelman stats-2012-08-25-Ways of knowing
19 0.077822559 1712 andrew gelman stats-2013-02-07-Philosophy and the practice of Bayesian statistics (with all the discussions!)
20 0.077614345 183 andrew gelman stats-2010-08-04-Bayesian models for simultaneous equation systems?
topicId topicWeight
[(0, 0.125), (1, 0.059), (2, -0.069), (3, 0.053), (4, -0.037), (5, 0.048), (6, -0.036), (7, -0.016), (8, 0.016), (9, -0.012), (10, 0.016), (11, -0.057), (12, 0.028), (13, 0.029), (14, 0.039), (15, 0.054), (16, 0.035), (17, 0.015), (18, -0.007), (19, 0.046), (20, -0.012), (21, 0.055), (22, -0.007), (23, -0.018), (24, 0.005), (25, -0.059), (26, -0.032), (27, 0.017), (28, 0.017), (29, -0.025), (30, 0.012), (31, -0.023), (32, 0.051), (33, -0.016), (34, 0.001), (35, -0.019), (36, -0.018), (37, 0.05), (38, -0.021), (39, 0.006), (40, -0.019), (41, 0.059), (42, -0.062), (43, 0.053), (44, -0.022), (45, -0.048), (46, -0.028), (47, 0.059), (48, 0.045), (49, 0.012)]
simIndex simValue blogId blogTitle
same-blog 1 0.97462535 1497 andrew gelman stats-2012-09-15-Our blog makes connections!
2 0.83378428 1489 andrew gelman stats-2012-09-09-Commercial Bayesian inference software is popping up all over
Introduction: Steve Cohen writes: As someone who has been working with Bayesian statistical models for the past several years, I [Cohen] have been challenged recently to describe the difference between Bayesian Networks (as implemented in BayesiaLab software) and modeling and inference using MCMC methods. I hope you have the time to give me (or to write on your blog) a relatively simple explanation that an advanced layman could understand. My reply: I skimmed the above website but I couldn’t quite see what they do. My guess is that they use MCMC and also various parametric approximations such as variational Bayes. They also seem to have something set up for decision analysis. My guess is that, compared to a general-purpose tool such as Stan, this Bayesia software is more accessible to non-academics in particular application areas (in this case, it looks like business marketing). But I can’t be sure. I’ve also heard about another company that looks to be doing something similar: h
3 0.81140482 231 andrew gelman stats-2010-08-24-Yet another Bayesian job opportunity
Introduction: Steve Cohen writes: My [Cohen's] firm is looking for strong candidates to help us in developing software and analyzing data using Bayesian methods. We have been developing a suite of programs in C++ which allow us to do Bayesian hierarchical regression and logit/probit models on marketing data. These efforts have included the use of high performance computing tools like nVidia’s CUDA and the new OpenCL standard, which allow parallel processing of Bayesian models. Our software is very, very fast – even on databases that are ½ terabyte in size. The software still needs many additions and improvements and a person with the right skill set will have the chance to make a significant contribution. Here’s the job description he sent: Bayesian statistician and C++ programmer The company In4mation Insights is a marketing research, analytics, and consulting firm which operates on the leading-edge of our industry. Our clients are Fortune 500 companies and major management consul
4 0.74731773 1228 andrew gelman stats-2012-03-25-Continuous variables in Bayesian networks
Introduction: Antti Rasinen writes: I’m a former undergrad machine learning student and a current software engineer with a Bayesian hobby. Today my two worlds collided. I ask for some enlightenment. On your blog you’ve repeatedly advocated continuous distributions with Bayesian models. Today I read this article by Ricky Ho, who writes: The strength of Bayesian network is it is highly scalable and can learn incrementally because all we do is to count the observed variables and update the probability distribution table. Similar to Neural Network, Bayesian network expects all data to be binary, categorical variable will need to be transformed into multiple binary variable as described above. Numeric variable is generally not a good fit for Bayesian network. The last sentence seems to be at odds with what you’ve said. Sadly, I don’t have enough expertise to say which view of the world is correct. During my undergrad years our team wrote an implementation of the Junction Tree algorithm. We r
5 0.73996609 183 andrew gelman stats-2010-08-04-Bayesian models for simultaneous equation systems?
Introduction: A neuroeconomist asks: Is there any literature on the Bayesian approach to simultaneous equation systems that you could suggest? (Think demand/supply in econ). My reply: I’m not up-to-date on the Bayesian econometrics literature. Tony Lancaster came out with a book a few years ago that might have some of these models. Maybe you, the commenters, have some suggestions? Measurement-error models are inherently Bayesian, seeing as they have all these latent parameters, so it seems like there should be a lot out there.
6 0.73683268 1781 andrew gelman stats-2013-03-29-Another Feller theory
7 0.73513597 2000 andrew gelman stats-2013-08-28-Why during the 1950-1960′s did Jerry Cornfield become a Bayesian?
8 0.72208887 635 andrew gelman stats-2011-03-29-Bayesian spam!
9 0.71902019 449 andrew gelman stats-2010-12-04-Generalized Method of Moments, whatever that is
10 0.71289814 2254 andrew gelman stats-2014-03-18-Those wacky anti-Bayesians used to be intimidating, but now they’re just pathetic
11 0.70484591 2293 andrew gelman stats-2014-04-16-Looking for Bayesian expertise in India, for the purpose of analysis of sarcoma trials
12 0.69392562 205 andrew gelman stats-2010-08-13-Arnold Zellner
13 0.69136816 1469 andrew gelman stats-2012-08-25-Ways of knowing
14 0.69032478 2368 andrew gelman stats-2014-06-11-Bayes in the research conversation
15 0.68973011 110 andrew gelman stats-2010-06-26-Philosophy and the practice of Bayesian statistics
16 0.68680847 83 andrew gelman stats-2010-06-13-Silly Sas lays out old-fashioned statistical thinking
17 0.67726237 1443 andrew gelman stats-2012-08-04-Bayesian Learning via Stochastic Gradient Langevin Dynamics
18 0.67720783 1438 andrew gelman stats-2012-07-31-What is a Bayesian?
19 0.67514503 117 andrew gelman stats-2010-06-29-Ya don’t know Bayes, Jack
20 0.67281413 133 andrew gelman stats-2010-07-08-Gratuitous use of “Bayesian Statistics,” a branding issue?
topicId topicWeight
[(16, 0.065), (17, 0.021), (24, 0.157), (30, 0.127), (36, 0.051), (43, 0.019), (53, 0.021), (56, 0.018), (68, 0.023), (82, 0.02), (84, 0.017), (86, 0.05), (89, 0.057), (99, 0.249)]
simIndex simValue blogId blogTitle
same-blog 1 0.94713593 1497 andrew gelman stats-2012-09-15-Our blog makes connections!
2 0.93945098 1195 andrew gelman stats-2012-03-04-Multiple comparisons dispute in the tabloids
Introduction: Yarden Katz writes: I’m probably not the first to point this out, but just in case, you might be interested in this article by T. Florian Jaeger, Daniel Pontillo, and Peter Graff on a statistical dispute [regarding the claim, "Phonemic Diversity Supports a Serial Founder Effect Model of Language Expansion from Africa"]. Seems directly relevant to your article on multiple hypothesis testing and associated talk at the Voodoo correlations meeting. Curious to know your thoughts on this if you think it’s blog-worthy. Here’s the abstract of the paper: Atkinson (Reports, 15 April 2011, p. 346) argues that the phonological complexity of languages reflects the loss of phonemic distinctions due to successive founder events during human migration (the serial founder hypothesis). Statistical simulations show that the type I error rate of Atkinson’s analysis is hugely inflated. The data at best support only a weak interpretation of the serial founder hypothesis. My reaction: I d
3 0.93677378 593 andrew gelman stats-2011-02-27-Heat map
Introduction: Jarad Niemi sends along this plot: and writes: 2010-2011 Miami Heat offensive (red), defensive (blue), and combined (black) player contribution means (dots) and 95% credible intervals (lines) where zero indicates an average NBA player. Larger positive numbers for offensive and combined are better while larger negative numbers for defense are better. In retrospect, I [Niemi] should have plotted -1*defensive_contribution so that larger was always better. The main point with this figure is that this awesome combination of James-Wade-Bosh that was discussed immediately after the LeBron trade to the Heat has a one-of-these-things-is-not-like-the-other aspect. At least according to my analysis, Bosh is hurting his team compared to the average player (although not statistically significant) due to his terrible defensive contribution (which is statistically significant). All fine so far. But the punchline comes at the end, when he writes: Anyway, a reviewer said he hated the
4 0.93532193 1416 andrew gelman stats-2012-07-14-Ripping off a ripoff
Introduction: I opened the newspaper today (recall that this blog is on an approximately one-month delay) to see a moderately horrifying story about art appraisers who are deterred by fear of lawsuits from expressing an opinion about possible forgeries. Maybe this trend will come to science too? Perhaps Brett Pelham will sue Uri Simonsohn for the pain, suffering, and loss of income occurring from the questioning of his Dennis the dentist paper? Or maybe I’ll be sued by some rogue sociologist for publicly questioning his data dredging? Anyway, what amused me about the NYT article on art forgery was that two of the artists featured in the discussion were . . . Andy Warhol and Roy Lichtenstein! Warhol is famous for diluting the notion of the unique art object and for making works of art in a “Factory,” and Lichtenstein is famous for ripping off the style and imagery of comic book artists. It’s funny for the two of them, of all people, to come up in a discussion of authenticity. Or maybe it
5 0.93467826 179 andrew gelman stats-2010-08-03-An Olympic size swimming pool full of lithium water
Introduction: As part of his continuing plan to sap etc etc., Aleks pointed me to an article by Max Miller reporting on a recommendation from Jacob Appel: Adding trace amounts of lithium to the drinking water could limit suicides. . . . Communities with higher than average amounts of lithium in their drinking water had significantly lower suicide rates than communities with lower levels. Regions of Texas with lower lithium concentrations had an average suicide rate of 14.2 per 100,000 people, whereas those areas with naturally higher lithium levels had a dramatically lower suicide rate of 8.7 per 100,000. The highest levels in Texas (150 micrograms of lithium per liter of water) are only a thousandth of the minimum pharmaceutical dose, and have no known deleterious effects. I don’t know anything about this and am offering no judgment on it; I’m just passing it on. The research studies are here and here. I am skeptical, though, about this part of the argument: We are not talking a
6 0.93222797 1831 andrew gelman stats-2013-04-29-The Great Race
7 0.92983705 1188 andrew gelman stats-2012-02-28-Reference on longitudinal models?
9 0.91940939 1259 andrew gelman stats-2012-04-11-How things sound to us, versus how they sound to others
10 0.9186815 1623 andrew gelman stats-2012-12-14-GiveWell charity recommendations
11 0.91808188 1768 andrew gelman stats-2013-03-18-Mertz’s reply to Unz’s response to Mertz’s comments on Unz’s article
12 0.9150852 1429 andrew gelman stats-2012-07-26-Our broken scholarly publishing system
13 0.91107404 412 andrew gelman stats-2010-11-13-Time to apply for the hackNY summer fellows program
14 0.89962578 2089 andrew gelman stats-2013-11-04-Shlemiel the Software Developer and Unknown Unknowns
15 0.89667702 1572 andrew gelman stats-2012-11-10-I don’t like this cartoon
16 0.89626515 41 andrew gelman stats-2010-05-19-Updated R code and data for ARM
17 0.89575446 1580 andrew gelman stats-2012-11-16-Stantastic!
18 0.89526623 631 andrew gelman stats-2011-03-28-Explaining that plot.
20 0.89384365 2161 andrew gelman stats-2014-01-07-My recent debugging experience