1297 andrew gelman stats-2012-05-03-New New York data research organizations


meta infos for this blog

Source: html

Introduction: In a single day, New York City obtained two data analysis/statistics/machine learning organizations: Microsoft Research New York City with John Langford (machine learning), Duncan Watts (networks), and Dave Pennock (algorithmic economics). eBay technology center focusing on data – led by Chris Dixon, the co-founder of the recommendation engine company Hunch, which has recently been acquired by eBay. New York already has Facebook’s engineering unit, Twitter’s East Coast headquarters, and Google’s second-largest engineering office. The data community here is on an upswing, and it might be one of the best places to be if you’re into applied statistics, machine learning or data analysis. Post by Aleks Jakulin. P.S. (from Andrew): The formerly-Yahoo-now-Microsoft researchers have a more-or-less formal connection to Columbia, through the Applied Statistics Center, where some of them will be organizing occasional mini-conferences and workshops!


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 In a single day, New York City obtained two data analysis/statistics/machine learning organizations: Microsoft Research New York City with John Langford (machine learning), Duncan Watts (networks), and Dave Pennock (algorithmic economics). [sent-1, score-0.527]

2 eBay technology center focusing on data – led by Chris Dixon, the co-founder of the recommendation engine company Hunch, which has recently been acquired by eBay. [sent-2, score-1.084]

3 New York already has Facebook’s engineering unit, Twitter’s East Coast headquarters, and Google’s second-largest engineering office. [sent-3, score-0.743]

4 The data community here is on an upswing, and it might be one of the best places to be if you’re into applied statistics, machine learning or data analysis. [sent-4, score-0.954]

5 (from Andrew): The formerly-Yahoo-now-Microsoft researchers have a more-or-less formal connection to Columbia, through the Applied Statistics Center, where some of them will be organizing occasional mini-conferences and workshops! [sent-8, score-0.475]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('learning', 0.239), ('york', 0.234), ('engineering', 0.225), ('machine', 0.201), ('city', 0.2), ('pennock', 0.187), ('center', 0.178), ('langford', 0.177), ('ebay', 0.177), ('headquarters', 0.177), ('algorithmic', 0.177), ('workshops', 0.177), ('acquired', 0.169), ('organizing', 0.158), ('hunch', 0.154), ('jakulin', 0.154), ('watts', 0.14), ('coast', 0.138), ('duncan', 0.136), ('facebook', 0.131), ('east', 0.128), ('twitter', 0.128), ('applied', 0.126), ('engine', 0.125), ('microsoft', 0.122), ('organizations', 0.12), ('dave', 0.12), ('occasional', 0.117), ('obtained', 0.117), ('unit', 0.116), ('aleks', 0.115), ('technology', 0.112), ('networks', 0.109), ('focusing', 0.108), ('formal', 0.105), ('recommendation', 0.103), ('data', 0.102), ('new', 0.101), ('connection', 0.095), ('community', 0.095), ('company', 0.094), ('chris', 0.093), ('led', 0.093), ('places', 0.089), ('columbia', 0.085), ('google', 0.085), ('andrew', 0.081), ('statistics', 0.078), ('economics', 0.075), ('single', 0.069)]
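The page does not say how the similarity values in the list below are computed from word weights like these. A common choice for comparing tfidf vectors is cosine similarity, and the following is a minimal sketch under that assumption. The function name `cosine_similarity` and the weights in `other_post` are hypothetical, made up for illustration; only the `this_post` weights are taken from the list above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    # Only terms present in both vectors contribute to the dot product.
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    # An empty or all-zero vector has no direction; treat it as dissimilar.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# A few of the top words listed above for this post.
this_post = {"learning": 0.239, "york": 0.234, "machine": 0.201, "microsoft": 0.122}

# Hypothetical weights for another post, invented for illustration only.
other_post = {"learning": 0.30, "machine": 0.25, "statistics": 0.20}

print(cosine_similarity(this_post, this_post))
print(cosine_similarity(this_post, other_post))
```

Under this measure a post compared with itself scores 1 up to floating-point rounding, and posts that share no weighted words score 0.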

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 1297 andrew gelman stats-2012-05-03-New New York data research organizations

2 0.1747674 1740 andrew gelman stats-2013-02-26-“Is machine learning a subset of statistics?”

Introduction: Following up on our previous post, Andrew Wilson writes: I agree we are in a really exciting time for statistics and machine learning. There has been a lot of talk lately comparing machine learning with statistics. I am curious whether you think there are many fundamental differences between the fields, or just superficial differences — different popular approximate inference methods, slightly different popular application areas, etc. Is machine learning a subset of statistics? In the paper we discuss how we think machine learning is fundamentally about pattern discovery, and ultimately, fully automating the learning and decision making process. In other words, whatever a human does when he or she uses tools to analyze data, can be written down algorithmically and automated on a computer. I am not sure if the ambitions are similar in statistics — and I don’t have any conventional statistics background, which makes it harder to tell. I think it’s an interesting discussion.

3 0.16009504 1481 andrew gelman stats-2012-09-04-Cool one-day miniconference at Columbia Fri 12 Oct on computational and online social science

Introduction: One thing we do here at the Applied Statistics Center is hold mini-conferences. The next one looks really cool. It’s organized by Sharad Goel and Jake Hofman (Microsoft Research, formerly at Yahoo Research), David Park (Columbia University), and Sergei Vassilvitskii (Google). As with our other conferences, one of our goals is to mix the academic and nonacademic research communities. Here’s the website for the workshop, and here’s the announcement from the organizers: With an explosion of data on every aspect of our everyday existence — from what we buy, to where we travel, to who we know — we are able to observe human behavior with granularity largely thought impossible just a decade ago. The growth of such online activity has further facilitated the design of web-based experiments, enhancing both the scale and efficiency of traditional methods. Together these advances have created an unprecedented opportunity to address longstanding questions in the social sciences, rang

4 0.14875716 719 andrew gelman stats-2011-05-19-Everything is Obvious (once you know the answer)

Introduction: Duncan Watts gave his new book the above title, reflecting his irritation with those annoying people who, upon hearing of the latest social science research, reply with: Duh-I-knew-that. (I don’t know how to say Duh in Australian; maybe someone can translate that for me?) I, like Duncan, am easily irritated, and I looked forward to reading the book. I enjoyed it a lot, even though it has only one graph, and that graph has a problem with its y-axis. (OK, the book also has two diagrams and a graph of fake data, but that doesn’t count.) Before going on, let me say that I agree wholeheartedly with Duncan’s central point: social science research findings are often surprising, but the best results cause us to rethink our world in such a way that they seem completely obvious, in retrospect. (Don Rubin used to tell us that there’s no such thing as a “paradox”: once you fully understand a phenomenon, it should not seem paradoxical any more. When learning science, we sometimes speak

5 0.14802904 1649 andrew gelman stats-2013-01-02-Back when 50 miles was a long way

Introduction: This post is by Phil. Michael Graham Richard has posted some great maps from the 1932 Atlas of the Historical Geography of the United States; the maps show how long it took to get to various places in the U.S. from New York City in 1800, 1830, 1857, and 1930. (I wonder if the atlas has one from around 1900 as well, that didn’t make it into the article? I’d like to see it, too, if it exists.) Worth a look. This post is by Phil. Time to get to anywhere in the conterminous states from New York City, in 1857

6 0.14293629 1131 andrew gelman stats-2012-01-20-Stan: A (Bayesian) Directed Graphical Model Compiler

7 0.14189893 1630 andrew gelman stats-2012-12-18-Postdoc positions at Microsoft Research – NYC

8 0.1241212 2118 andrew gelman stats-2013-11-30-???

9 0.12243156 399 andrew gelman stats-2010-11-07-Challenges of experimental design; also another rant on the practice of mentioning the publication of an article but not naming its author

10 0.12010129 747 andrew gelman stats-2011-06-06-Research Directions for Machine Learning and Algorithms

11 0.11432546 537 andrew gelman stats-2011-01-25-Postdoc Position #1: Missing-Data Imputation, Diagnostics, and Applications

12 0.10931809 1217 andrew gelman stats-2012-03-17-NSF program “to support analytic and methodological research in support of its surveys”

13 0.1080557 1902 andrew gelman stats-2013-06-17-Job opening at new “big data” consulting firm!

14 0.10713884 1992 andrew gelman stats-2013-08-21-Workshop for Women in Machine Learning

15 0.10420468 1482 andrew gelman stats-2012-09-04-Model checking and model understanding in machine learning

16 0.10099384 2366 andrew gelman stats-2014-06-09-On deck this week

17 0.10033891 1126 andrew gelman stats-2012-01-18-Bob on Stan

18 0.097079746 903 andrew gelman stats-2011-09-13-Duke postdoctoral fellowships in nonparametric Bayes & high-dimensional data

19 0.09664394 289 andrew gelman stats-2010-09-21-“How segregated is your city?”: A story of why every graph, no matter how clear it seems to be, needs a caption to anchor the reader in some numbers

20 0.095663883 538 andrew gelman stats-2011-01-25-Postdoc Position #2: Hierarchical Modeling and Statistical Graphics


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.105), (1, -0.028), (2, -0.075), (3, 0.02), (4, 0.024), (5, 0.103), (6, -0.098), (7, 0.005), (8, -0.087), (9, 0.058), (10, -0.052), (11, -0.007), (12, 0.056), (13, -0.001), (14, -0.084), (15, 0.065), (16, 0.019), (17, -0.019), (18, -0.048), (19, -0.113), (20, -0.011), (21, -0.03), (22, 0.038), (23, 0.041), (24, -0.023), (25, -0.009), (26, -0.039), (27, -0.006), (28, 0.034), (29, -0.007), (30, 0.017), (31, -0.055), (32, -0.031), (33, 0.008), (34, 0.044), (35, 0.008), (36, -0.001), (37, -0.024), (38, -0.031), (39, -0.03), (40, -0.06), (41, -0.022), (42, -0.012), (43, 0.124), (44, -0.002), (45, 0.078), (46, 0.08), (47, 0.028), (48, 0.044), (49, -0.041)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94566357 1297 andrew gelman stats-2012-05-03-New New York data research organizations

2 0.72839779 1777 andrew gelman stats-2013-03-26-Data Science for Social Good summer fellowship program

Introduction: Juan-Pablo Velez writes: I’m helping with a Data Science for Social Good summer fellowship program at the University of Chicago. The goal is to train data scientists that can tackle problems in education, healthcare, energy, transportation, and more. Working with full-time mentors from academia, industry, and the Obama campaign, fellows will build high-impact analytics projects using statistics, machine learning, data mining, and big data technologies. For fellows, we’re looking for grad students, advanced undergrads, and professionals in computer science, machine learning, statistics, and the computational and quantitative sciences. For mentors, we’re looking for folks with practical data science experience. Fellows and mentors will be paid competitively and housed in Chicago for the duration of the program, from early June to late August. Rayid Ghani, former Chief Scientist of the Obama 2012 campaign, is leading the program. Eric Sch

3 0.70653445 1902 andrew gelman stats-2013-06-17-Job opening at new “big data” consulting firm!

Introduction: David Shor sends along a job announcement for Civis Analytics, which he describes as “basically Obama’s Analytics team reconstituted as a company”: Data Scientist Position Overview Data Scientists are responsible for providing the fundamental data science that powers our work – including predictive analytics, data mining, experimental design and ad-hoc statistical analysis. As a Data Scientist, you will join our Chicago-based data science team, working closely and collaboratively with analysts and engineers to identify, quantify and solve big, meaningful problems. Data Scientists will have the opportunity to dive deeply into big problems and work in a variety of areas. Civis Analytics has opportunities for applicants who are seasoned professionals, brilliant newcomers, and anywhere in between. Qualifications · Master’s degree in statistics, machine learning, computer science with heavy quant focus, a related subject, or a Bachelor’s degree and significant work ex

4 0.69384819 1630 andrew gelman stats-2012-12-18-Postdoc positions at Microsoft Research – NYC

Introduction: Sharad Goel sends this in: Microsoft Research NYC [ http://research.microsoft.com/newyork/ ] seeks outstanding applicants for 2-year postdoctoral researcher positions. We welcome applicants with a strong academic record in one of the following areas: * Computational social science: http://research.microsoft.com/cssnyc * Online experimental social science: http://research.microsoft.com/oess_nyc * Algorithmic economics and market design: http://research.microsoft.com/algorithmic-economics/ * Machine learning: http://research.microsoft.com/mlnyc/ We will also consider applicants in other focus areas of the lab, including information retrieval, and behavioral & empirical economics. Additional information about these areas is included below. Please submit all application materials by January 11, 2013. ———- COMPUTATIONAL SOCIAL SCIENCE http://research.microsoft.com/cssnyc With an increasing amount of data on every aspect of our daily activities — from what we buy, to wh

5 0.67099559 903 andrew gelman stats-2011-09-13-Duke postdoctoral fellowships in nonparametric Bayes & high-dimensional data

Introduction: I hate to announce this one because it’s directly competing with us, but it actually looks pretty good! If I were getting my Ph.D. right now, I’d definitely apply . . . David Dunson announces: There will be several postdoctoral fellowships available at Duke to work with me [Dunson] & others on research related to foundations of nonparametric Bayes in high-dimensional settings, with a particular focus on showing theoretical properties and developing new models and computational approaches in machine learning applications & genomics. Send applications to Ellen Currin, Department of Electrical and Computer Engineering, ecurrin@ee.duke.edu

6 0.66737282 1992 andrew gelman stats-2013-08-21-Workshop for Women in Machine Learning

7 0.65510541 1481 andrew gelman stats-2012-09-04-Cool one-day miniconference at Columbia Fri 12 Oct on computational and online social science

8 0.64179337 1279 andrew gelman stats-2012-04-24-ESPN is looking to hire a research analyst

9 0.59778166 1651 andrew gelman stats-2013-01-03-Faculty Position in Visualization, Visual Analytics, Imaging, and Human Centered Computing

10 0.59361291 1990 andrew gelman stats-2013-08-20-Job opening at an organization that promotes reproducible research!

11 0.59180665 537 andrew gelman stats-2011-01-25-Postdoc Position #1: Missing-Data Imputation, Diagnostics, and Applications

12 0.5901016 999 andrew gelman stats-2011-11-09-I was at a meeting a couple months ago . . .

13 0.58997905 1217 andrew gelman stats-2012-03-17-NSF program “to support analytic and methodological research in support of its surveys”

14 0.58589566 2047 andrew gelman stats-2013-10-02-Bayes alert! Cool postdoc position here on missing data imputation and applications in health disparities research!

15 0.58346677 1904 andrew gelman stats-2013-06-18-Job opening! Come work with us!

16 0.58258301 1909 andrew gelman stats-2013-06-21-Job openings at conservative political analytics firm!

17 0.57238179 1821 andrew gelman stats-2013-04-24-My talk midtown this Friday noon (and at Columbia Monday afternoon)

18 0.57166612 412 andrew gelman stats-2010-11-13-Time to apply for the hackNY summer fellows program

19 0.57113296 1923 andrew gelman stats-2013-07-03-Bayes pays!

20 0.56071603 2345 andrew gelman stats-2014-05-24-An interesting mosaic of a data programming course


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.017), (15, 0.013), (16, 0.098), (21, 0.025), (24, 0.072), (31, 0.059), (41, 0.16), (48, 0.015), (59, 0.015), (69, 0.017), (77, 0.096), (79, 0.042), (86, 0.014), (93, 0.016), (97, 0.032), (99, 0.208)]
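The lda weights above are given as sparse (topicId, topicWeight) pairs, so topics that are not listed carry no recorded weight. The sketch below shows one way to expand such pairs into a dense vector and pick out the dominant topics. The helper names `to_dense` and `top_topics` are hypothetical, and the total of 100 topics is an assumption based only on the largest topicId shown (99); the page does not state the model size.

```python
def to_dense(pairs, num_topics):
    """Expand sparse (topic_id, weight) pairs into a dense list of weights."""
    dense = [0.0] * num_topics
    for topic_id, weight in pairs:
        dense[topic_id] = weight
    return dense


def top_topics(pairs, k):
    """Return the ids of the k topics with the largest weights."""
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return [topic_id for topic_id, _ in ranked[:k]]


# The lda pairs listed above for this post.
lda_pairs = [(2, 0.017), (15, 0.013), (16, 0.098), (21, 0.025), (24, 0.072),
             (31, 0.059), (41, 0.16), (48, 0.015), (59, 0.015), (69, 0.017),
             (77, 0.096), (79, 0.042), (86, 0.014), (93, 0.016), (97, 0.032),
             (99, 0.208)]

NUM_TOPICS = 100  # assumption: inferred from the largest topicId, not stated on the page

dense = to_dense(lda_pairs, NUM_TOPICS)
print(top_topics(lda_pairs, 3))  # [99, 41, 16]
```

A dense vector of this kind can then be compared against another post's vector with whatever similarity measure the model uses.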

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.90957248 1297 andrew gelman stats-2012-05-03-New New York data research organizations

2 0.87435722 1626 andrew gelman stats-2012-12-16-The lamest, grudgingest, non-retraction retraction ever

Introduction: In politics we’re familiar with the non-apology apology (well described in Wikipedia as “a statement that has the form of an apology but does not express the expected contrition”). Here’s the scientific equivalent: the non-retraction retraction. Sanjay Srivastava points to an amusing yet barfable story of a pair of researchers who (inadvertently, I assume) made a data coding error and were eventually moved to issue a correction notice, but even then refused to fully admit their error. As Srivastava puts it, the story “ended up with Lew [Goldberg] and colleagues [Kibeom Lee and Michael Ashton] publishing a comment on an erratum – the only time I’ve ever heard of that happening in a scientific journal.” From the comment on the erratum: In their “erratum and addendum,” Anderson and Ones (this issue) explained that we had brought their attention to the “potential” of a “possible” misalignment and described the results computed from re-aligned data as being based on a “post-ho

3 0.86415648 303 andrew gelman stats-2010-09-28-“Genomics” vs. genetics

Introduction: John Cook and Joseph Delaney point to an article by Yurii Aulchenko et al., who write: 54 loci showing strong statistical evidence for association to human height were described, providing us with potential genomic means of human height prediction. In a population-based study of 5748 people, we find that a 54-loci genomic profile explained 4-6% of the sex- and age-adjusted height variance, and had limited ability to discriminate tall/short people. . . . In a family-based study of 550 people, with both parents having height measurements, we find that the Galtonian mid-parental prediction method explained 40% of the sex- and age-adjusted height variance, and showed high discriminative accuracy. . . . The message is that the simple approach of predicting child’s height using a regression model given parents’ average height performs much better than the method they have based on combining 54 genes. They also find that, if you start with the prediction based on parents’ heigh

4 0.82951593 1214 andrew gelman stats-2012-03-15-Of forecasts and graph theory and characterizing a statistical method by the information it uses

Introduction: Wayne Folta points me to “EigenBracket 2012: Using Graph Theory to Predict NCAA March Madness Basketball” and writes, “I [Folta] have got to believe that he’s simply re-invented a statistical method in a graph-ish context, but don’t know enough to judge.” I have not looked in detail at the method being presented here—I’m not much of a college basketball fan—but I’d like to use this as an excuse to make one of my favorite general points, which is that a good way to characterize any statistical method is by what information it uses. The basketball ranking method here uses score differentials between teams in the past season. On the plus side, that is better than simply using one-loss records (which (a) discards score differentials and (b) discards information on who played whom). On the minus side, the method appears to be discretizing the scores (thus throwing away information on the exact score differential) and doesn’t use any external information such as external ratings. A

5 0.82850051 1300 andrew gelman stats-2012-05-05-Recently in the sister blog

Introduction: Culture war: The rules You can only accept capital punishment if you’re willing to have innocent people executed every now and then The politics of America’s increasing economic inequality

6 0.82719636 685 andrew gelman stats-2011-04-29-Data mining and allergies

7 0.82704461 1013 andrew gelman stats-2011-11-16-My talk at Math for America on Saturday

8 0.81897736 1895 andrew gelman stats-2013-06-12-Peter Thiel is writing another book!

9 0.81654549 516 andrew gelman stats-2011-01-14-A new idea for a science core course based entirely on computer simulation

10 0.81509465 1669 andrew gelman stats-2013-01-12-The power of the puzzlegraph

11 0.80953783 454 andrew gelman stats-2010-12-07-Diabetes stops at the state line?

12 0.80717123 2204 andrew gelman stats-2014-02-09-Keli Liu and Xiao-Li Meng on Simpson’s paradox

13 0.80534637 2185 andrew gelman stats-2014-01-25-Xihong Lin on sparsity and density

14 0.80460471 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly

15 0.80206263 1784 andrew gelman stats-2013-04-01-Wolfram on Mandelbrot

16 0.80091524 978 andrew gelman stats-2011-10-28-Cool job opening with brilliant researchers at Yahoo

17 0.79546541 2139 andrew gelman stats-2013-12-19-Happy birthday

18 0.79268229 1373 andrew gelman stats-2012-06-09-Cognitive psychology research helps us understand confusion of Jonathan Haidt and others about working-class voters

19 0.79130036 1816 andrew gelman stats-2013-04-21-Exponential increase in the number of stat majors

20 0.78725928 447 andrew gelman stats-2010-12-03-Reinventing the wheel, only more so.