hunch_net hunch_net-2012 hunch_net-2012-475 knowledge-graph by maker-knowledge-mining

475 hunch net-2012-10-26-ML Symposium and Strata-Hadoop World


meta info for this blog

Source: html

Introduction: The New York ML symposium was last Friday. There were 303 registrations, up a bit from last year. I particularly enjoyed talks by Bill Freeman on vision and ML, Jon Lenchner on strategy in Jeopardy, and Tara N. Sainath and Brian Kingsbury on deep learning for speech recognition. If anyone has suggestions or thoughts for next year, please speak up. I also attended Strata + Hadoop World for the first time. This is primarily a trade conference rather than an academic conference, but I found it pretty interesting as a first time attendee. This is ground zero for the Big data buzzword, and I see now why. It’s about data, and the word “big” is so ambiguous that everyone can lay claim to it. There were essentially zero academic talks. Instead, the focus was on war stories, product announcements, and education. The general level of education is much lower—explaining Machine Learning to the SQL educated is the primary operating point. Nevertheless that’s happening, and the fact that machine learning is considered a necessary technology for industry is a giant step for the field.


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 There were 303 registrations, up a bit from last year. [sent-2, score-0.089]

2 I particularly enjoyed talks by Bill Freeman on vision and ML, Jon Lenchner on strategy in Jeopardy, and Tara N. [sent-3, score-0.361]

3 This is primarily a trade conference rather than an academic conference, but I found it pretty interesting as a first time attendee. [sent-7, score-0.546]

4 This is ground zero for the Big data buzzword, and I see now why. [sent-8, score-0.385]

5 It’s about data, and the word “big” is so ambiguous that everyone can lay claim to it. [sent-9, score-0.24]

6 Instead, the focus was on war stories, product announcements, and education. [sent-11, score-0.224]

7 The general level of education is much lower—explaining Machine Learning to the SQL educated is the primary operating point. [sent-12, score-0.295]

8 Nevertheless that’s happening, and the fact that machine learning is considered a necessary technology for industry is a giant step for the field. [sent-13, score-0.234]

9 Over time, I expect the industrial side of Machine Learning to grow, and perhaps surpass the academic side, in the same sense as has already occurred for chip design. [sent-14, score-0.695]

10 Amongst the talks I could catch, I particularly liked the Github, Zillow, and Pandas talks. [sent-15, score-0.363]

11 Ted Dunning also gave a particularly masterful talk, although I have doubts about the core Bayesian Bandit approach(*). [sent-16, score-0.252]

12 The streaming k-means algorithm they implemented does look quite handy. [sent-17, score-0.123]

13 (*) The doubt is the following: prior elicitation is generally hard, and Bayesian techniques are not robust to misspecification. [sent-18, score-0.228]

14 This matters in standard supervised settings, but it may matter more in exploration settings where misspecification can imply data starvation. [sent-19, score-0.498]
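The data-starvation concern in sentence 14 can be made concrete with a small simulation. The sketch below is a hypothetical illustration rather than the implementation discussed in the talk: Thompson sampling over Beta-Bernoulli arms, where a badly miscalibrated prior on the best arm means that arm is rarely pulled, so the data needed to correct the prior never arrives. The arm means, the Beta(1, 100) prior, and the 2000-round horizon are made-up parameters for illustration only.

```python
# Hypothetical illustration of the footnote's concern: a Bayesian bandit
# (Thompson sampling) with a badly misspecified prior on the best arm.
# Because that arm looks bad a priori it is rarely pulled, so little data
# arrives to correct the prior -- the "data starvation" effect above.
import numpy as np

rng = np.random.default_rng(0)

true_means = np.array([0.45, 0.55])   # arm 1 is actually the better arm
alpha = np.array([1.0, 1.0])          # Beta prior pseudo-successes
beta = np.array([1.0, 100.0])         # arm 1 gets a wildly pessimistic prior

pulls = np.zeros(2, dtype=int)
for _ in range(2000):
    theta = rng.beta(alpha, beta)     # sample a plausible mean for each arm
    arm = int(np.argmax(theta))       # play the arm that currently looks best
    reward = rng.random() < true_means[arm]
    alpha[arm] += reward              # posterior update for the pulled arm
    beta[arm] += 1 - reward
    pulls[arm] += 1

print("pull counts:", pulls)          # the optimal arm is starved of pulls
```

With a flat prior on both arms the same sampler tends to concentrate on arm 1; the only change here is the prior, which is the point of the footnote.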


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('academic', 0.223), ('zero', 0.172), ('settings', 0.163), ('side', 0.142), ('talks', 0.142), ('sql', 0.133), ('registrations', 0.133), ('misspecification', 0.133), ('jeopardy', 0.133), ('kingsbury', 0.133), ('particularly', 0.129), ('ambiguous', 0.123), ('doubts', 0.123), ('chip', 0.123), ('freeman', 0.123), ('announcements', 0.123), ('giant', 0.123), ('bill', 0.123), ('streaming', 0.123), ('ml', 0.12), ('lay', 0.117), ('war', 0.117), ('elicitation', 0.117), ('brian', 0.117), ('trade', 0.117), ('primarily', 0.117), ('github', 0.117), ('ground', 0.111), ('doubt', 0.111), ('educated', 0.111), ('industry', 0.111), ('bayesian', 0.111), ('occurred', 0.107), ('stories', 0.107), ('catch', 0.107), ('product', 0.107), ('data', 0.102), ('grow', 0.1), ('industrial', 0.1), ('matters', 0.1), ('hadoop', 0.1), ('big', 0.098), ('explaining', 0.094), ('education', 0.094), ('liked', 0.092), ('operating', 0.09), ('happening', 0.09), ('strategy', 0.09), ('conference', 0.089), ('last', 0.089)]
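For readers curious how the word weights and scores above are produced, here is a minimal sketch of a tfidf pipeline, assuming scikit-learn and a toy list of posts; the knowledge-graph tool's exact tokenization, weighting, and normalization may differ.

```python
# Minimal sketch of a tfidf pipeline like the one behind the word weights
# and "similar blogs" scores above. Assumes scikit-learn; the actual tool's
# preprocessing, weighting, and normalization may differ.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

posts = [
    "The New York ML symposium was last Friday. There were 303 registrations.",
    "The New York ML symposium was last Friday. Attendance was 268.",
    "The From Data to Knowledge workshop at Berkeley covers streaming data.",
]

vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(posts)        # one row per post

# Top-weighted words for the first post (analogous to the list above).
weights = doc_vectors[0].toarray().ravel()
vocab = vectorizer.get_feature_names_out()
print(sorted(zip(vocab, weights), key=lambda t: -t[1])[:10])

# Post-to-post similarity (analogous to the simValue column below).
print(cosine_similarity(doc_vectors[0], doc_vectors))
```

The per-sentence scores above presumably come from applying the same kind of weighting at the sentence level, so sentences that concentrate the post's high-tfidf words rank highest.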

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999982 475 hunch net-2012-10-26-ML Symposium and Strata-Hadoop World

2 0.12889715 494 hunch net-2014-03-11-The New York ML Symposium, take 2

Introduction: The 2013/2014 New York Machine Learning Symposium is finally happening on March 28th at the New York Academy of Science. Every invited speaker interests me personally. They are: Rayid Ghani (Chief Scientist at Obama 2012) Brian Kingsbury (Speech Recognition @ IBM) Jorge Nocedal (who did LBFGS) We’ve been somewhat disorganized in advertising this. As a consequence, anyone who has not submitted an abstract but would like to do so may send one directly to me (jl@hunch.net title NYASMLS) by Friday March 14. I will forward them to the rest of the committee for consideration.

3 0.12064226 455 hunch net-2012-02-20-Berkeley Streaming Data Workshop

Introduction: The From Data to Knowledge workshop May 7-11 at Berkeley should be of interest to the many people encountering streaming data in different disciplines. It’s run by a group of astronomers who encounter streaming data all the time. I met Josh Bloom recently and he is broadly interested in a workshop covering all aspects of Machine Learning on streaming data. The hope here is that techniques developed in one area turn out useful in another which seems quite plausible. Particularly if you are in the bay area, consider checking it out.

4 0.11894104 448 hunch net-2011-10-24-2011 ML symposium and the bears

Introduction: The New York ML symposium was last Friday. Attendance was 268, significantly larger than last year. My impression was that the event mostly still fit the space, although it was crowded. If anyone has suggestions for next year, speak up. The best student paper award went to Sergiu Goschin for a cool video of how his system learned to play video games (I can’t find the paper online yet). Choosing amongst the submitted talks was pretty difficult this year, as there were many similarly good ones. By coincidence all the invited talks were (at least potentially) about faster learning algorithms. Stephen Boyd talked about ADMM. Leon Bottou spoke on single pass online learning via averaged SGD. Yoav Freund talked about parameter-free hedging. In Yoav’s case the talk was mostly about a better theoretical learning algorithm, but it has the potential to unlock an exponential computational complexity improvement via oraclization of experts algorithms… but some serious

5 0.10929622 335 hunch net-2009-01-08-Predictive Analytics World

Introduction: Carla Vicens and Eric Siegel contacted me about Predictive Analytics World in San Francisco February 18&19, which I wasn’t familiar with. A quick look at the agenda reveals several people I know working on applications of machine learning in businesses, covering deployed applications topics. It’s interesting to see a business-focused machine learning conference, as it says that we are succeeding as a field. If you are interested in deployed applications, you might attend. Eric and I did a quick interview by email. John > I’ve mostly published and participated in academic machine learning conferences like ICML, COLT, and NIPS. When I look at the set of speakers and subjects for your conference I think “machine learning for business”. Is that your understanding of things? What I’m trying to ask is: what do you view as the primary goal for this conference? Eric > You got it. This is the business event focused on the commercial deployment of technology developed at

6 0.10219691 474 hunch net-2012-10-18-7th Annual Machine Learning Symposium

7 0.10085516 452 hunch net-2012-01-04-Why ICML? and the summer conferences

8 0.095814563 477 hunch net-2013-01-01-Deep Learning 2012

9 0.095216177 369 hunch net-2009-08-27-New York Area Machine Learning Events

10 0.094091758 60 hunch net-2005-04-23-Advantages and Disadvantages of Bayesian Learning

11 0.09235958 377 hunch net-2009-11-09-NYAS ML Symposium this year.

12 0.0918134 424 hunch net-2011-02-17-What does Watson mean?

13 0.091763787 165 hunch net-2006-03-23-The Approximation Argument

14 0.090870261 110 hunch net-2005-09-10-“Failure” is an option

15 0.089807227 462 hunch net-2012-04-20-Both new: STOC workshops and NEML

16 0.089109793 48 hunch net-2005-03-29-Academic Mechanism Design

17 0.088607766 406 hunch net-2010-08-22-KDD 2010

18 0.08761102 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

19 0.086038232 316 hunch net-2008-09-04-Fall ML Conferences

20 0.084524162 203 hunch net-2006-08-18-Report of MLSS 2006 Taipei


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.204), (1, -0.039), (2, -0.137), (3, -0.01), (4, 0.025), (5, 0.005), (6, -0.044), (7, 0.005), (8, -0.017), (9, -0.16), (10, 0.058), (11, -0.03), (12, 0.049), (13, -0.003), (14, -0.02), (15, -0.006), (16, -0.057), (17, 0.005), (18, 0.087), (19, -0.02), (20, 0.014), (21, -0.073), (22, 0.061), (23, 0.041), (24, -0.061), (25, -0.061), (26, -0.055), (27, 0.061), (28, 0.054), (29, -0.005), (30, -0.022), (31, 0.016), (32, -0.034), (33, 0.085), (34, 0.062), (35, -0.123), (36, 0.02), (37, 0.003), (38, -0.022), (39, -0.02), (40, 0.022), (41, 0.022), (42, 0.031), (43, 0.032), (44, -0.025), (45, 0.029), (46, -0.005), (47, 0.043), (48, -0.023), (49, -0.042)]
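The lsi similarities are computed in a reduced topic space rather than directly on tfidf vectors: each post becomes a short vector of (topicId, topicWeight) entries like the one above, and posts are compared there. A minimal sketch, assuming scikit-learn's TruncatedSVD as the LSI step and a toy list of posts; the tool's own implementation and topic count may differ.

```python
# Minimal sketch of LSI-style similarity: project tfidf vectors into a
# low-dimensional topic space with truncated SVD, then compare posts by
# cosine similarity there. Assumes scikit-learn; the knowledge-graph
# tool's own LSI implementation and number of topics may differ.
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

posts = [
    "ML symposium registrations talks deep learning speech recognition",
    "NYAS ML symposium talks posters and poster spotlights",
    "KDD Cup prediction and recommendation contest for music",
]

tfidf = TfidfVectorizer().fit_transform(posts)
lsi = TruncatedSVD(n_components=2, random_state=0)
topic_vectors = lsi.fit_transform(tfidf)   # one row of topic weights per post

print(topic_vectors[0])                                     # topic weights
print(cosine_similarity(topic_vectors[:1], topic_vectors))  # simValue-style scores
```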

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96774369 475 hunch net-2012-10-26-ML Symposium and Strata-Hadoop World

2 0.70700455 415 hunch net-2010-10-28-NY ML Symposium 2010

Introduction: About 200 people attended the 2010 NYAS ML Symposium this year. (It was about 170 last year.) I particularly enjoyed several talks. Yann has a new live demo of (limited) real-time object recognition learning. Sanjoy gave a fairly convincing and comprehensible explanation of why a modified form of single-linkage clustering is consistent in higher dimensions, and why consistency is a critical feature for clustering algorithms. I’m curious how well this algorithm works in practice. Matt Hoffman’s poster covering online LDA seemed pretty convincing to me as an algorithmic improvement. This year, we allocated more time towards posters & poster spotlights. For next year, we are considering some further changes. The format has traditionally been 4 invited Professor speakers, with posters and poster spotlight for students. Demand from other parties to participate is growing, for example from postdocs and startups in the area. Another growing concern is the fa

3 0.69915438 369 hunch net-2009-08-27-New York Area Machine Learning Events

Introduction: Several events are happening in the NY area. Barriers in Computational Learning Theory Workshop, Aug 28. That’s tomorrow near Princeton. I’m looking forward to speaking at this one on “Getting around Barriers in Learning Theory”, but several other talks are of interest, particularly to the CS theory inclined. Claudia Perlich is running the INFORMS Data Mining Contest with a deadline of Sept. 25. This is a contest using real health record data (they partnered with HealthCare Intelligence) to predict transfers and mortality. In the current US health care reform debate, the case studies of high costs we hear strongly suggest machine learning & statistics can save many billions. The Singularity Summit October 3&4. This is for the AIists out there. Several of the talks look interesting, although unfortunately I’ll miss it for ALT. Predictive Analytics World, Oct 20-21. This is stretching the definition of “New York Area” a bit, but the train to DC is reasonable.

4 0.66612691 494 hunch net-2014-03-11-The New York ML Symposium, take 2

Introduction: The 2013/2014 New York Machine Learning Symposium is finally happening on March 28th at the New York Academy of Science. Every invited speaker interests me personally. They are: Rayid Ghani (Chief Scientist at Obama 2012) Brian Kingsbury (Speech Recognition @ IBM) Jorge Nocedal (who did LBFGS) We’ve been somewhat disorganized in advertising this. As a consequence, anyone who has not submitted an abstract but would like to do so may send one directly to me (jl@hunch.net title NYASMLS) by Friday March 14. I will forward them to the rest of the committee for consideration.

5 0.65276575 125 hunch net-2005-10-20-Machine Learning in the News

Introduction: The New York Times had a short interview about machine learning in datamining being used pervasively by the IRS and large corporations to predict who to audit and who to target for various marketing campaigns. This is a big application area of machine learning. It can be harmful (learning + databases = another way to invade privacy) or beneficial (as google demonstrates, better targeting of marketing campaigns is far less annoying). This is yet more evidence that we can not rely upon “I’m just another fish in the school” logic for our expectations about treatment by government and large corporations.

6 0.64905494 474 hunch net-2012-10-18-7th Annual Machine Learning Symposium

7 0.63470942 410 hunch net-2010-09-17-New York Area Machine Learning Events

8 0.62734765 460 hunch net-2012-03-24-David Waltz

9 0.59784013 462 hunch net-2012-04-20-Both new: STOC workshops and NEML

10 0.59028584 448 hunch net-2011-10-24-2011 ML symposium and the bears

11 0.58262432 322 hunch net-2008-10-20-New York’s ML Day

12 0.58210021 489 hunch net-2013-09-20-No NY ML Symposium in 2013, and some good news

13 0.58039922 406 hunch net-2010-08-22-KDD 2010

14 0.57949704 250 hunch net-2007-06-23-Machine Learning Jobs are Growing on Trees

15 0.57701916 335 hunch net-2009-01-08-Predictive Analytics World

16 0.55016404 260 hunch net-2007-08-25-The Privacy Problem

17 0.54560232 455 hunch net-2012-02-20-Berkeley Streaming Data Workshop

18 0.54053986 316 hunch net-2008-09-04-Fall ML Conferences

19 0.53725684 444 hunch net-2011-09-07-KDD and MUCMD 2011

20 0.53085881 405 hunch net-2010-08-21-Rob Schapire at NYC ML Meetup


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(3, 0.017), (10, 0.032), (27, 0.202), (38, 0.06), (39, 0.337), (53, 0.066), (55, 0.124), (78, 0.011), (92, 0.017), (94, 0.035), (95, 0.019)]
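The lda similarities are analogous to the lsi ones, except that the topic weights above are a probability distribution over topics learned from word counts. A minimal sketch, assuming scikit-learn's LatentDirichletAllocation and a toy list of posts; the tool's own model and topic count may differ.

```python
# Minimal sketch of the LDA representation: fit a topic model on word counts
# and represent each post by its topic distribution, i.e. the
# (topicId, topicWeight) pairs above. Assumes scikit-learn; the
# knowledge-graph tool's LDA model and number of topics may differ.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

posts = [
    "ML symposium registrations talks deep learning speech recognition",
    "KDD Cup prediction and recommendation contest for music",
    "NIPS is the big winter learning conference with workshops",
]

counts = CountVectorizer(stop_words="english").fit_transform(posts)
lda = LatentDirichletAllocation(n_components=3, random_state=0)
doc_topics = lda.fit_transform(counts)   # each row is a distribution over topics

print(doc_topics[0])   # topic weights for the first post; each row sums to 1
```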

similar blogs list:

simIndex simValue blogId blogTitle

1 0.92347151 427 hunch net-2011-03-20-KDD Cup 2011

Introduction: Yehuda points out KDD-Cup 2011 which Markus and Gideon helped set up. This is a prediction and recommendation contest for music. In addition to being a fun chance to show your expertise, there are cash prizes of $5K/$2K/$1K.

2 0.82360882 71 hunch net-2005-05-14-NIPS

Introduction: NIPS is the big winter conference of learning. Paper due date: June 3rd. (Tweaked thanks to Fei Sha.) Location: Vancouver (main program) Dec. 5-8 and Whistler (workshops) Dec 9-10, BC, Canada. NIPS is larger than all of the other learning conferences, partly because it’s the only one at that time of year. I recommend the workshops which are often quite interesting and energetic.

same-blog 3 0.8165229 475 hunch net-2012-10-26-ML Symposium and Strata-Hadoop World

4 0.78686839 301 hunch net-2008-05-23-Three levels of addressing the Netflix Prize

Introduction: In October 2006, the online movie renter, Netflix, announced the Netflix Prize contest. They published a comprehensive dataset including more than 100 million movie ratings, which were performed by about 480,000 real customers on 17,770 movies. Competitors in the challenge are required to estimate a few million ratings. To win the “grand prize,” they need to deliver a 10% improvement in the prediction error compared with the results of Cinematch, Netflix’s proprietary recommender system. Best current results deliver 9.12% improvement, which is quite close to the 10% goal, yet painfully distant. The Netflix Prize breathed new life and excitement into recommender systems research. The competition allowed the wide research community to access a large scale, real life dataset. Beyond this, the competition changed the rules of the game. Claiming that your nice idea could outperform some mediocre algorithms on some toy dataset is no longer acceptable. Now researcher

5 0.59950709 437 hunch net-2011-07-10-ICML 2011 and the future

Introduction: Unfortunately, I ended up sick for much of this ICML. I did manage to catch one interesting paper: Richard Socher, Cliff Lin, Andrew Y. Ng, and Christopher D. Manning, Parsing Natural Scenes and Natural Language with Recursive Neural Networks. I invited Richard to share his list of interesting papers, so hopefully we’ll hear from him soon. In the meantime, Paul and Hal have posted some lists. the future Joelle and I are program chairs for ICML 2012 in Edinburgh, which I previously enjoyed visiting in 2005. This is a huge responsibility, that we hope to accomplish well. A part of this (perhaps the most fun part), is imagining how we can make ICML better. A key and critical constraint is choosing things that can be accomplished. So far we have: Colocation. The first thing we looked into was potential colocations. We quickly discovered that many other conferences precommitted their location. For the future, getting a colocation with ACL or SIGI

6 0.58998287 454 hunch net-2012-01-30-ICML Posters and Scope

7 0.5891667 343 hunch net-2009-02-18-Decision by Vetocracy

8 0.58739918 225 hunch net-2007-01-02-Retrospective

9 0.58417386 371 hunch net-2009-09-21-Netflix finishes (and starts)

10 0.58356816 484 hunch net-2013-06-16-Representative Reviewing

11 0.5830664 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?

12 0.58261329 194 hunch net-2006-07-11-New Models

13 0.58160985 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

14 0.58045125 44 hunch net-2005-03-21-Research Styles in Machine Learning

15 0.57948804 403 hunch net-2010-07-18-ICML & COLT 2010

16 0.57910335 297 hunch net-2008-04-22-Taking the next step

17 0.5789519 466 hunch net-2012-06-05-ICML acceptance statistics

18 0.57870638 51 hunch net-2005-04-01-The Producer-Consumer Model of Research

19 0.57753086 452 hunch net-2012-01-04-Why ICML? and the summer conferences

20 0.57699472 89 hunch net-2005-07-04-The Health of COLT