hunch_net hunch_net-2008 hunch_net-2008-313 knowledge-graph by maker-knowledge-mining

313 hunch net-2008-08-18-Radford Neal starts a blog


meta info for this blog

Source: html

Introduction: here on statistics, ML, CS, and other things he knows well.


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 here on statistics, ML, CS, and other things he knows well. [sent-1, score-0.791]
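
The sentence score above comes from a tfidf model. As a rough, unconfirmed sketch of how such scores could be produced (assuming scikit-learn; the page's actual pipeline is not documented), each sentence of a post can be scored by the total tfidf weight of its terms:

```python
# Hypothetical sketch, not the page's actual pipeline: rank a post's sentences
# by the sum of their tfidf term weights and keep the highest-scoring ones.
from sklearn.feature_extraction.text import TfidfVectorizer

def top_sentences(sentences, k=1):
    """Return (score, sentence) pairs for the k highest-scoring sentences."""
    tfidf = TfidfVectorizer().fit_transform(sentences)  # one row per sentence
    scores = tfidf.sum(axis=1).A1                       # total term weight per row
    ranked = sorted(zip(scores, sentences), reverse=True)
    return [(round(float(s), 3), text) for s, text in ranked[:k]]

# Example with this post's single extracted sentence:
print(top_sentences(["here on statistics, ML, CS, and other things he knows well."]))
```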


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('knows', 0.586), ('cs', 0.54), ('statistics', 0.418), ('ml', 0.354), ('things', 0.205), ('well', 0.152)]
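
The (word, weight) pairs above and the similarity values listed below can be reproduced in spirit with standard tools. A minimal sketch, assuming scikit-learn and a small hypothetical `posts` dictionary standing in for the full hunch.net corpus:

```python
# Hypothetical sketch (not the actual maker-knowledge-mining code): build a
# tfidf vector per post and rank the other posts by cosine similarity to 313.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

posts = {  # placeholder corpus; the real one would contain every hunch.net post
    313: "here on statistics, ML, CS, and other things he knows well.",
    59:  "Maverick Woo and the Aladdin group at CMU have started a CS theory-related blog.",
    228: "Carnegie Mellon School of Computer Science has the first academic Machine Learning department.",
}
ids = list(posts)
tfidf = TfidfVectorizer().fit_transform(posts.values())  # rows follow `ids`
sims = cosine_similarity(tfidf[0], tfidf).ravel()        # post 313 vs. all posts
for blog_id, sim in sorted(zip(ids, sims), key=lambda p: -p[1]):
    print(blog_id, round(float(sim), 3))
```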

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 313 hunch net-2008-08-18-Radford Neal starts a blog

Introduction: here on statistics, ML, CS, and other things he knows well.

2 0.27753794 59 hunch net-2005-04-22-New Blog: [Lowerbounds,Upperbounds]

Introduction: Maverick Woo and the Aladdin group at CMU have started a CS theory-related blog here.

3 0.13603322 228 hunch net-2007-01-15-The Machine Learning Department

Introduction: Carnegie Mellon School of Computer Science has the first academic Machine Learning department. This department already existed as the Center for Automated Learning and Discovery, but recently changed its name. The reason for changing the name is obvious: very few people think of themselves as “Automated Learners and Discoverers”, but there are a number of people who think of themselves as “Machine Learners”. Machine learning is both more succinct and recognizable—good properties for a name. A more interesting question is “Should there be a Machine Learning Department?”. Tom Mitchell has a relevant whitepaper claiming that machine learning is answering a different question than other fields or departments. The fundamental debate here is “Is machine learning different from statistics?” At a cultural level, there is no real debate: they are different. Machine learning is characterized by several very active large peer reviewed conferences, operating in a computer

4 0.12670222 329 hunch net-2008-11-28-A Bumper Crop of Machine Learning Graduates

Introduction: My impression is that this is a particularly strong year for machine learning graduates. Here’s my short list of the strong graduates I know. Analpha (for perversity’s sake) by last name: Jenn Wortmann. When Jenn visited us for the summer, she had one, two, three, four papers. That is typical—she’s smart, capable, and follows up many directions of research. I believe approximately all of her many papers are on different subjects. Ruslan Salakhutdinov. A Science paper on bijective dimensionality reduction, mastered and improved on deep belief nets which seems like an important flavor of nonlinear learning, and in my experience he’s very fast, capable and creative at problem solving. Marc’Aurelio Ranzato. I haven’t spoken with Marc very much, but he had a great visit at Yahoo! this summer, and has an impressive portfolio of applications and improvements on convolutional neural networks and other deep learning algorithms. Lihong Li. Lihong developed the

5 0.11645415 358 hunch net-2009-06-01-Multitask Poisoning

Introduction: There are many ways that interesting research gets done. For example it’s common at a conference for someone to discuss a problem with a partial solution, and for someone else to know how to solve a piece of it, resulting in a paper. In some sense, these are the easiest results we can achieve, so we should ask: Can all research be this easy? The answer is certainly no for fields where research inherently requires experimentation to discover how the real world works. However, mathematics, including parts of physics, computer science, statistics, etc… which are effectively mathematics don’t require experimentation. In effect, a paper can be simply a pure expression of thinking. Can all mathematical-style research be this easy? What’s going on here is research-by-communication. Someone knows something, someone knows something else, and as soon as someone knows both things, a problem is solved. The interesting thing about research-by-communication is that it is becoming radic

6 0.11569185 369 hunch net-2009-08-27-New York Area Machine Learning Events

7 0.10962685 248 hunch net-2007-06-19-How is Compressed Sensing going to change Machine Learning ?

8 0.10921868 309 hunch net-2008-07-10-Interesting papers, ICML 2008

9 0.086072087 164 hunch net-2006-03-17-Multitask learning is Black-Boxable

10 0.084978893 290 hunch net-2008-02-27-The Stats Handicap

11 0.080305427 480 hunch net-2013-03-22-I’m a bandit

12 0.074260876 448 hunch net-2011-10-24-2011 ML symposium and the bears

13 0.071704373 270 hunch net-2007-11-02-The Machine Learning Award goes to …

14 0.067268573 2 hunch net-2005-01-24-Holy grails of machine learning?

15 0.066739708 412 hunch net-2010-09-28-Machined Learnings

16 0.063461214 402 hunch net-2010-07-02-MetaOptimize

17 0.06098336 264 hunch net-2007-09-30-NIPS workshops are out.

18 0.060624253 454 hunch net-2012-01-30-ICML Posters and Scope

19 0.060399823 352 hunch net-2009-05-06-Machine Learning to AI

20 0.057531793 156 hunch net-2006-02-11-Yahoo’s Learning Problems.


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.056), (1, -0.018), (2, -0.063), (3, 0.026), (4, -0.021), (5, 0.003), (6, -0.004), (7, -0.112), (8, -0.016), (9, -0.116), (10, 0.038), (11, 0.017), (12, 0.082), (13, -0.021), (14, -0.043), (15, 0.051), (16, 0.084), (17, 0.054), (18, 0.019), (19, 0.007), (20, 0.063), (21, 0.004), (22, 0.049), (23, 0.075), (24, 0.038), (25, -0.192), (26, -0.004), (27, -0.017), (28, -0.035), (29, 0.075), (30, -0.047), (31, -0.022), (32, 0.043), (33, -0.003), (34, -0.033), (35, 0.092), (36, -0.044), (37, -0.056), (38, 0.094), (39, -0.037), (40, -0.019), (41, 0.07), (42, 0.098), (43, 0.085), (44, 0.064), (45, -0.142), (46, 0.121), (47, 0.1), (48, 0.032), (49, 0.066)]
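
The 50 topic weights above are consistent with an LSI (latent semantic indexing) representation, which is typically a truncated SVD of the tfidf matrix, with similarity then taken as cosine similarity in the reduced space. A sketch under that assumption, using scikit-learn (the actual model and its dimensionality are not documented on this page):

```python
# Hypothetical sketch: LSI as truncated SVD of the tfidf matrix, with 50
# components matching the 50 topic weights listed above. Assumes a corpus
# with well over 50 distinct terms.
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def lsi_similarities(corpus, query_index=0, n_topics=50):
    """Cosine similarity of one post against all posts in LSI topic space."""
    tfidf = TfidfVectorizer().fit_transform(corpus)
    lsi = TruncatedSVD(n_components=n_topics, random_state=0)
    topic_weights = lsi.fit_transform(tfidf)      # shape: (n_posts, n_topics)
    query = topic_weights[query_index:query_index + 1]
    return cosine_similarity(query, topic_weights).ravel()
```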

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99027383 313 hunch net-2008-08-18-Radford Neal starts a blog

Introduction: here on statistics, ML, CS, and other things he knows well.

2 0.5565719 59 hunch net-2005-04-22-New Blog: [Lowerbounds,Upperbounds]

Introduction: Maverick Woo and the Aladdin group at CMU have started a CS theory-related blog here.

3 0.55610383 372 hunch net-2009-09-29-Machine Learning Protests at the G20

Introduction: The machine learning department at CMU turned out en masse to protest the G20 summit in Pittsburgh. Arthur Gretton uploaded some great photos covering the event

4 0.5487299 402 hunch net-2010-07-02-MetaOptimize

Introduction: Joseph Turian creates MetaOptimize for discussion of NLP and ML on big datasets. This includes a blog, but perhaps more importantly a question and answer section. I’m hopeful it will take off.

5 0.52350843 290 hunch net-2008-02-27-The Stats Handicap

Introduction: Graduating students in Statistics appear to be at a substantial handicap compared to graduating students in Machine Learning, despite being in substantially overlapping subjects. The problem seems to be cultural. Statistics comes from a mathematics background which emphasizes large publications slowly published under review at journals. Machine Learning comes from a Computer Science background which emphasizes quick publishing at reviewed conferences. This has a number of implications: Graduating statistics PhDs often have 0-2 publications while graduating machine learning PhDs might have 5-15. Graduating ML students have had a chance for others to build on their work. Stats students have had no such chance. Graduating ML students have attended a number of conferences and presented their work, giving them a chance to meet people. Stats students have had fewer chances of this sort. In short, Stats students have had relatively few chances to distinguish themselves and

6 0.49140489 412 hunch net-2010-09-28-Machined Learnings

7 0.47296709 405 hunch net-2010-08-21-Rob Schapire at NYC ML Meetup

8 0.46368733 228 hunch net-2007-01-15-The Machine Learning Department

9 0.42662594 448 hunch net-2011-10-24-2011 ML symposium and the bears

10 0.41620633 119 hunch net-2005-10-08-We have a winner

11 0.41186506 164 hunch net-2006-03-17-Multitask learning is Black-Boxable

12 0.40008584 369 hunch net-2009-08-27-New York Area Machine Learning Events

13 0.38730028 410 hunch net-2010-09-17-New York Area Machine Learning Events

14 0.38347316 248 hunch net-2007-06-19-How is Compressed Sensing going to change Machine Learning ?

15 0.36820871 480 hunch net-2013-03-22-I’m a bandit

16 0.36343572 414 hunch net-2010-10-17-Partha Niyogi has died

17 0.34593892 316 hunch net-2008-09-04-Fall ML Conferences

18 0.33918536 270 hunch net-2007-11-02-The Machine Learning Award goes to …

19 0.33535337 469 hunch net-2012-07-09-Videolectures

20 0.3142485 13 hunch net-2005-02-04-JMLG


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(17, 0.696)]
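
The single dominant entry above (topic 17 with weight 0.696) is the kind of output an LDA topic model produces: each post gets a distribution over topics, and only the heavy entries are kept. A sketch of that step, assuming scikit-learn and a guessed topic count (the real number of topics is not stated on this page):

```python
# Hypothetical sketch: LDA operates on raw word counts and yields one topic
# distribution per post; entries like (17, 0.696) are its largest components.
# n_topics=20 is a guess; the page only shows that topic ids reach at least 17.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

def lda_topic_weights(corpus, n_topics=20):
    """Return per-post topic distributions (each row sums to 1)."""
    counts = CountVectorizer().fit_transform(corpus)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    return lda.fit_transform(counts)
```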

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.90275437 313 hunch net-2008-08-18-Radford Neal starts a blog

Introduction: here on statistics, ML, CS, and other things he knows well.

2 0.53682739 377 hunch net-2009-11-09-NYAS ML Symposium this year.

Introduction: The NYAS ML symposium grew again this year to 170 participants, despite the need to outsmart or otherwise tunnel through a crowd. Perhaps the most distinct talk was by Bob Bell on various aspects of the Netflix prize competition. I also enjoyed several student posters including Matt Hoffman’s cool examples of blind source separation for music. I’m somewhat surprised how much the workshop has grown, as it is now comparable in size to a small conference, although in style more similar to a workshop. At some point as an event grows, it becomes owned by the community rather than the organizers, so if anyone has suggestions on improving it, speak up and be heard.

3 0.324352 366 hunch net-2009-08-03-Carbon in Computer Science Research

Introduction: Al Gore’s film and gradually more assertive and thorough science has managed to mostly shift the debate on climate change from “Is it happening?” to “What should be done?” In that context, it’s worthwhile to think a bit about what can be done within computer science research. There are two things we can think about: Doing Research At a cartoon level, computer science research consists of some combination of commuting to & from work, writing programs, running them on computers, writing papers, and presenting them at conferences. A typical computer has a power usage on the order of 100 Watts, which works out to 2.4 kiloWatt-hours/day. Looking up David MacKay’s reference on power usage per person, it becomes clear that this is a relatively minor part of the lifestyle, although it could become substantial if many more computers are required. Much larger costs are associated with commuting (which is in common with many people) and attending conferences. Since local commuti

4 0.24740344 253 hunch net-2007-07-06-Idempotent-capable Predictors

Introduction: One way to distinguish different learning algorithms is by their ability or inability to easily use an input variable as the predicted output. This is desirable for at least two reasons: Modularity If we want to build complex learning systems via reuse of a subsystem, it’s important to have compatible I/O. “Prior” knowledge Machine learning is often applied in situations where we do have some knowledge of what the right solution is, often in the form of an existing system. In such situations, it’s good to start with a learning algorithm that can be at least as good as any existing system. When doing classification, most learning algorithms can do this. For example, a decision tree can split on a feature, and then classify. The real differences come up when we attempt regression. Many of the algorithms we know and commonly use are not idempotent predictors. Logistic regressors can not be idempotent, because all input features are mapped through a nonlinearity.

5 0.19448395 143 hunch net-2005-12-27-Automated Labeling

Introduction: One of the common trends in machine learning has been an emphasis on the use of unlabeled data. The argument goes something like “there aren’t many labeled web pages out there, but there are a huge number of web pages, so we must find a way to take advantage of them.” There are several standard approaches for doing this: Unsupervised Learning. You use only unlabeled data. In a typical application, you cluster the data and hope that the clusters somehow correspond to what you care about. Semisupervised Learning. You use both unlabeled and labeled data to build a predictor. The unlabeled data influences the learned predictor in some way. Active Learning. You have unlabeled data and access to a labeling oracle. You interactively choose which examples to label so as to optimize prediction accuracy. It seems there is a fourth approach worth serious investigation—automated labeling. The approach goes as follows: Identify some subset of observed values to predict

6 0.063432902 415 hunch net-2010-10-28-NY ML Symposium 2010

7 0.041515503 236 hunch net-2007-03-15-Alternative Machine Learning Reductions Definitions

8 0.036967706 419 hunch net-2010-12-04-Vowpal Wabbit, version 5.0, and the second heresy

9 0.033036854 138 hunch net-2005-12-09-Some NIPS papers

10 0.030902753 329 hunch net-2008-11-28-A Bumper Crop of Machine Learning Graduates

11 0.027723921 309 hunch net-2008-07-10-Interesting papers, ICML 2008

12 0.0 1 hunch net-2005-01-19-Why I decided to run a weblog.

13 0.0 2 hunch net-2005-01-24-Holy grails of machine learning?

14 0.0 3 hunch net-2005-01-24-The Humanloop Spectrum of Machine Learning

15 0.0 4 hunch net-2005-01-26-Summer Schools

16 0.0 5 hunch net-2005-01-26-Watchword: Probability

17 0.0 6 hunch net-2005-01-27-Learning Complete Problems

18 0.0 7 hunch net-2005-01-31-Watchword: Assumption

19 0.0 8 hunch net-2005-02-01-NIPS: Online Bayes

20 0.0 9 hunch net-2005-02-01-Watchword: Loss