hunch_net hunch_net-2006 hunch_net-2006-162 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: For most people, a mathematical notation is like a language: you learn it and stick with it. For people doing mathematical research, however, this is not enough: they must design new notations for new problems. The design of good notation is both hard and worthwhile since a bad initial notation can retard a line of research greatly. Before we had mathematical notation, equations were all written out in language. Since words have multiple meanings and variable precedences, long equations written out in language can be extraordinarily difficult and sometimes fundamentally ambiguous. A good representative example of this is the legalese in the tax code. Since we want greater precision and clarity, we adopt mathematical notation. One fundamental thing to understand about mathematical notation is that humans, as logic verifiers, are barely capable. This is the fundamental reason why one notation can be much better than another. This observation is easier to miss than you might expect.
sentIndex sentText sentNum sentScore
1 For most people, a mathematical notation is like a language: you learn it and stick with it. [sent-1, score-0.82]
2 For people doing mathematical research, however, this is not enough: they must design new notations for new problems. [sent-2, score-0.402]
3 The design of good notation is both hard and worthwhile since a bad initial notation can retard a line of research greatly. [sent-3, score-1.594]
4 Before we had mathematical notation, equations were all written out in language. [sent-4, score-0.36]
5 Since words have multiple meanings and variable precedences, long equations written out in language can be extraordinarily difficult and sometimes fundamentally ambiguous. [sent-5, score-0.501]
6 One fundamental thing to understand about mathematical notation is that humans, as logic verifiers, are barely capable. [sent-8, score-0.345]
7 This is the fundamental reason why one notation can be much better than another. [sent-9, score-0.714]
8 This observation is easier to miss than you might expect because, for a problem that you are working on, you have already expended the effort to reach an understanding. [sent-10, score-0.236]
9 I don’t know of any systematic method for designing notation, but there is a set of heuristics, learned over time, which may be more widely helpful. [sent-11, score-0.233]
10 If notation is only used once, it should be removable (this often arises in presentations). [sent-14, score-0.725]
11 A reasonable mechanism for notation design is to first name and define the quantities you are working with (for example, reward r and time t), and then make derived quantities by combination (for example, r_t is reward at time t); a small worked sketch follows this list. [sent-18, score-1.419]
12 (For example, in reinforcement learning the MDP M you are working with is often suppressible because it never changes.) [sent-27, score-0.177]
13 A dependence must be either uniformly suppressed or uniformly explicit; a sketch of uniform suppression follows this list. [sent-28, score-0.32]
14 There seem to be two styles of theorem statements: long including all definitions and short with definitions made before the statement. [sent-30, score-0.669]
15 It is very easy to forget the quantification of a variable (“for all” or “there exists”) when you are working on a theorem, and it is essential for readers that you specify it explicitly; an example with explicit quantifiers follows this list. [sent-33, score-0.285]
16 English lowercase > English uppercase > Greek lowercase > Greek uppercase > Hebrew > other strange things. [sent-36, score-0.956]
17 The definitions section of a paper often should not contain all of the paper’s definitions. [sent-37, score-0.431]
18 These heuristics often come into conflict, which can be hard to resolve. [sent-40, score-0.36]
19 When trying to resolve the conflict, it’s important to understand that it’s easy to fail to imagine what a notation will read like to someone unfamiliar with it. [sent-41, score-0.772]
20 Are there other useful heuristics for notation design? [sent-43, score-0.893]
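A minimal LaTeX sketch of the naming heuristic in sentence 11 (the horizon symbol T and the name R for cumulative reward are illustrative choices of mine, not notation from the post): base quantities are named once, and every later symbol is a visible combination of them.

    % base quantities: reward r, time t, horizon T
    r_t  \quad \text{(the reward at time } t\text{)}
    R \;=\; \sum_{t=1}^{T} r_t  \quad \text{(derived by combination: total reward over the horizon)}

The derived R needs no independent definition in the reader’s head; its meaning is recoverable at a glance from r and t.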
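Sentences 12 and 13 can be made concrete with a standard reinforcement-learning value function; the symbols V, \pi, and s below are hedged illustrative choices rather than notation taken from the post. When the MDP M is fixed throughout a paper, the explicit form is stated once and the suppressed form is used uniformly afterwards.

    V^{\pi}_{M}(s) \;=\; \mathbb{E}\!\left[\, \textstyle\sum_{t=1}^{T} r_t \;\middle|\; s_1 = s,\ \pi,\ M \,\right]  % fully explicit
    V^{\pi}(s)  % the same quantity with the fixed M uniformly suppressed

Mixing V^{\pi}(s) in one lemma with V^{\pi}_{M}(s) in the next forces the reader to check whether two different MDPs are in play.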
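For the quantification heuristic in sentence 15, here is a hedged sketch of a short-style theorem statement with every quantifier written out; the bound itself is a generic finite-class placeholder, not a result claimed in the post.

    \text{For all } \delta \in (0,1] \text{ and all distributions } D, \text{ with probability at least } 1-\delta \text{ over } S \sim D^m:
    \forall h \in H, \qquad e(h) \;\le\; \hat{e}_S(h) + \sqrt{\tfrac{\ln|H| + \ln(1/\delta)}{2m}}

Dropping the \forall h \in H would leave it ambiguous whether the bound holds uniformly over the class or only for the particular classifier that was learned.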
wordName wordTfidf (topN-words)
[('notation', 0.66), ('heuristics', 0.233), ('definitions', 0.183), ('mathematical', 0.16), ('greek', 0.142), ('notations', 0.142), ('collisions', 0.126), ('suppressed', 0.126), ('reward', 0.118), ('variable', 0.118), ('equations', 0.117), ('working', 0.112), ('english', 0.11), ('design', 0.1), ('uniformly', 0.097), ('conflict', 0.097), ('strange', 0.097), ('theorem', 0.091), ('short', 0.09), ('quantities', 0.087), ('written', 0.083), ('name', 0.077), ('upper', 0.077), ('statements', 0.076), ('humans', 0.076), ('long', 0.067), ('case', 0.067), ('often', 0.065), ('expended', 0.063), ('objectively', 0.063), ('hard', 0.062), ('easier', 0.061), ('language', 0.061), ('example', 0.06), ('resolve', 0.058), ('symbol', 0.058), ('retard', 0.058), ('sticking', 0.058), ('tax', 0.058), ('clarity', 0.055), ('styles', 0.055), ('meanings', 0.055), ('quantification', 0.055), ('barely', 0.055), ('disagreements', 0.055), ('divergence', 0.055), ('unfamiliar', 0.055), ('since', 0.054), ('trying', 0.054), ('fundamental', 0.054)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 162 hunch net-2006-03-09-Use of Notation
2 0.18906716 35 hunch net-2005-03-04-The Big O and Constants in Learning
Introduction: The notation g(n) = O(f(n)) means that in the limit as n approaches infinity there exists a constant C such that the g(n) is less than Cf(n) . In learning theory, there are many statements about learning algorithms of the form “under assumptions x , y , and z , the classifier learned has an error rate of at most O(f(m)) “. There is one very good reason to use O(): it helps you understand the big picture and neglect the minor details which are not important in the big picture. However, there are some important reasons not to do this as well. Unspeedup In algorithm analysis, the use of O() for time complexity is pervasive and well-justified. Determining the exact value of C is inherently computer architecture dependent. (The “C” for x86 processors might differ from the “C” on PowerPC processors.) Since many learning theorists come from a CS theory background, the O() notation is applied to generalization error. The O() abstraction breaks here—you can not genera
3 0.18650302 202 hunch net-2006-08-10-Precision is not accuracy
Introduction: In my experience, there are two different groups of people who believe the same thing: the mathematics encountered in typical machine learning conference papers is often of questionable value. The two groups who agree on this are applied machine learning people who have given up on math, and mature theoreticians who understand the limits of theory. Partly, this is just a statement about where we are with respect to machine learning. In particular, we have no mechanism capable of generating a prescription for how to solve all learning problems. In the absence of such certainty, people try to come up with formalisms that partially describe and motivate how and why they do things. This is natural and healthy—we might hope that it will eventually lead to just such a mechanism. But, part of this is simply an emphasis on complexity over clarity. A very natural and simple theoretical statement is often obscured by complexifications. Common sources of complexification include:
4 0.11564405 286 hunch net-2008-01-25-Turing’s Club for Machine Learning
Introduction: Many people in Machine Learning don’t fully understand the impact of computation, as demonstrated by a lack of big-O analysis of new learning algorithms. This is important—some current active research programs are fundamentally flawed w.r.t. computation, and other research programs are directly motivated by it. When considering a learning algorithm, I think about the following questions: How does the learning algorithm scale with the number of examples m ? Any algorithm using all of the data is at least O(m) , but in many cases this is O(m 2 ) (naive nearest neighbor for self-prediction) or unknown (k-means or many other optimization algorithms). The unknown case is very common, and it can mean (for example) that the algorithm isn’t convergent or simply that the amount of computation isn’t controlled. The above question can also be asked for test cases. In some applications, test-time performance is of great importance. How does the algorithm scale with the number of
5 0.10739357 194 hunch net-2006-07-11-New Models
Introduction: How should we, as researchers in machine learning, organize ourselves? The most immediate measurable objective of computer science research is publishing a paper. The most difficult aspect of publishing a paper is having reviewers accept and recommend it for publication. The simplest mechanism for doing this is to show theoretical progress on some standard, well-known easily understood problem. In doing this, we often fall into a local minima of the research process. The basic problem in machine learning is that it is very unclear that the mathematical model is the right one for the (or some) real problem. A good mathematical model in machine learning should have one fundamental trait: it should aid the design of effective learning algorithms. To date, our ability to solve interesting learning problems (speech recognition, machine translation, object recognition, etc…) remains limited (although improving), so the “rightness” of our models is in doubt. If our mathematical mod
6 0.079201303 461 hunch net-2012-04-09-ICML author feedback is open
7 0.078900851 95 hunch net-2005-07-14-What Learning Theory might do
8 0.078373179 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006
9 0.073410384 157 hunch net-2006-02-18-Multiplication of Learned Probabilities is Dangerous
10 0.073381193 132 hunch net-2005-11-26-The Design of an Optimal Research Environment
11 0.073229618 351 hunch net-2009-05-02-Wielding a New Abstraction
12 0.072674334 44 hunch net-2005-03-21-Research Styles in Machine Learning
13 0.072078951 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms
14 0.071571 388 hunch net-2010-01-24-Specializations of the Master Problem
15 0.070170172 370 hunch net-2009-09-18-Necessary and Sufficient Research
16 0.06934268 208 hunch net-2006-09-18-What is missing for online collaborative research?
17 0.068825081 317 hunch net-2008-09-12-How do we get weak action dependence for learning with partial observations?
18 0.067974314 22 hunch net-2005-02-18-What it means to do research.
19 0.066669956 106 hunch net-2005-09-04-Science in the Government
20 0.066332921 343 hunch net-2009-02-18-Decision by Vetocracy
topicId topicWeight
[(0, 0.161), (1, 0.027), (2, -0.011), (3, 0.068), (4, -0.025), (5, -0.027), (6, 0.058), (7, 0.052), (8, 0.031), (9, 0.01), (10, -0.001), (11, -0.016), (12, 0.028), (13, 0.04), (14, 0.021), (15, -0.01), (16, 0.05), (17, 0.025), (18, -0.007), (19, 0.009), (20, -0.017), (21, 0.022), (22, 0.003), (23, -0.047), (24, -0.034), (25, -0.029), (26, 0.028), (27, -0.078), (28, -0.051), (29, -0.032), (30, -0.091), (31, 0.074), (32, 0.004), (33, 0.081), (34, 0.021), (35, -0.076), (36, 0.013), (37, 0.034), (38, 0.027), (39, -0.01), (40, -0.012), (41, 0.021), (42, -0.039), (43, -0.061), (44, -0.041), (45, 0.003), (46, 0.033), (47, 0.002), (48, 0.055), (49, 0.019)]
simIndex simValue blogId blogTitle
same-blog 1 0.9608627 162 hunch net-2006-03-09-Use of Notation
2 0.81235915 202 hunch net-2006-08-10-Precision is not accuracy
Introduction: In my experience, there are two different groups of people who believe the same thing: the mathematics encountered in typical machine learning conference papers is often of questionable value. The two groups who agree on this are applied machine learning people who have given up on math, and mature theoreticians who understand the limits of theory. Partly, this is just a statement about where we are with respect to machine learning. In particular, we have no mechanism capable of generating a prescription for how to solve all learning problems. In the absence of such certainty, people try to come up with formalisms that partially describe and motivate how and why they do things. This is natural and healthy—we might hope that it will eventually lead to just such a mechanism. But, part of this is simply an emphasis on complexity over clarity. A very natural and simple theoretical statement is often obscured by complexifications. Common sources of complexification include:
3 0.74080324 187 hunch net-2006-06-25-Presentation of Proofs is Hard.
Introduction: When presenting part of the Reinforcement Learning theory tutorial at ICML 2006 , I was forcibly reminded of this. There are several difficulties. When creating the presentation, the correct level of detail is tricky. With too much detail, the proof takes too much time and people may be lost to boredom. With too little detail, the steps of the proof involve too-great a jump. This is very difficult to judge. What may be an easy step in the careful thought of a quiet room is not so easy when you are occupied by the process of presentation. What may be easy after having gone over this (and other) proofs is not so easy to follow in the first pass by a viewer. These problems seem only correctable by process of repeated test-and-revise. When presenting the proof, simply speaking with sufficient precision is substantially harder than in normal conversation (where precision is not so critical). Practice can help here. When presenting the proof, going at the right p
4 0.65229779 249 hunch net-2007-06-21-Presentation Preparation
Introduction: A big part of doing research is presenting it at a conference. Since many people start out shy of public presentations, this can be a substantial challenge. Here are a few notes which might be helpful when thinking about preparing a presentation on research. Motivate . Talks which don’t start by describing the problem to solve cause many people to zone out. Prioritize . It is typical that you have more things to say than time to say them, and many presenters fall into the failure mode of trying to say too much. This is an easy-to-understand failure mode as it’s very natural to want to include everything. A basic fact is: you can’t. Example of this are: Your slides are so densely full of equations and words that you can’t cover them. Your talk runs over and a moderator prioritizes for you by cutting you off. You motor-mouth through the presentation, and the information absorption rate of the audience prioritizes in some uncontrolled fashion. The rate of flow of c
5 0.64035696 104 hunch net-2005-08-22-Do you believe in induction?
Introduction: Foster Provost gave a talk at the ICML metalearning workshop on “metalearning” and the “no free lunch theorem” which seems worth summarizing. As a review: the no free lunch theorem is the most complicated way we know of to say that a bias is required in order to learn. The simplest way to see this is in a nonprobabilistic setting. If you are given examples of the form (x,y) and you wish to predict y from x then any prediction mechanism errs half the time in expectation over all sequences of examples. The proof of this is very simple: on every example a predictor must make some prediction and by symmetry over the set of sequences it will be wrong half the time and right half the time. The basic idea of this proof has been applied to many other settings. The simplistic interpretation of this theorem which many people jump to is “machine learning is dead” since there can be no single learning algorithm which can solve all learning problems. This is the wrong way to thi
6 0.64031404 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006
7 0.63854736 42 hunch net-2005-03-17-Going all the Way, Sometimes
8 0.62231457 70 hunch net-2005-05-12-Math on the Web
9 0.6151517 262 hunch net-2007-09-16-Optimizing Machine Learning Programs
10 0.6073786 126 hunch net-2005-10-26-Fallback Analysis is a Secret to Useful Algorithms
11 0.59881443 307 hunch net-2008-07-04-More Presentation Preparation
12 0.59240144 147 hunch net-2006-01-08-Debugging Your Brain
13 0.59188759 57 hunch net-2005-04-16-Which Assumptions are Reasonable?
14 0.58376777 55 hunch net-2005-04-10-Is the Goal Understanding or Prediction?
15 0.57119596 22 hunch net-2005-02-18-What it means to do research.
16 0.56933004 35 hunch net-2005-03-04-The Big O and Constants in Learning
17 0.56366074 110 hunch net-2005-09-10-“Failure” is an option
18 0.56203496 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms
19 0.56028646 148 hunch net-2006-01-13-Benchmarks for RL
20 0.55811924 91 hunch net-2005-07-10-Thinking the Unthought
topicId topicWeight
[(3, 0.055), (10, 0.018), (27, 0.128), (38, 0.107), (48, 0.016), (53, 0.053), (55, 0.048), (79, 0.293), (94, 0.115), (95, 0.06)]
simIndex simValue blogId blogTitle
same-blog 1 0.85709047 162 hunch net-2006-03-09-Use of Notation
2 0.79168433 248 hunch net-2007-06-19-How is Compressed Sensing going to change Machine Learning ?
Introduction: Compressed Sensing (CS) is a new framework developed by Emmanuel Candes , Terry Tao and David Donoho . To summarize, if you acquire a signal in some basis that is incoherent with the basis in which you know the signal to be sparse in, it is very likely you will be able to reconstruct the signal from these incoherent projections. Terry Tao, the recent Fields medalist , does a very nice job at explaining the framework here . He goes further in the theory description in this post where he mentions the central issue of the Uniform Uncertainty Principle. It so happens that random projections are on average incoherent, within the UUP meaning, with most known basis (sines, polynomials, splines, wavelets, curvelets …) and are therefore an ideal basis for Compressed Sensing. [ For more in-depth information on the subject, the Rice group has done a very good job at providing a central library of papers relevant to the growing subject: http://www.dsp.ece.rice.edu/cs/ ] The Machine
3 0.77982777 254 hunch net-2007-07-12-ICML Trends
Introduction: Mark Reid did a post on ICML trends that I found interesting.
4 0.75040662 354 hunch net-2009-05-17-Server Update
Introduction: The hunch.net server has been updated. I’ve taken the opportunity to upgrade the version of wordpress which caused cascading changes. Old threaded comments are now flattened. The system we used to use ( Brian’s threaded comments ) appears incompatible with the new threading system built into wordpress. I haven’t yet figured out a workaround. I setup a feedburner account . I added an RSS aggregator for both Machine Learning and other research blogs that I like to follow. This is something that I’ve wanted to do for awhile. Many other minor changes in font and format, with some help from Alina . If you have any suggestions for site tweaks, please speak up.
5 0.70698071 423 hunch net-2011-02-02-User preferences for search engines
Introduction: I want to comment on the “Bing copies Google” discussion here , here , and here , because there are data-related issues which the general public may not understand, and some of the framing seems substantially misleading to me. As a not-distant-outsider, let me mention the sources of bias I may have. I work at Yahoo! , which has started using Bing . This might predispose me towards Bing, but on the other hand I’m still at Yahoo!, and have been using Linux exclusively as an OS for many years, including even a couple minor kernel patches. And, on the gripping hand , I’ve spent quite a bit of time thinking about the basic principles of incorporating user feedback in machine learning . Also note, this post is not related to official Yahoo! policy, it’s just my personal view. The issue Google engineers inserted synthetic responses to synthetic queries on google.com, then executed the synthetic searches on google.com using Internet Explorer with the Bing toolbar and later
6 0.6971097 27 hunch net-2005-02-23-Problem: Reinforcement Learning with Classification
7 0.6957776 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006
8 0.56525296 233 hunch net-2007-02-16-The Forgetting
9 0.54940081 49 hunch net-2005-03-30-What can Type Theory teach us about Machine Learning?
10 0.54387957 286 hunch net-2008-01-25-Turing’s Club for Machine Learning
11 0.54381543 306 hunch net-2008-07-02-Proprietary Data in Academic Research?
12 0.54242039 19 hunch net-2005-02-14-Clever Methods of Overfitting
13 0.5412513 136 hunch net-2005-12-07-Is the Google way the way for machine learning?
14 0.54111826 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making
15 0.5400942 236 hunch net-2007-03-15-Alternative Machine Learning Reductions Definitions
16 0.53815973 95 hunch net-2005-07-14-What Learning Theory might do
17 0.53799379 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models
18 0.53785455 391 hunch net-2010-03-15-The Efficient Robust Conditional Probability Estimation Problem
19 0.53616464 353 hunch net-2009-05-08-Computability in Artificial Intelligence
20 0.53370953 156 hunch net-2006-02-11-Yahoo’s Learning Problems.