hunch_net hunch_net-2007 hunch_net-2007-248 knowledge-graph by maker-knowledge-mining

248 hunch net-2007-06-19-How is Compressed Sensing going to change Machine Learning ?


meta info for this blog

Source: html

Introduction: Compressed Sensing (CS) is a new framework developed by Emmanuel Candes , Terry Tao and David Donoho . To summarize, if you acquire a signal in some basis that is incoherent with the basis in which you know the signal to be sparse, it is very likely you will be able to reconstruct the signal from these incoherent projections. Terry Tao, the recent Fields medalist , does a very nice job of explaining the framework here . He goes further into the theory in this post , where he mentions the central issue of the Uniform Uncertainty Principle (UUP). It so happens that random projections are on average incoherent, in the UUP sense, with most known bases (sines, polynomials, splines, wavelets, curvelets …) and are therefore an ideal basis for Compressed Sensing. [ For more in-depth information on the subject, the Rice group has done a very good job of providing a central library of papers relevant to the growing subject: http://www.dsp.ece.rice.edu/cs/ ] The Machine
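
To make the recovery step concrete, here is a minimal numerical sketch (my addition, not from the post; the sizes and the ISTA solver are illustrative stand-ins for the L1-minimization programs used in the CS literature): a signal that is sparse in the canonical basis is measured with a random Gaussian matrix and then reconstructed from those few incoherent projections.

import numpy as np

rng = np.random.default_rng(0)
n, m, k = 256, 80, 8             # ambient dimension, number of measurements, sparsity

# a k-sparse signal in the canonical basis
x = np.zeros(n)
support = rng.choice(n, k, replace=False)
x[support] = rng.standard_normal(k)

# random Gaussian measurement matrix (incoherent with the canonical basis)
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ x                      # the m incoherent projections we get to observe

# ISTA: a simple proxy for basis-pursuit (L1) recovery
lam = 0.01
step = 1.0 / np.linalg.norm(Phi, 2) ** 2
xhat = np.zeros(n)
for _ in range(2000):
    r = xhat - step * Phi.T @ (Phi @ xhat - y)                    # gradient step on the data fit
    xhat = np.sign(r) * np.maximum(np.abs(r) - step * lam, 0.0)   # soft threshold

print("relative reconstruction error:", np.linalg.norm(xhat - x) / np.linalg.norm(x))

With m a modest multiple of k * log(n / k), the printed error should be small even though only m = 80 linear measurements of the length-256 signal were taken, which is the phenomenon the post describes.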


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 To summarize, if you acquire a signal in some basis that is incoherent with the basis in which you know the signal to be sparse, it is very likely you will be able to reconstruct the signal from these incoherent projections. [sent-2, score-0.982]

2 It so happens that random projections are on average incoherent, in the UUP sense, with most known bases (sines, polynomials, splines, wavelets, curvelets …) and are therefore an ideal basis for Compressed Sensing. [sent-5, score-1.197]
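
The incoherence claim is easy to check numerically. The sketch below is my own illustration (a DCT basis stands in for the sparsity basis, and the mutual coherence is computed as sqrt(n) * max |<phi_i, psi_j>|, which ranges from 1 for maximally incoherent pairs up to sqrt(n) in the worst case); a random orthonormal measurement basis should land near the low end.

import numpy as np

rng = np.random.default_rng(0)
n = 256

# orthonormal DCT-II basis; columns are the basis vectors
t = np.arange(n)
Psi = np.cos(np.pi * (t[:, None] + 0.5) * t[None, :] / n)
Psi[:, 0] /= np.sqrt(2.0)
Psi *= np.sqrt(2.0 / n)

# random orthonormal measurement basis via QR of a Gaussian matrix
Phi, _ = np.linalg.qr(rng.standard_normal((n, n)))

# mutual coherence between the two bases
mu = np.sqrt(n) * np.max(np.abs(Phi.T @ Psi))
print("mutual coherence: %.2f (1 = ideal, %.1f = worst case)" % (mu, np.sqrt(n)))

For n = 256 this prints a small single-digit value, far below the worst case of 16, which is what "random projections are on average incoherent with most known bases" means operationally.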

3 [ For more in-depth information on the subject, the Rice group has done a very good job of providing a central library of papers relevant to the growing subject: http://www. [sent-6, score-0.192]

4 ) Face Recognition Experiments with Random Projection by Goel, Bebis and Nefian ; Dimensionality reduction by random mapping: Fast similarity computation for clustering by S. [sent-11, score-0.51]

5 ) but while they seem to give somewhat comparable results with regard to PCA, the number of contributions on the subject does not seem overwhelming. [sent-13, score-0.162]

6 Maybe one of the reasons is that, in most of the papers cited above, the main theoretical justification for using random projections lies with the Johnson-Lindenstrauss (JL) lemma. [sent-14, score-0.708]
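
For readers who have not seen it, the JL lemma says that a random linear map into roughly log(number of points) / eps^2 dimensions preserves all pairwise distances up to a factor of 1 ± eps. A minimal empirical check (my own sketch, with arbitrary sizes):

import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
N, m, n_points = 10000, 300, 50               # ambient dim, projected dim, number of points

X = rng.standard_normal((n_points, N))        # any fixed point cloud in R^N
R = rng.standard_normal((N, m)) / np.sqrt(m)  # random projection, scaled to preserve norms
Y = X @ R

ratios = [np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j])
          for i, j in combinations(range(n_points), 2)]
print("pairwise distance ratios: min %.3f, max %.3f" % (min(ratios), max(ratios)))

All ratios come out close to 1 (within roughly ten to fifteen percent for these sizes). Note that the JL guarantee is only about preserving distances; it says nothing about reconstructing the original points, which is where Compressed Sensing goes further.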

7 As a consequence, most random matrices used in these publications come from the Database world and not from the newer framework of Compressed Sensing (a list of these matrices and their properties can be found in the middle of this page ). [sent-15, score-0.758]

8 The uncanny reliance on random projections in both the JL lemma and the Compressed Sensing setting was explained by Richard Baraniuk , Mark Davenport , Ronald DeVore , and Michael Wakin in this paper entitled: A simple proof of the restricted isometry property for random matrices . [sent-16, score-1.148]
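
The restricted isometry property (RIP) in that paper says that, with high probability, a single random m x N matrix acts as a near-isometry on all k-sparse vectors at once. Certifying this over every support is combinatorial, but a Monte Carlo sanity check on randomly drawn k-sparse vectors takes only a few lines (an illustrative sketch of mine, not the authors' construction or code):

import numpy as np

rng = np.random.default_rng(2)
N, m, k = 400, 120, 10

Phi = rng.standard_normal((m, N)) / np.sqrt(m)   # random Gaussian measurement matrix

ratios = []
for _ in range(2000):
    x = np.zeros(N)
    support = rng.choice(N, k, replace=False)    # random sparsity pattern
    x[support] = rng.standard_normal(k)
    ratios.append(np.linalg.norm(Phi @ x) ** 2 / np.linalg.norm(x) ** 2)

# RIP asks for 1 - delta <= ||Phi x||^2 / ||x||^2 <= 1 + delta over ALL k-sparse x
print("empirical range on sampled k-sparse vectors: [%.3f, %.3f]" % (min(ratios), max(ratios)))

The sampled ratios cluster around 1; the content of the Baraniuk-Davenport-DeVore-Wakin proof is a concentration-of-measure argument showing that this holds uniformly over every sparse support, not just the sampled ones.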

9 However, the most interesting fallout from this comparison between JL and CS comes in the form of the contribution by Richard Baraniuk and Michael Wakin in Random projections of smooth manifolds . [sent-17, score-0.592]

10 Our results bear strong resemblance to the emerging theory of Compressed Sensing (CS), in which sparse signals can be recovered from small numbers of random linear measurements. [sent-19, score-0.689]

11 As in CS, the random measurements we propose can be used to recover the original data in R^N. [sent-20, score-0.514]

12 Moreover, like the fundamental bound in CS, our requisite M is linear in the “information level” K and logarithmic in the ambient dimension N; we also identify a logarithmic dependence on the volume and curvature of the manifold. [sent-21, score-0.281]
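
For reference, the fundamental CS bound being alluded to is usually written as m >= C * K * log(N / K) random measurements sufficing for stable recovery of a K-sparse signal in R^N (this is the standard formulation, not a quote from either paper); the manifold result mirrors it, with K playing the role of the manifold's intrinsic dimension and with the extra logarithmic factors in volume and curvature that the quoted sentence mentions.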

13 In addition to recovering faithful approximations to manifold-modeled signals, however, the random projections we propose can also be used to discern key properties about the manifold. [sent-22, score-1.129]

14 We discuss connections and contrasts with existing techniques in manifold learning, a setting where dimensionality reducing mappings are typically nonlinear and constructed adaptively from a set of sampled training data. [sent-23, score-0.304]

15 It looks as though, as a result, Universal Dimensionality Reduction is achieved by some properly chosen random projections. [sent-24, score-0.409]

16 On the other hand, even the researchers who model these processes do not realize or point out that this robustness is in part due to random projections. [sent-28, score-0.457]

17 Case in point: the excellent work of Thomas Serre , Aude Oliva and Tomaso Poggio , culminating in a paper describing a biologically inspired model of the brain that shows its ability to process information in a feedforward fashion . [sent-29, score-0.201]

18 The modeling is new in that, in this area of science, there is a central issue as to whether the brain works in a one-way process or with many loops. [sent-30, score-0.198]

19 In the paper, the feature dimension reduction model (which is what this process is) uses random projections, as I pointed out recently . [sent-31, score-1.218]

20 Because of the intrinsic dimension reduction capability, Mike Wakin has also shown efficient nearest neighbor searches using few random projections (see figure 3 of this paper ). [sent-32, score-1.121]
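
To illustrate that last point, here is a toy sketch of the idea (my own setup, not Wakin's experiment): generate data with low intrinsic dimension, keep only a handful of random projections, and count how often the nearest neighbor survives the projection.

import numpy as np

rng = np.random.default_rng(3)
N, d, m = 2000, 5, 40                  # ambient dim, intrinsic dim, number of projections
n_points, n_queries = 500, 100

# points lying near a random d-dimensional subspace of R^N
B = rng.standard_normal((d, N))
X = rng.standard_normal((n_points, d)) @ B + 0.01 * rng.standard_normal((n_points, N))
Q = rng.standard_normal((n_queries, d)) @ B + 0.01 * rng.standard_normal((n_queries, N))

R = rng.standard_normal((N, m)) / np.sqrt(m)   # a few random projections
Xp, Qp = X @ R, Q @ R

hits = 0
for q, qp in zip(Q, Qp):
    true_nn = np.argmin(np.linalg.norm(X - q, axis=1))     # exact nearest neighbor in R^N
    proj_nn = np.argmin(np.linalg.norm(Xp - qp, axis=1))   # nearest neighbor after projection
    hits += int(true_nn == proj_nn)

print("projected NN matches exact NN on %d / %d queries" % (hits, n_queries))

The projected search works in m dimensions instead of N, so it is much cheaper, and agreement degrades gracefully as m shrinks toward the intrinsic dimension.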


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('projections', 0.514), ('random', 0.409), ('cs', 0.203), ('wakin', 0.189), ('compressed', 0.168), ('signal', 0.147), ('baraniuk', 0.142), ('jl', 0.14), ('manifold', 0.14), ('sensing', 0.134), ('incoherent', 0.126), ('dimensionality', 0.122), ('basis', 0.116), ('propose', 0.105), ('matrices', 0.105), ('reduction', 0.101), ('dimension', 0.097), ('tao', 0.094), ('terry', 0.094), ('central', 0.094), ('proportional', 0.084), ('signals', 0.084), ('framework', 0.081), ('lemma', 0.078), ('contribution', 0.078), ('richard', 0.073), ('mapping', 0.067), ('logarithmic', 0.065), ('number', 0.06), ('subject', 0.059), ('properties', 0.058), ('sparse', 0.057), ('brain', 0.055), ('linear', 0.054), ('reason', 0.051), ('main', 0.05), ('information', 0.049), ('job', 0.049), ('process', 0.049), ('michael', 0.048), ('model', 0.048), ('experiments', 0.045), ('hand', 0.044), ('key', 0.043), ('results', 0.043), ('within', 0.042), ('resemblance', 0.042), ('pairwise', 0.042), ('cited', 0.042), ('contrasts', 0.042)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000004 248 hunch net-2007-06-19-How is Compressed Sensing going to change Machine Learning ?

Introduction: Compressed Sensing (CS) is a new framework developed by Emmanuel Candes , Terry Tao and David Donoho . To summarize, if you acquire a signal in some basis that is incoherent with the basis in which you know the signal to be sparse, it is very likely you will be able to reconstruct the signal from these incoherent projections. Terry Tao, the recent Fields medalist , does a very nice job of explaining the framework here . He goes further into the theory in this post , where he mentions the central issue of the Uniform Uncertainty Principle (UUP). It so happens that random projections are on average incoherent, in the UUP sense, with most known bases (sines, polynomials, splines, wavelets, curvelets …) and are therefore an ideal basis for Compressed Sensing. [ For more in-depth information on the subject, the Rice group has done a very good job of providing a central library of papers relevant to the growing subject: http://www.dsp.ece.rice.edu/cs/ ] The Machine

2 0.16265567 199 hunch net-2006-07-26-Two more UAI papers of interest

Introduction: In addition to Ed Snelson’s paper, there were (at least) two other papers that caught my eye at UAI. One was this paper by Sanjoy Dasgupta, Daniel Hsu and Nakul Verma at UCSD which shows in a surprisingly general and strong way that almost all linear projections of any jointly distributed vector random variable with finite first and second moments look spherical and unimodal (in fact look like a scale mixture of Gaussians). Great result, as you’d expect from Sanjoy. The other paper which I found intriguing but which I just haven’t grokked yet is this beast by Manfred and Dima Kuzmin. You can check out the (beautiful) slides if that helps. I feel like there is something deep here, but my brain is too small to understand it. The COLT and last NIPS papers/slides are also on Manfred’s page. Hopefully someone here can illuminate.

3 0.124346 219 hunch net-2006-11-22-Explicit Randomization in Learning algorithms

Introduction: There are a number of learning algorithms which explicitly incorporate randomness into their execution. This includes, amongst others: Neural Networks. Neural networks use randomization to assign initial weights. Boltzmann Machines/ Deep Belief Networks . Boltzmann machines are something like a stochastic version of multinode logistic regression. The use of randomness is more essential in Boltzmann machines, because the predicted value at test time also uses randomness. Bagging. Bagging is a process where a learning algorithm is run several different times on several different datasets, creating a final predictor which makes a majority vote. Policy descent. Several algorithms in reinforcement learning such as Conservative Policy Iteration use random bits to create stochastic policies. Experts algorithms. Randomized weighted majority uses random bits as a part of the prediction process to achieve better theoretical guarantees. A basic question is: “Should there

4 0.11366428 444 hunch net-2011-09-07-KDD and MUCMD 2011

Introduction: At KDD I enjoyed Stephen Boyd ‘s invited talk about optimization quite a bit. However, the most interesting talk for me was David Haussler ‘s. His talk started out with a formidable load of biological complexity. About half-way through you start wondering, “can this be used to help with cancer?” And at the end he connects it directly to use with a call to arms for the audience: cure cancer. The core thesis here is that cancer is a complex set of diseases which can be disentangled via genetic assays, allowing attacks on the specific signature of individual cancers. However, the data quantity and complex dependencies within the data require systematic and relatively automatic prediction and analysis algorithms of the kind that we are best familiar with. Some of the papers which interested me are: Kai-Wei Chang and Dan Roth , Selective Block Minimization for Faster Convergence of Limited Memory Large-Scale Linear Models , which is about effectively using a hard-example

5 0.11355696 59 hunch net-2005-04-22-New Blog: [Lowerbounds,Upperbounds]

Introduction: Maverick Woo and the Aladdin group at CMU have started a CS theory-related blog here .

6 0.10962685 313 hunch net-2008-08-18-Radford Neal starts a blog

7 0.10917402 102 hunch net-2005-08-11-Why Manifold-Based Dimension Reduction Techniques?

8 0.084033847 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009

9 0.083636001 398 hunch net-2010-05-10-Aggregation of estimators, sparsity in high dimension and computational feasibility

10 0.083009973 337 hunch net-2009-01-21-Nearly all natural problems require nonlinearity

11 0.081588991 391 hunch net-2010-03-15-The Efficient Robust Conditional Probability Estimation Problem

12 0.078535378 87 hunch net-2005-06-29-Not EM for clustering at COLT

13 0.077610739 439 hunch net-2011-08-01-Interesting papers at COLT 2011

14 0.071994886 343 hunch net-2009-02-18-Decision by Vetocracy

15 0.071740702 45 hunch net-2005-03-22-Active learning

16 0.06970793 304 hunch net-2008-06-27-Reviewing Horror Stories

17 0.068848528 440 hunch net-2011-08-06-Interesting thing at UAI 2011

18 0.068489358 456 hunch net-2012-02-24-ICML+50%

19 0.067437485 454 hunch net-2012-01-30-ICML Posters and Scope

20 0.066898234 317 hunch net-2008-09-12-How do we get weak action dependence for learning with partial observations?


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.158), (1, 0.019), (2, 0.005), (3, -0.01), (4, 0.065), (5, 0.004), (6, 0.022), (7, -0.057), (8, 0.016), (9, -0.079), (10, 0.02), (11, -0.027), (12, -0.09), (13, -0.03), (14, -0.044), (15, 0.034), (16, 0.006), (17, 0.091), (18, 0.053), (19, -0.043), (20, -0.045), (21, -0.005), (22, -0.038), (23, 0.036), (24, -0.008), (25, -0.024), (26, -0.035), (27, -0.061), (28, -0.007), (29, -0.044), (30, -0.027), (31, -0.08), (32, 0.074), (33, 0.032), (34, 0.035), (35, 0.013), (36, -0.133), (37, 0.046), (38, 0.069), (39, 0.033), (40, -0.048), (41, 0.171), (42, 0.088), (43, 0.001), (44, -0.058), (45, -0.086), (46, 0.061), (47, 0.107), (48, -0.002), (49, -0.033)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97230154 248 hunch net-2007-06-19-How is Compressed Sensing going to change Machine Learning ?

Introduction: Compressed Sensing (CS) is a new framework developed by Emmanuel Candes , Terry Tao and David Donoho . To summarize, if you acquire a signal in some basis that is incoherent with the basis in which you know the signal to be sparse, it is very likely you will be able to reconstruct the signal from these incoherent projections. Terry Tao, the recent Fields medalist , does a very nice job of explaining the framework here . He goes further into the theory in this post , where he mentions the central issue of the Uniform Uncertainty Principle (UUP). It so happens that random projections are on average incoherent, in the UUP sense, with most known bases (sines, polynomials, splines, wavelets, curvelets …) and are therefore an ideal basis for Compressed Sensing. [ For more in-depth information on the subject, the Rice group has done a very good job of providing a central library of papers relevant to the growing subject: http://www.dsp.ece.rice.edu/cs/ ] The Machine

2 0.6126799 150 hunch net-2006-01-23-On Coding via Mutual Information & Bayes Nets

Introduction: Say we have two random variables X,Y with mutual information I(X,Y) . Let’s say we want to represent them with a bayes net of the form X <- M -> Y , such that the entropy of M equals the mutual information, i.e. H(M)=I(X,Y) . Intuitively, we would like our hidden state to be as simple as possible (entropy wise). The data processing inequality means that H(M)>=I(X,Y) , so the mutual information is a lower bound on how simple the M could be. Furthermore, if such a construction existed it would have a nice coding interpretation — one could jointly code X and Y by first coding the mutual information, then coding X with this mutual info (without Y ) and coding Y with this mutual info (without X ). It turns out that such a construction does not exist in general (Thx Alina Beygelzimer for a counterexample! see below for the sketch). What are the implications of this? Well, it’s hard for me to say, but it does suggest to me that the ‘generative’ model philosophy might be

3 0.58726376 87 hunch net-2005-06-29-Not EM for clustering at COLT

Introduction: One standard approach for clustering data with a set of gaussians is using EM. Roughly speaking, you pick a set of k random gaussians and then use alternating expectation maximization to (hopefully) find a set of gaussians that “explain” the data well. This process is difficult to work with because EM can become “stuck” in local optima. There are various hacks like “rerun with t different random starting points”. One cool observation is that this can often be solved via other algorithms which do not suffer from local optima. This is an early paper which shows this. Ravi Kannan presented a new paper showing this is possible in a much more adaptive setting. A very rough summary of these papers is that by projecting into a lower dimensional space, it is computationally tractable to pick out the gross structure of the data. It is unclear how well these algorithms work in practice, but they might be effective, especially if used as a subroutine of the form: Projec

4 0.55742604 140 hunch net-2005-12-14-More NIPS Papers II

Introduction: I thought this was a very good NIPS with many excellent papers. The following are a few NIPS papers which I liked and I hope to study more carefully when I get the chance. The list is not exhaustive and in no particular order… Preconditioner Approximations for Probabilistic Graphical Models. Pradeep Ravikumar and John Lafferty. I thought the use of preconditioner methods from solving linear systems in the context of approximate inference was novel and interesting. The results look good and I’d like to understand the limitations. Rodeo: Sparse nonparametric regression in high dimensions. John Lafferty and Larry Wasserman. A very interesting approach to feature selection in nonparametric regression from a frequentist framework. The use of lengthscale variables in each dimension reminds me a lot of ‘Automatic Relevance Determination’ in Gaussian process regression — it would be interesting to compare Rodeo to ARD in GPs. Interpolating between types and tokens by estimating

5 0.55719614 199 hunch net-2006-07-26-Two more UAI papers of interest

Introduction: In addition to Ed Snelson’s paper, there were (at least) two other papers that caught my eye at UAI. One was this paper by Sanjoy Dasgupta, Daniel Hsu and Nakul Verma at UCSD which shows in a surprisingly general and strong way that almost all linear projections of any jointly distributed vector random variable with finite first and second moments look spherical and unimodal (in fact look like a scale mixture of Gaussians). Great result, as you’d expect from Sanjoy. The other paper which I found intriguing but which I just haven’t grokked yet is this beast by Manfred and Dima Kuzmin. You can check out the (beautiful) slides if that helps. I feel like there is something deep here, but my brain is too small to understand it. The COLT and last NIPS papers/slides are also on Manfred’s page. Hopefully someone here can illuminate.

6 0.51796079 189 hunch net-2006-07-05-more icml papers

7 0.4970952 102 hunch net-2005-08-11-Why Manifold-Based Dimension Reduction Techniques?

8 0.46779177 444 hunch net-2011-09-07-KDD and MUCMD 2011

9 0.44825009 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009

10 0.43803543 334 hunch net-2009-01-07-Interesting Papers at SODA 2009

11 0.43793902 219 hunch net-2006-11-22-Explicit Randomization in Learning algorithms

12 0.43686363 185 hunch net-2006-06-16-Regularization = Robustness

13 0.42965204 398 hunch net-2010-05-10-Aggregation of estimators, sparsity in high dimension and computational feasibility

14 0.42683941 476 hunch net-2012-12-29-Simons Institute Big Data Program

15 0.42639402 325 hunch net-2008-11-10-ICML Reviewing Criteria

16 0.42338756 77 hunch net-2005-05-29-Maximum Margin Mismatch?

17 0.41266546 446 hunch net-2011-10-03-Monday announcements

18 0.4114899 414 hunch net-2010-10-17-Partha Niyogi has died

19 0.40349397 309 hunch net-2008-07-10-Interesting papers, ICML 2008

20 0.4018006 61 hunch net-2005-04-25-Embeddings: what are they good for?


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(3, 0.027), (10, 0.018), (27, 0.179), (30, 0.016), (33, 0.042), (38, 0.033), (53, 0.039), (55, 0.083), (63, 0.019), (64, 0.011), (71, 0.011), (77, 0.025), (79, 0.269), (94, 0.061), (95, 0.055)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.88873315 248 hunch net-2007-06-19-How is Compressed Sensing going to change Machine Learning ?

Introduction: Compressed Sensing (CS) is a new framework developed by Emmanuel Candes , Terry Tao and David Donoho . To summarize, if you acquire a signal in some basis that is incoherent with the basis in which you know the signal to be sparse, it is very likely you will be able to reconstruct the signal from these incoherent projections. Terry Tao, the recent Fields medalist , does a very nice job of explaining the framework here . He goes further into the theory in this post , where he mentions the central issue of the Uniform Uncertainty Principle (UUP). It so happens that random projections are on average incoherent, in the UUP sense, with most known bases (sines, polynomials, splines, wavelets, curvelets …) and are therefore an ideal basis for Compressed Sensing. [ For more in-depth information on the subject, the Rice group has done a very good job of providing a central library of papers relevant to the growing subject: http://www.dsp.ece.rice.edu/cs/ ] The Machine

2 0.85000277 354 hunch net-2009-05-17-Server Update

Introduction: The hunch.net server has been updated. I’ve taken the opportunity to upgrade the version of wordpress which caused cascading changes. Old threaded comments are now flattened. The system we used to use ( Brian’s threaded comments ) appears incompatible with the new threading system built into wordpress. I haven’t yet figured out a workaround. I set up a feedburner account . I added an RSS aggregator for both Machine Learning and other research blogs that I like to follow. This is something that I’ve wanted to do for a while. Many other minor changes in font and format, with some help from Alina . If you have any suggestions for site tweaks, please speak up.

3 0.84266919 162 hunch net-2006-03-09-Use of Notation

Introduction: For most people, a mathematical notation is like a language: you learn it and stick with it. For people doing mathematical research, however, this is not enough: they must design new notations for new problems. The design of good notation is both hard and worthwhile since a bad initial notation can retard a line of research greatly. Before we had mathematical notation, equations were all written out in language. Since words have multiple meanings and variable precedences, long equations written out in language can be extraordinarily difficult and sometimes fundamentally ambiguous. A good representative example of this is the legalese in the tax code. Since we want greater precision and clarity, we adopt mathematical notation. One fundamental thing to understand about mathematical notation is that humans, as logic verifiers, are barely capable. This is the fundamental reason why one notation can be much better than another. This observation is easier to miss than you might

4 0.79546189 254 hunch net-2007-07-12-ICML Trends

Introduction: Mark Reid did a post on ICML trends that I found interesting.

5 0.77156699 27 hunch net-2005-02-23-Problem: Reinforcement Learning with Classification

Introduction: At an intuitive level, the question here is “Can reinforcement learning be solved with classification?” Problem Construct a reinforcement learning algorithm with near-optimal expected sum of rewards in the direct experience model given access to a classifier learning algorithm which has a small error rate or regret on all posed classification problems. The definition of “posed” here is slightly murky. I consider a problem “posed” if there is an algorithm for constructing labeled classification examples. Past Work There exists a reduction of reinforcement learning to classification given a generative model. A generative model is an inherently stronger assumption than the direct experience model. Other work on learning reductions may be important. Several algorithms for solving reinforcement learning in the direct experience model exist. Most, such as E^3 , Factored-E^3 , metric-E^3 , and Rmax, require that the observation be the state. Recent work

6 0.76331407 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

7 0.76056933 423 hunch net-2011-02-02-User preferences for search engines

8 0.62236565 343 hunch net-2009-02-18-Decision by Vetocracy

9 0.6217882 493 hunch net-2014-02-16-Metacademy: a package manager for knowledge

10 0.62110651 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

11 0.61502033 360 hunch net-2009-06-15-In Active Learning, the question changes

12 0.61333847 194 hunch net-2006-07-11-New Models

13 0.61176002 235 hunch net-2007-03-03-All Models of Learning have Flaws

14 0.61142284 36 hunch net-2005-03-05-Funding Research

15 0.60867763 225 hunch net-2007-01-02-Retrospective

16 0.60861498 95 hunch net-2005-07-14-What Learning Theory might do

17 0.6082902 220 hunch net-2006-11-27-Continuizing Solutions

18 0.60822231 351 hunch net-2009-05-02-Wielding a New Abstraction

19 0.60789943 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?

20 0.60762084 478 hunch net-2013-01-07-NYU Large Scale Machine Learning Class