hunch_net hunch_net-2006 hunch_net-2006-210 knowledge-graph by maker-knowledge-mining

210 hunch net-2006-09-28-Programming Languages for Machine Learning Implementations


meta info for this blog

Source: html

Introduction: Machine learning algorithms have a much better chance of being widely adopted if they are implemented in some easy-to-use code. There are several important concerns associated with machine learning which stress programming languages on the ease-of-use vs. speed frontier. Speed: The rate at which data sources are growing seems to be outstripping the rate at which computational power is growing, so it is important that we be able to eke out every bit of computational power. Garbage collected languages (java, ocaml, perl and python) often have several issues here. Garbage collection often implies that floating point numbers are “boxed”: every float is represented by a pointer to a float. Boxing can cause an order of magnitude slowdown because an extra nonlocalized memory reference is made, and accesses to main memory are many CPU cycles long. Garbage collection often implies that considerably more memory is used than is necessary. This has a variable effect: in some circumstances it results in no slowdown while in others it can cause a 4-order of magnitude slowdown.
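To make the boxing point concrete, here is a minimal OCaml sketch, not taken from the original post; the helper names (sum_array, sum_list) are illustrative only. OCaml stores a float array as a flat block of unboxed doubles, while every float reached through a list sits behind its own heap-allocated box, so the list traversal pays an extra memory indirection per element. The actual slowdown depends on the compiler, runtime, and cache behavior.

```ocaml
let sum_array (a : float array) =
  (* elements live in one contiguous, unboxed block *)
  Array.fold_left (+.) 0.0 a

let sum_list (l : float list) =
  (* each cons cell points to a separately boxed float *)
  List.fold_left (+.) 0.0 l

let () =
  let n = 1_000_000 in
  let a = Array.make n 1.0 in   (* flat storage, one allocation *)
  let l = Array.to_list a in    (* n cons cells plus n boxed floats *)
  Printf.printf "array sum: %f\nlist sum:  %f\n" (sum_array a) (sum_list l)
```

This contrast is one reason numerically heavy OCaml code tends to keep its data in float arrays (or Bigarrays) rather than in lists or polymorphic containers.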


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 There are several important concerns associated with machine learning which stress programming languages on the ease-of-use vs. speed frontier. [sent-2, score-0.591]

2 Garbage collected languages (java, ocaml, perl and python) often have several issues here. [sent-5, score-1.11]

3 Boxing can cause an order of magnitude slowdown because an extra nonlocalized memory reference is made, and accesses to main memory are many CPU cycles long. [sent-7, score-0.592]

4 Garbage collection often implies that considerably more memory is used than is necessary. [sent-8, score-0.386]

5 In some circumstances it results in no slowdown while in others it can cause a 4-order of magnitude slowdown. [sent-10, score-0.288]

6 Some of these languages are interpreted rather than executed. [sent-12, score-0.513]

7 As a rule of thumb, interpreted languages are an order of magnitude slower than executed languages. [sent-13, score-0.68]

8 Even when these languages are compiled, there are often issues with how well they are compiled. [sent-14, score-0.653]

9 Programming Ease: Ease of use of a language is very subjective because it is always easiest to use the language you are most familiar with. [sent-16, score-0.617]

10 Syntax: Syntax is often overlooked, but it can make a huge difference in the ease of both learning to program and using the language. [sent-18, score-0.36]

11 Library Support: Languages vary dramatically in terms of library support, and having the right linear algebra/graphics/IO library can make a task dramatically easier. [sent-22, score-0.52]

12 One caveat here is that when you make a speed optimization pass, you often have to avoid these primitives. [sent-29, score-0.277]

13 Scalability: Scalability is where otherwise higher level languages often break down. [sent-33, score-0.795]

14 A simple example of this is a language with file I/O built in that fails to perform correctly when the file has size 2^31 or 2^32. [sent-34, score-0.458]

15 I am particularly familiar with Ocaml which has the following scalability issues: List operations often end up consuming the stack and segfaulting. [sent-35, score-0.444] (An OCaml sketch of this failure mode and a tail-recursive workaround appears after this summary.)

16 The Unison crew were annoyed enough by this that they created their own “safelist” library with all the same interfaces as the list type. [sent-36, score-0.324]

17 However, having big arrays, arrays, and strings often becomes annoying because they have different interfaces for objects which are semantically the same. [sent-39, score-0.364]

18 At the other extreme, you can use a language which many people are familiar with such as C or Java. [sent-42, score-0.352]

19 The higher level languages often can’t execute fast and the lower level ones which can are often quite clumsy. [sent-53, score-1.167]

20 The approach I’ve taken is to first implement in a higher level language (my choice was ocaml) to test ideas and then reimplement in a lower level language (C or C++) for speed where the ideas work out. [sent-54, score-1.094]
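To illustrate the list scalability issue flagged in sentence 15, here is a minimal OCaml sketch; safe_map is a hypothetical stand-in written for this example, not the Unison safelist library itself. The classic standard-library List.map is not tail-recursive, so mapping over a multi-million-element list can consume the stack; building the result with the tail-recursive List.rev_map and then reversing it keeps stack use bounded at the cost of an extra pass.

```ocaml
(* safe_map: a tail-recursive stand-in for List.map on very long lists *)
let safe_map f l =
  (* rev_map builds the result in reverse with an accumulator (tail-recursive),
     and rev restores the original order (also tail-recursive) *)
  List.rev (List.rev_map f l)

let () =
  let n = 2_000_000 in
  let big = List.init n (fun i -> i) in
  (* List.map (fun x -> x + 1) big
     -- historically risks Stack_overflow at this length *)
  let shifted = safe_map (fun x -> x + 1) big in
  Printf.printf "last element: %d\n" (List.nth shifted (n - 1))
```

Packaging this accumulate-then-reverse pattern behind the ordinary list interface is essentially the approach a safelist-style library can take.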


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('languages', 0.414), ('garbage', 0.241), ('language', 0.208), ('syntax', 0.198), ('implemented', 0.177), ('library', 0.171), ('arrays', 0.161), ('ocaml', 0.161), ('often', 0.154), ('memory', 0.152), ('ease', 0.152), ('scalability', 0.149), ('level', 0.135), ('familiarity', 0.134), ('speed', 0.123), ('java', 0.121), ('perl', 0.121), ('programming', 0.118), ('objects', 0.111), ('magnitude', 0.106), ('interfaces', 0.099), ('slowdown', 0.099), ('interpreted', 0.099), ('higher', 0.092), ('dramatically', 0.089), ('file', 0.089), ('familiar', 0.087), ('issues', 0.085), ('cause', 0.083), ('lower', 0.083), ('collection', 0.08), ('support', 0.074), ('built', 0.072), ('algorithmic', 0.067), ('growing', 0.063), ('rule', 0.061), ('associated', 0.059), ('extreme', 0.058), ('use', 0.057), ('ideas', 0.055), ('huge', 0.054), ('consuming', 0.054), ('polymorphism', 0.054), ('eak', 0.054), ('crew', 0.054), ('compiling', 0.054), ('concise', 0.054), ('desktop', 0.054), ('matlab', 0.054), ('python', 0.054)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999952 210 hunch net-2006-09-28-Programming Languages for Machine Learning Implementations

Introduction: Machine learning algorithms have a much better chance of being widely adopted if they are implemented in some easy-to-use code. There are several important concerns associated with machine learning which stress programming languages on the ease-of-use vs. speed frontier. Speed: The rate at which data sources are growing seems to be outstripping the rate at which computational power is growing, so it is important that we be able to eke out every bit of computational power. Garbage collected languages (java, ocaml, perl and python) often have several issues here. Garbage collection often implies that floating point numbers are “boxed”: every float is represented by a pointer to a float. Boxing can cause an order of magnitude slowdown because an extra nonlocalized memory reference is made, and accesses to main memory are many CPU cycles long. Garbage collection often implies that considerably more memory is used than is necessary. This has a variable effect: in some circumstances it results in no slowdown while in others it can cause a 4-order of magnitude slowdown.

2 0.36095604 84 hunch net-2005-06-22-Languages of Learning

Introduction: A language is a set of primitives which can be combined to successfully create complex objects. Languages arise in all sorts of situations: mechanical construction, martial arts, communication, etc… Languages appear to be the key to successfully creating complex objects—it is difficult to come up with any convincing example of a complex object which is not built using some language. Since languages are so crucial to success, it is interesting to organize various machine learning research programs by language. The most common languages in machine learning are languages for representing the solution to machine learning. This includes: Bayes Nets and Graphical Models A language for representing probability distributions. The key concept supporting modularity is conditional independence. Michael Kearns has been working on extending this to game theory. Kernelized Linear Classifiers A language for representing linear separators, possibly in a large space. The key form of

3 0.21994746 262 hunch net-2007-09-16-Optimizing Machine Learning Programs

Introduction: Machine learning is often computationally bounded which implies that the ability to write fast code becomes important if you ever want to implement a machine learning algorithm. Basic tactical optimizations are covered well elsewhere, but I haven’t seen a reasonable guide to higher level optimizations, which are the most important in my experience. Here are some of the higher level optimizations I’ve often found useful. Algorithmic Improvement First. This is Hard, but it is the most important consideration, and typically yields the most benefits. Good optimizations here are publishable. In the context of machine learning, you should be familiar with the arguments for online vs. batch learning. Choice of Language. There are many arguments about the choice of language. Sometimes you don’t have a choice when interfacing with other people. Personally, I favor C/C++ when I want to write fast code. This (admittedly) makes me a slower programmer than when using higher lev

4 0.14791965 49 hunch net-2005-03-30-What can Type Theory teach us about Machine Learning?

Introduction: This post is some combination of belaboring the obvious and speculating wildly about the future. The basic issue to be addressed is how to think about machine learning in terms given to us from Programming Language theory. Types and Reductions John’s research programme (I feel this should be in British spelling to reflect the grandiosity of the idea…) of machine learning reductions StateOfReduction is at some essential level type-theoretic in nature. The fundamental elements are the classifier, a function f: alpha -> beta, and the corresponding classifier trainer g: List of (alpha,beta) -> (alpha -> beta). The research goal is to create *combinators* that produce new f’s and g’s given existing ones. John (probably quite rightly) seems unwilling at the moment to commit to any notion stronger than these combinators are correctly typed. One way to see the result of a reduction is something typed like: (For those denied the joy of the Hindley-Milner type system, “simple” is probab

5 0.13817945 215 hunch net-2006-10-22-Exemplar programming

Introduction: There are many different abstractions for problem definition and solution. Here are a few examples: Functional programming: a set of functions are defined. The composed execution of these functions yields the solution. Linear programming: a set of constraints and a linear objective function are defined. An LP solver finds the constrained optimum. Quadratic programming: Like linear programming, but the language is a little more flexible (and the solution slower). Convex programming: like quadratic programming, but the language is more flexible (and the solutions even slower). Dynamic programming: a recursive definition of the problem is defined and then solved efficiently via caching tricks. SAT programming: A problem is specified as a satisfiability involving a conjunction of a disjunction of boolean variables. A general engine attempts to find a good satisfying assignment. For example Kautz’s blackbox planner. These abstractions have different tradeoffs betw

6 0.11837821 35 hunch net-2005-03-04-The Big O and Constants in Learning

7 0.095744044 58 hunch net-2005-04-21-Dynamic Programming Generalizations and Their Use

8 0.09411446 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

9 0.093684785 277 hunch net-2007-12-12-Workshop Summary—Principles of Learning Problem Design

10 0.093467861 235 hunch net-2007-03-03-All Models of Learning have Flaws

11 0.091247901 120 hunch net-2005-10-10-Predictive Search is Coming

12 0.090709478 473 hunch net-2012-09-29-Vowpal Wabbit, version 7.0

13 0.089924589 454 hunch net-2012-01-30-ICML Posters and Scope

14 0.089754328 229 hunch net-2007-01-26-Parallel Machine Learning Problems

15 0.083782181 419 hunch net-2010-12-04-Vowpal Wabbit, version 5.0, and the second heresy

16 0.081573322 301 hunch net-2008-05-23-Three levels of addressing the Netflix Prize

17 0.08030156 237 hunch net-2007-04-02-Contextual Scaling

18 0.079331696 60 hunch net-2005-04-23-Advantages and Disadvantages of Bayesian Learning

19 0.078256987 70 hunch net-2005-05-12-Math on the Web

20 0.078116529 128 hunch net-2005-11-05-The design of a computing cluster


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.195), (1, 0.041), (2, -0.065), (3, 0.053), (4, 0.004), (5, -0.002), (6, -0.023), (7, 0.018), (8, -0.011), (9, -0.008), (10, -0.151), (11, -0.078), (12, -0.018), (13, -0.002), (14, 0.007), (15, -0.09), (16, 0.041), (17, 0.011), (18, 0.011), (19, -0.102), (20, 0.078), (21, -0.006), (22, -0.016), (23, -0.002), (24, -0.047), (25, 0.001), (26, -0.012), (27, -0.008), (28, -0.005), (29, 0.08), (30, -0.119), (31, 0.064), (32, 0.197), (33, 0.205), (34, 0.005), (35, 0.04), (36, -0.011), (37, -0.022), (38, -0.115), (39, -0.122), (40, 0.046), (41, -0.053), (42, 0.088), (43, -0.133), (44, -0.093), (45, -0.051), (46, 0.017), (47, -0.039), (48, 0.039), (49, 0.163)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96043801 210 hunch net-2006-09-28-Programming Languages for Machine Learning Implementations

Introduction: Machine learning algorithms have a much better chance of being widely adopted if they are implemented in some easy-to-use code. There are several important concerns associated with machine learning which stress programming languages on the ease-of-use vs. speed frontier. Speed: The rate at which data sources are growing seems to be outstripping the rate at which computational power is growing, so it is important that we be able to eke out every bit of computational power. Garbage collected languages (java, ocaml, perl and python) often have several issues here. Garbage collection often implies that floating point numbers are “boxed”: every float is represented by a pointer to a float. Boxing can cause an order of magnitude slowdown because an extra nonlocalized memory reference is made, and accesses to main memory are many CPU cycles long. Garbage collection often implies that considerably more memory is used than is necessary. This has a variable effect: in some circumstances it results in no slowdown while in others it can cause a 4-order of magnitude slowdown.

2 0.86524481 84 hunch net-2005-06-22-Languages of Learning

Introduction: A language is a set of primitives which can be combined to successfully create complex objects. Languages arise in all sorts of situations: mechanical construction, martial arts, communication, etc… Languages appear to be the key to successfully creating complex objects—it is difficult to come up with any convincing example of a complex object which is not built using some language. Since languages are so crucial to success, it is interesting to organize various machine learning research programs by language. The most common languages in machine learning are languages for representing the solution to machine learning. This includes: Bayes Nets and Graphical Models A language for representing probability distributions. The key concept supporting modularity is conditional independence. Michael Kearns has been working on extending this to game theory. Kernelized Linear Classifiers A language for representing linear separators, possibly in a large space. The key form of

3 0.69940764 262 hunch net-2007-09-16-Optimizing Machine Learning Programs

Introduction: Machine learning is often computationally bounded which implies that the ability to write fast code becomes important if you ever want to implement a machine learning algorithm. Basic tactical optimizations are covered well elsewhere, but I haven’t seen a reasonable guide to higher level optimizations, which are the most important in my experience. Here are some of the higher level optimizations I’ve often found useful. Algorithmic Improvement First. This is Hard, but it is the most important consideration, and typically yields the most benefits. Good optimizations here are publishable. In the context of machine learning, you should be familiar with the arguments for online vs. batch learning. Choice of Language. There are many arguments about the choice of language. Sometimes you don’t have a choice when interfacing with other people. Personally, I favor C/C++ when I want to write fast code. This (admittedly) makes me a slower programmer than when using higher lev

4 0.63459444 128 hunch net-2005-11-05-The design of a computing cluster

Introduction: This is about the design of a computing cluster from the viewpoint of applied machine learning using current technology. We just built a small one at TTI so this is some evidence of what is feasible and thoughts about the design choices. Architecture There are several architectural choices. AMD Athlon64 based system. This seems to have the cheapest bang/buck. Maximum RAM is typically 2-3GB. AMD Opteron based system. Opterons provide the additional capability to buy an SMP motherboard with two chips, and the motherboards often support 16GB of RAM. The RAM is also the more expensive error correcting type. Intel PIV or Xeon based system. The PIV and Xeon based systems are the intel analog of the above 2. Due to architectural design reasons, these chips tend to run a bit hotter and be a bit more expensive. Dual core chips. Both Intel and AMD have chips that actually have 2 processors embedded in them. In the end, we decided to go with option (2). Roughly speaking,

5 0.60547161 215 hunch net-2006-10-22-Exemplar programming

Introduction: There are many different abstractions for problem definition and solution. Here are a few examples: Functional programming: a set of functions are defined. The composed execution of these functions yields the solution. Linear programming: a set of constraints and a linear objective function are defined. An LP solver finds the constrained optimum. Quadratic programming: Like linear programming, but the language is a little more flexible (and the solution slower). Convex programming: like quadratic programming, but the language is more flexible (and the solutions even slower). Dynamic programming: a recursive definition of the problem is defined and then solved efficiently via caching tricks. SAT programming: A problem is specified as a satisfiability involving a conjunction of a disjunction of boolean variables. A general engine attempts to find a good satisfying assignment. For example Kautz’s blackbox planner. These abstractions have different tradeoffs betw

6 0.55080074 49 hunch net-2005-03-30-What can Type Theory teach us about Machine Learning?

7 0.52921695 171 hunch net-2006-04-09-Progress in Machine Translation

8 0.48866585 229 hunch net-2007-01-26-Parallel Machine Learning Problems

9 0.46840522 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning

10 0.4423202 162 hunch net-2006-03-09-Use of Notation

11 0.43799725 152 hunch net-2006-01-30-Should the Input Representation be a Vector?

12 0.43404999 147 hunch net-2006-01-08-Debugging Your Brain

13 0.43297195 277 hunch net-2007-12-12-Workshop Summary—Principles of Learning Problem Design

14 0.42746016 70 hunch net-2005-05-12-Math on the Web

15 0.42496866 122 hunch net-2005-10-13-Site tweak

16 0.42439115 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

17 0.4231829 37 hunch net-2005-03-08-Fast Physics for Learning

18 0.40782481 250 hunch net-2007-06-23-Machine Learning Jobs are Growing on Trees

19 0.40386292 366 hunch net-2009-08-03-Carbon in Computer Science Research

20 0.40383732 228 hunch net-2007-01-15-The Machine Learning Department


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(0, 0.031), (1, 0.024), (10, 0.018), (27, 0.145), (38, 0.064), (49, 0.023), (51, 0.024), (53, 0.046), (55, 0.058), (64, 0.329), (94, 0.119), (95, 0.031)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.9215579 442 hunch net-2011-08-20-The Large Scale Learning Survey Tutorial

Introduction: Ron Bekkerman initiated an effort to create an edited book on parallel machine learning that Misha and I have been helping with. The breadth of efforts to parallelize machine learning surprised me: I was only aware of a small fraction initially. This put us in a unique position, with knowledge of a wide array of different efforts, so it is natural to put together a survey tutorial on the subject of parallel learning for KDD , tomorrow. This tutorial is not limited to the book itself however, as several interesting new algorithms have come out since we started inviting chapters. This tutorial should interest anyone trying to use machine learning on significant quantities of data, anyone interested in developing algorithms for such, and of course who has bragging rights to the fastest learning algorithm on planet earth (Also note the Modeling with Hadoop tutorial just before ours which deals with one way of trying to speed up learning algorithms. We have almost no

2 0.9057045 155 hunch net-2006-02-07-Pittsburgh Mind Reading Competition

Introduction: Francisco Pereira points out a fun Prediction Competition . Francisco says: DARPA is sponsoring a competition to analyze data from an unusual functional Magnetic Resonance Imaging experiment. Subjects watch videos inside the scanner while fMRI data are acquired. Unbeknownst to these subjects, the videos have been seen by a panel of other subjects that labeled each instant with labels in categories such as representation (are there tools, body parts, motion, sound), location, presence of actors, emotional content, etc. The challenge is to predict all of these different labels on an instant-by-instant basis from the fMRI data. A few reasons why this is particularly interesting: This is beyond the current state of the art, but not inconceivably hard. This is a new type of experiment design current analysis methods cannot deal with. This is an opportunity to work with a heavily examined and preprocessed neuroimaging dataset. DARPA is offering prizes!

same-blog 3 0.86103797 210 hunch net-2006-09-28-Programming Languages for Machine Learning Implementations

Introduction: Machine learning algorithms have a much better chance of being widely adopted if they are implemented in some easy-to-use code. There are several important concerns associated with machine learning which stress programming languages on the ease-of-use vs. speed frontier. Speed: The rate at which data sources are growing seems to be outstripping the rate at which computational power is growing, so it is important that we be able to eke out every bit of computational power. Garbage collected languages (java, ocaml, perl and python) often have several issues here. Garbage collection often implies that floating point numbers are “boxed”: every float is represented by a pointer to a float. Boxing can cause an order of magnitude slowdown because an extra nonlocalized memory reference is made, and accesses to main memory are many CPU cycles long. Garbage collection often implies that considerably more memory is used than is necessary. This has a variable effect: in some circumstances it results in no slowdown while in others it can cause a 4-order of magnitude slowdown.

4 0.82965112 291 hunch net-2008-03-07-Spock Challenge Winners

Introduction: The spock challenge for named entity recognition was won by Berno Stein , Sven Eissen, Tino Rub, Hagen Tonnies, Christof Braeutigam, and Martin Potthast .

5 0.81775343 420 hunch net-2010-12-26-NIPS 2010

Introduction: I enjoyed attending NIPS this year, with several things interesting me. For the conference itself: Peter Welinder , Steve Branson , Serge Belongie , and Pietro Perona , The Multidimensional Wisdom of Crowds . This paper is about using mechanical turk to get label information, with results superior to a majority vote approach. David McAllester , Tamir Hazan , and Joseph Keshet Direct Loss Minimization for Structured Prediction . This is about another technique for directly optimizing the loss in structured prediction, with an application to speech recognition. Mohammad Saberian and Nuno Vasconcelos Boosting Classifier Cascades . This is about an algorithm for simultaneously optimizing loss and computation in a classifier cascade construction. There were several other papers on cascades which are worth looking at if interested. Alan Fern and Prasad Tadepalli , A Computational Decision Theory for Interactive Assistants . This paper carves out some

6 0.81359428 277 hunch net-2007-12-12-Workshop Summary—Principles of Learning Problem Design

7 0.79261762 18 hunch net-2005-02-12-ROC vs. Accuracy vs. AROC

8 0.57568777 343 hunch net-2009-02-18-Decision by Vetocracy

9 0.57349735 136 hunch net-2005-12-07-Is the Google way the way for machine learning?

10 0.568932 49 hunch net-2005-03-30-What can Type Theory teach us about Machine Learning?

11 0.56497699 426 hunch net-2011-03-19-The Ideal Large Scale Learning Class

12 0.54520541 424 hunch net-2011-02-17-What does Watson mean?

13 0.54413038 351 hunch net-2009-05-02-Wielding a New Abstraction

14 0.5376426 423 hunch net-2011-02-02-User preferences for search engines

15 0.53655064 191 hunch net-2006-07-08-MaxEnt contradicts Bayes Rule?

16 0.53366941 256 hunch net-2007-07-20-Motivation should be the Responsibility of the Reviewer

17 0.53351969 262 hunch net-2007-09-16-Optimizing Machine Learning Programs

18 0.53041065 301 hunch net-2008-05-23-Three levels of addressing the Netflix Prize

19 0.53025144 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

20 0.52966809 131 hunch net-2005-11-16-The Everything Ensemble Edge