
193 hunch net-2006-07-09-The Stock Prediction Machine Learning Problem


meta infos for this blog

Source: html

Introduction: …is discussed in this NYTimes article. I generally expect such approaches to become more common since computers are getting faster, machine learning is getting better, and data is becoming more plentiful. This is another example where machine learning technology may have a huge economic impact. Some side notes: We-in-research know almost nothing about how these things are done (because it is typically a corporate secret). … but the limited discussion in the article seems naive from a machine learning viewpoint. The learning process used apparently often fails to take into account transaction costs. What little of the approaches is discussed appears to be modeling-based. It seems plausible that more direct prediction methods can yield an edge. One difficulty with stock picking as a research topic is that it is inherently a zero-sum game (for every winner, there is a loser). Much of the rest of research is positive-sum (basically, everyone wins).
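
To make the transaction-cost remark concrete, here is a minimal sketch of how a proportional per-trade cost can erode an apparent edge. Everything in it is an assumption for illustration: the function name `net_returns`, the cost model (a fixed fraction charged on each unit of position change), and the toy numbers. None of it comes from the article or from any real trading system.

```python
def net_returns(gross_returns, positions, cost_per_trade):
    """Return per-period profit after a proportional cost on position changes.

    gross_returns  -- asset return in each period (hypothetical data)
    positions      -- position held in each period (+1 long, -1 short, 0 flat)
    cost_per_trade -- assumed cost per unit of position change
    """
    net = []
    previous_position = 0.0
    for asset_return, position in zip(gross_returns, positions):
        turnover = abs(position - previous_position)
        net.append(position * asset_return - cost_per_trade * turnover)
        previous_position = position
    return net


# Toy example: a predictor that is right every period but flips position each time.
daily_returns = [0.002, -0.001, 0.003, -0.002]
positions = [1, -1, 1, -1]

gross_profit = sum(p * r for p, r in zip(positions, daily_returns))
net_profit = sum(net_returns(daily_returns, positions, cost_per_trade=0.001))
```

In this toy case the predictor is always right, yet most of the gross profit is consumed by the cost of flipping the position every period. A learner trained only on gross returns would not see that.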


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 I generally expect such approaches to become more common since computers are getting faster, machine learning is getting better, and data is becoming more plentiful. [sent-2, score-1.13]

2 This is another example where machine learning technology may have a huge economic impact. [sent-3, score-0.455]

3 Some side notes: We-in-research know almost nothing about how these things are done (because it is typically a corporate secret). [sent-4, score-0.474]

4 … but the limited discussion in the article seems naive from a machine learning viewpoint. [sent-5, score-0.695]

5 The learning process used apparently often fails to take into account transaction costs. [sent-6, score-0.587]

6 What little of the approaches is discussed appears modeling based. [sent-7, score-0.588]

7 It seems plausible that more direct prediction methods can yield an edge. [sent-8, score-0.422]

8 One difficulty with stock picking as a research topic is that it is inherently a zero sum game (for every winner, there is a loser). [sent-9, score-1.259]

9 Much of the rest of research is positive sum (basically, everyone wins). [sent-10, score-0.656]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('article', 0.268), ('sum', 0.244), ('nytimes', 0.203), ('transaction', 0.203), ('discussed', 0.197), ('getting', 0.195), ('secret', 0.188), ('wins', 0.177), ('corporate', 0.169), ('approaches', 0.159), ('naive', 0.152), ('economic', 0.152), ('stock', 0.147), ('apparently', 0.147), ('picking', 0.143), ('modeling', 0.14), ('winner', 0.137), ('basically', 0.134), ('notes', 0.134), ('zero', 0.131), ('computers', 0.129), ('fails', 0.129), ('direct', 0.129), ('game', 0.126), ('becoming', 0.126), ('nothing', 0.124), ('positive', 0.116), ('technology', 0.113), ('rest', 0.111), ('inherently', 0.11), ('yield', 0.11), ('faster', 0.11), ('account', 0.108), ('side', 0.108), ('limited', 0.106), ('plausible', 0.104), ('topic', 0.103), ('huge', 0.102), ('everyone', 0.096), ('difficulty', 0.092), ('appears', 0.092), ('research', 0.089), ('machine', 0.088), ('become', 0.084), ('discussion', 0.081), ('methods', 0.079), ('expect', 0.078), ('generally', 0.076), ('every', 0.074), ('almost', 0.073)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 193 hunch net-2006-07-09-The Stock Prediction Machine Learning Problem


2 0.13623413 112 hunch net-2005-09-14-The Predictionist Viewpoint

Introduction: Virtually every discipline of significant human endeavor has a way of explaining itself as fundamental and important. In all the cases I know of, they are both right (they are vital) and wrong (they are not solely vital). Politics. This is the one that everyone is familiar with at the moment. “What could be more important than the process of making decisions?” Science and Technology. This is the one that we-the-academics are familiar with. “The loss of modern science and technology would be catastrophic.” Military. “Without the military, a nation will be invaded and destroyed.” (insert your favorite here) Within science and technology, the same thing happens again. Mathematics. “What could be more important than a precise language for establishing truths?” Physics. “Nothing is more fundamental than the laws which govern the universe. Understanding them is the key to understanding everything else.” Biology. “Without life, we wouldn’t be here, so clearly the s

3 0.12096928 134 hunch net-2005-12-01-The Webscience Future

Introduction: The internet has significantly affected the way we do research but its capabilities have not yet been fully realized. First, let’s acknowledge some known effects. Self-publishing: By default, all researchers in machine learning (and more generally computer science and physics) place their papers online for anyone to download. The exact mechanism differs: physicists tend to use a central repository (Arxiv) while computer scientists tend to place the papers on their webpage. Arxiv has been slowly growing in subject breadth so it is now sometimes used by computer scientists. Collaboration: Email has enabled working remotely with coauthors. This has allowed collaborations which would not otherwise have been possible and generally speeds research. Now, let’s look at attempts to go further. Blogs (like this one) allow public discussion about topics which are not easily categorized as “a new idea in machine learning” (like this topic). Organization of some subfield

4 0.11479516 96 hunch net-2005-07-21-Six Months

Introduction: This is the 6 month point in the “run a research blog” experiment, so it seems like a good point to take stock and assess. One fundamental question is: “Is it worth it?” The idea of running a research blog will never become widely popular and useful unless it actually aids research. On the negative side, composing ideas for a post and maintaining a blog takes a significant amount of time. On the positive side, the process might yield better research because there is an opportunity for better, faster feedback implying better, faster thinking. My answer at the moment is a provisional “yes”. Running the blog has been incidentally helpful in several ways: It is sometimes educational (example). More often, the process of composing thoughts well enough to post simply aids thinking. This has resulted in a couple solutions to problems of interest (and perhaps more over time). If you really want to solve a problem, letting the world know is helpful. This isn’t necessarily

5 0.10775023 106 hunch net-2005-09-04-Science in the Government

Introduction: I found the article on “Political Science” at the New York Times interesting. Essentially the article is about allegations that the US government has been systematically distorting scientific views. With a petition by some 7000+ scientists alleging such behavior this is clearly a significant concern. One thing not mentioned explicitly in this discussion is that there are fundamental cultural differences between academic research and the rest of the world. In academic research, careful, clear thought is valued. This value is achieved by both formal and informal mechanisms. One example of a formal mechanism is peer review. In contrast, in the land of politics, the basic value is agreement. It is only with some amount of agreement that a new law can be passed or other actions can be taken. Since Science (with a capital ‘S’) has accomplished many things, it can be a significant tool in persuading people. This makes it compelling for a politician to use science as a mec

6 0.087727301 366 hunch net-2009-08-03-Carbon in Computer Science Research

7 0.086064264 157 hunch net-2006-02-18-Multiplication of Learned Probabilities is Dangerous

8 0.083586112 426 hunch net-2011-03-19-The Ideal Large Scale Learning Class

9 0.083240539 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning

10 0.080063544 335 hunch net-2009-01-08-Predictive Analytics World

11 0.079982616 178 hunch net-2006-05-08-Big machine learning

12 0.079432033 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

13 0.07844907 99 hunch net-2005-08-01-Peekaboom

14 0.077681221 120 hunch net-2005-10-10-Predictive Search is Coming

15 0.077635065 328 hunch net-2008-11-26-Efficient Reinforcement Learning in MDPs

16 0.077094533 235 hunch net-2007-03-03-All Models of Learning have Flaws

17 0.077090003 464 hunch net-2012-05-03-Microsoft Research, New York City

18 0.076557696 27 hunch net-2005-02-23-Problem: Reinforcement Learning with Classification

19 0.075203113 222 hunch net-2006-12-05-Recruitment Conferences

20 0.075160645 347 hunch net-2009-03-26-Machine Learning is too easy


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.18), (1, 0.01), (2, -0.067), (3, 0.075), (4, -0.061), (5, -0.007), (6, -0.017), (7, 0.035), (8, 0.003), (9, -0.001), (10, -0.038), (11, 0.0), (12, -0.0), (13, -0.002), (14, -0.043), (15, -0.008), (16, -0.055), (17, -0.018), (18, 0.026), (19, -0.034), (20, 0.007), (21, -0.113), (22, -0.065), (23, 0.05), (24, -0.043), (25, -0.016), (26, 0.081), (27, 0.09), (28, 0.001), (29, -0.011), (30, 0.04), (31, 0.021), (32, -0.024), (33, 0.014), (34, -0.005), (35, 0.043), (36, -0.028), (37, 0.075), (38, -0.03), (39, -0.006), (40, 0.06), (41, -0.03), (42, -0.104), (43, -0.02), (44, -0.063), (45, -0.017), (46, -0.044), (47, 0.03), (48, -0.034), (49, 0.061)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.93504792 193 hunch net-2006-07-09-The Stock Prediction Machine Learning Problem


2 0.72029871 106 hunch net-2005-09-04-Science in the Government

Introduction: I found the article on “Political Science” at the New York Times interesting. Essentially the article is about allegations that the US government has been systematically distorting scientific views. With a petition by some 7000+ scientists alleging such behavior this is clearly a significant concern. One thing not mentioned explicitly in this discussion is that there are fundamental cultural differences between academic research and the rest of the world. In academic research, careful, clear thought is valued. This value is achieved by both formal and informal mechanisms. One example of a formal mechanism is peer review. In contrast, in the land of politics, the basic value is agreement. It is only with some amount of agreement that a new law can be passed or other actions can be taken. Since Science (with a capital ‘S’) has accomplished many things, it can be a significant tool in persuading people. This makes it compelling for a politician to use science as a mec

3 0.65365434 112 hunch net-2005-09-14-The Predictionist Viewpoint

Introduction: Virtually every discipline of significant human endeavor has a way of explaining itself as fundamental and important. In all the cases I know of, they are both right (they are vital) and wrong (they are not solely vital). Politics. This is the one that everyone is familiar with at the moment. “What could be more important than the process of making decisions?” Science and Technology. This is the one that we-the-academics are familiar with. “The loss of modern science and technology would be catastrophic.” Military. “Without the military, a nation will be invaded and destroyed.” (insert your favorite here) Within science and technology, the same thing happens again. Mathematics. “What could be more important than a precise language for establishing truths?” Physics. “Nothing is more fundamental than the laws which govern the universe. Understanding them is the key to understanding everything else.” Biology. “Without life, we wouldn’t be here, so clearly the s

4 0.61607939 366 hunch net-2009-08-03-Carbon in Computer Science Research

Introduction: Al Gore’s film and gradually more assertive and thorough science have managed to mostly shift the debate on climate change from “Is it happening?” to “What should be done?” In that context, it’s worthwhile to think a bit about what can be done within computer science research. There are two things we can think about: Doing Research. At a cartoon level, computer science research consists of some combination of commuting to and from work, writing programs, running them on computers, writing papers, and presenting them at conferences. A typical computer has a power usage on the order of 100 Watts, which works out to 2.4 kiloWatt-hours/day. Looking up David MacKay’s reference on power usage per person, it becomes clear that this is a relatively minor part of the lifestyle, although it could become substantial if many more computers are required. Much larger costs are associated with commuting (which is in common with many people) and attending conferences. Since local commuti

5 0.60665661 125 hunch net-2005-10-20-Machine Learning in the News

Introduction: The New York Times had a short interview about machine learning in data mining being used pervasively by the IRS and large corporations to predict who to audit and who to target for various marketing campaigns. This is a big application area of machine learning. It can be harmful (learning + databases = another way to invade privacy) or beneficial (as Google demonstrates, better targeting of marketing campaigns is far less annoying). This is yet more evidence that we cannot rely upon “I’m just another fish in the school” logic for our expectations about treatment by government and large corporations.

6 0.60034579 241 hunch net-2007-04-28-The Coming Patent Apocalypse

7 0.57808089 134 hunch net-2005-12-01-The Webscience Future

8 0.55238658 260 hunch net-2007-08-25-The Privacy Problem

9 0.54625458 464 hunch net-2012-05-03-Microsoft Research, New York City

10 0.54272634 120 hunch net-2005-10-10-Predictive Search is Coming

11 0.54005814 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

12 0.53916162 208 hunch net-2006-09-18-What is missing for online collaborative research?

13 0.53815359 345 hunch net-2009-03-08-Prediction Science

14 0.53444153 397 hunch net-2010-05-02-What’s the difference between gambling and rewarding good prediction?

15 0.53351557 282 hunch net-2008-01-06-Research Political Issues

16 0.53066319 121 hunch net-2005-10-12-The unrealized potential of the research lab

17 0.52629542 250 hunch net-2007-06-23-Machine Learning Jobs are Growing on Trees

18 0.52498233 314 hunch net-2008-08-24-Mass Customized Medicine in the Future?

19 0.51582408 491 hunch net-2013-11-21-Ben Taskar is gone

20 0.51531172 344 hunch net-2009-02-22-Effective Research Funding


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(0, 0.353), (27, 0.135), (38, 0.061), (53, 0.073), (55, 0.093), (94, 0.162), (95, 0.016)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.9260326 87 hunch net-2005-06-29-Not EM for clustering at COLT

Introduction: One standard approach for clustering data with a set of Gaussians is using EM. Roughly speaking, you pick a set of k random Gaussians and then use alternating expectation maximization to (hopefully) find a set of Gaussians that “explain” the data well. This process is difficult to work with because EM can become “stuck” in local optima. There are various hacks like “rerun with t different random starting points”. One cool observation is that this can often be solved via other algorithms which do not suffer from local optima. This is an early paper which shows this. Ravi Kannan presented a new paper showing this is possible in a much more adaptive setting. A very rough summary of these papers is that by projecting into a lower dimensional space, it is computationally tractable to pick out the gross structure of the data. It is unclear how well these algorithms work in practice, but they might be effective, especially if used as a subroutine of the form: Projec

same-blog 2 0.88469881 193 hunch net-2006-07-09-The Stock Prediction Machine Learning Problem


3 0.88076204 473 hunch net-2012-09-29-Vowpal Wabbit, version 7.0

Introduction: A new version of VW is out. The primary changes are: Learning Reductions: I’ve wanted to get learning reductions working and we’ve finally done it. Not everything is implemented yet, but VW now supports direct: Multiclass Classification --oaa or --ect. Cost Sensitive Multiclass Classification --csoaa or --wap. Contextual Bandit Classification --cb. Sequential Structured Prediction --searn or --dagger. In addition, it is now easy to build your own custom learning reductions for various plausible uses: feature diddling, custom structured prediction problems, or alternate learning reductions. This effort is far from done, but it is now in a generally useful state. Note that all learning reductions inherit the ability to do cluster parallel learning. Library interface: VW now has a basic library interface. The library provides most of the functionality of VW, with the limitation that it is monolithic and nonreentrant. These will be improved over

4 0.85437727 215 hunch net-2006-10-22-Exemplar programming

Introduction: There are many different abstractions for problem definition and solution. Here are a few examples: Functional programming: a set of functions are defined. The composed execution of these functions yields the solution. Linear programming: a set of constraints and a linear objective function are defined. An LP solver finds the constrained optimum. Quadratic programming: Like linear programming, but the language is a little more flexible (and the solution slower). Convex programming: like quadratic programming, but the language is more flexible (and the solutions even slower). Dynamic programming: a recursive definition of the problem is defined and then solved efficiently via caching tricks. SAT programming: A problem is specified as a satisfiability involving a conjunction of a disjunction of boolean variables. A general engine attempts to find a good satisfying assignment. For example Kautz’s blackbox planner. These abstractions have different tradeoffs betw

5 0.83487773 62 hunch net-2005-04-26-To calibrate or not?

Introduction: A calibrated predictor is one which predicts the probability of a binary event with the property: For all predictions p , the proportion of the time that 1 is observed is p . Since there are infinitely many p , this definition must be “softened” to make sense for any finite number of samples. The standard method for “softening” is to consider all predictions in a small neighborhood about each possible p . A great deal of effort has been devoted to strategies for achieving calibrated (such as here ) prediction. With statements like: (under minimal conditions) you can always make calibrated predictions. Given the strength of these statements, we might conclude we are done, but that would be a “confusion of ends”. A confusion of ends arises in the following way: We want good probabilistic predictions. Good probabilistic predictions are calibrated. Therefore, we want calibrated predictions. The “Therefore” step misses the fact that calibration is a necessary b

6 0.72440863 74 hunch net-2005-05-21-What is the right form of modularity in structured prediction?

7 0.60395199 133 hunch net-2005-11-28-A question of quantification

8 0.55850863 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making

9 0.54901624 492 hunch net-2013-12-01-NIPS tutorials and Vowpal Wabbit 7.4

10 0.54555458 262 hunch net-2007-09-16-Optimizing Machine Learning Programs

11 0.54504228 276 hunch net-2007-12-10-Learning Track of International Planning Competition

12 0.54438019 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models

13 0.54335219 49 hunch net-2005-03-30-What can Type Theory teach us about Machine Learning?

14 0.53988642 286 hunch net-2008-01-25-Turing’s Club for Machine Learning

15 0.53903198 5 hunch net-2005-01-26-Watchword: Probability

16 0.53837502 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning

17 0.53656864 120 hunch net-2005-10-10-Predictive Search is Coming

18 0.53577858 263 hunch net-2007-09-18-It’s MDL Jim, but not as we know it…(on Bayes, MDL and consistency)

19 0.5310843 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

20 0.53053975 423 hunch net-2011-02-02-User preferences for search engines
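
The calibration definition quoted in the “To calibrate or not?” entry above (for all predictions p, the observed frequency of 1 is p, softened to a small neighborhood around each p) can be illustrated with a simple binned check. This is only a sketch under stated assumptions: the function name `calibration_table`, the equal-width binning, and the toy data are all hypothetical and are not taken from that post.

```python
from collections import defaultdict


def calibration_table(predictions, outcomes, num_bins=10):
    """Group predictions into equal-width bins and compare, per bin,
    the mean predicted probability with the observed frequency of 1.

    Returns {bin_index: (mean_prediction, observed_frequency, count)}.
    """
    bins = defaultdict(list)
    for p, y in zip(predictions, outcomes):
        # Clamp so that p == 1.0 falls into the last bin.
        index = min(int(p * num_bins), num_bins - 1)
        bins[index].append((p, y))

    table = {}
    for index, pairs in sorted(bins.items()):
        count = len(pairs)
        mean_prediction = sum(p for p, _ in pairs) / count
        observed_frequency = sum(y for _, y in pairs) / count
        table[index] = (mean_prediction, observed_frequency, count)
    return table


# Toy data (hypothetical): predictions of 0.25 that come true one time in four.
example = calibration_table([0.25, 0.25, 0.25, 0.25], [0, 1, 0, 0], num_bins=4)
```

A predictor is roughly calibrated on a sample when the mean prediction and the observed frequency are close in every bin that has enough examples. The bin width plays the role of the "small neighborhood" in the softened definition.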