nips nips2011 nips2011-246 nips2011-246-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Dmitry Pidan, Ran El-Yaniv
Abstract: Focusing on short-term trend prediction in a financial context, we consider the problem of selective prediction, whereby the predictor can abstain from prediction in order to improve performance. We examine two types of selective mechanisms for HMM predictors. The first is a rejection mechanism in the spirit of Chow’s well-known ambiguity principle. The second is a specialized mechanism for HMMs that identifies low-quality HMM states and abstains from prediction in those states. We call this model selective HMM (sHMM). In both approaches we can trade off prediction coverage to gain better accuracy in a controlled manner. We compare the performance of the ambiguity-based rejection technique with that of the sHMM approach. Our results indicate that both methods are effective, and that the sHMM model is superior.
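As a rough illustration of the coverage/accuracy trade-off described in the abstract, the following minimal sketch shows a generic ambiguity-style reject rule for binary up/down prediction. It is not the authors' implementation: the function names, the `margin` parameter, and the toy probabilities and labels are all assumptions made for this example. The predictor abstains whenever its estimated probability of an upward move is within `margin` of 0.5.

```python
from typing import List, Optional, Tuple


def reject_ambiguous(p_up: List[float], margin: float) -> List[Optional[int]]:
    """Return +1/-1 predictions, or None (abstain) when p_up is within `margin` of 0.5.

    `p_up` holds hypothetical model estimates of P(trend is up); `margin` is an
    illustrative rejection threshold, not a parameter taken from the paper.
    """
    decisions: List[Optional[int]] = []
    for p in p_up:
        if abs(p - 0.5) < margin:
            decisions.append(None)  # too ambiguous: abstain
        else:
            decisions.append(1 if p > 0.5 else -1)
    return decisions


def coverage_and_accuracy(
    decisions: List[Optional[int]], labels: List[int]
) -> Tuple[float, float]:
    """Coverage = fraction of points predicted; accuracy is measured on those points only."""
    covered = [(d, y) for d, y in zip(decisions, labels) if d is not None]
    coverage = len(covered) / len(labels)
    accuracy = (
        sum(d == y for d, y in covered) / len(covered) if covered else float("nan")
    )
    return coverage, accuracy


# Toy data (made up for illustration only).
p_up = [0.9, 0.55, 0.2, 0.48, 0.7]
labels = [1, -1, -1, 1, 1]

decisions = reject_ambiguous(p_up, margin=0.1)
coverage, accuracy = coverage_and_accuracy(decisions, labels)
print(decisions, coverage, accuracy)
```

Raising `margin` lowers coverage and, when the probability estimates are informative, tends to raise accuracy on the points that are still predicted; setting `margin=0` recovers the ordinary predictor that never abstains.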
[1] P. L. Bartlett and M. H. Wegkamp. Classification with a reject option using a hinge loss. Journal of Machine Learning Research, 9:1823–1840, 2008.
[2] L. E. Baum, T. Petrie, G. Soules, and N. Weiss. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The Annals of Mathematical Statistics, 41(1):164–171, 1970.
[3] M. Bicego, E. Grosso, and E. Otranto. A Hidden Markov Model approach to classify and predict the sign of financial local trends. SSPR, 5342:852–861, 2008.
[4] M. Brand. Coupled Hidden Markov Models for modeling interacting processes. Technical Report 405, MIT Media Lab, 1997.
[5] C. Chow. On optimum recognition error and reject tradeoff. IEEE-IT, 16:41–46, 1970.
[6] R. El-Yaniv and Y. Wiener. On the foundations of noise-free selective classification. JMLR, 11:1605–1641, May 2010.
[7] R. El-Yaniv and Y. Wiener. Agnostic selective classification. In NIPS, 2011.
[8] S. Fine, Y. Singer, and N. Tishby. The Hierarchical Hidden Markov Model: Analysis and Applications. Machine Learning, 32(1):41–62, 1998.
[9] Y. Freund, Y. Mansour, and R. E. Schapire. Generalization bounds for averaged classifiers. Annals of Statistics, 32(4):1698–1722, 2004.
[10] Z. Ghahramani and M. I. Jordan. Factorial Hidden Markov Models. Machine Learning, 29(2–3):245–273, 1997.
[11] J. Hamilton. Analysis of time series subject to changes in regime. Journal of Econometrics, 45(1–2):39–70, 1990.
[12] B. Hanczar and E. R. Dougherty. Classification with reject option in gene expression data. Bioinformatics, 24:1889–1895, 2008.
[13] D. Hsu, S. Kakade, and T. Zhang. A spectral algorithm for learning Hidden Markov Models. In COLT, 2009.
[14] A. L. Koerich. Rejection strategies for handwritten word recognition. In IWFHR, 2004.
[15] A. Krogh. Hidden Markov Models for labeled sequences. In Proceedings of the 12th IAPR ICPR’94, pages 140–144, 1994.
[16] L. R. Rabiner. A tutorial on Hidden Markov Models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), February 1989.
[17] S. Rao and J. Hong. Analysis of Hidden Markov Models and Support Vector Machines in financial applications. Technical Report UCB/EECS-2010-63, Electrical Engineering and Computer Sciences, University of California at Berkeley, 2010.
[18] L. K. Saul and M. I. Jordan. Mixed memory Markov models: Decomposing complex stochastic processes as mixtures of simpler ones. Machine Learning, 37:75–87, 1999.
[19] S. Shi and A. S. Weigend. Taking time seriously: Hidden Markov Experts applied to financial engineering. In IEEE/IAFE, pages 244–252. IEEE, 1997.
[20] S. Siddiqi, G. Gordon, and A. Moore. Fast State Discovery for HMM Model Selection and Learning. In AI-STATS, 2007.
[21] F. Tortorella. Reducing the classification cost of support vector classifiers through an ROC-based reject rule. Pattern Anal. Appl., 7:128–143, 2004.
[22] A. Viterbi. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE-IT, 13(2):260–269, 1967.
[23] Y. Zhang. Prediction of financial time series with Hidden Markov Models. Master’s thesis, The School of Computing Science, Simon Fraser University, Canada, 2004.