
143 nips-2004-PAC-Bayes Learning of Conjunctions and Classification of Gene-Expression Data

Source: pdf

Author: Mario Marchand, Mohak Shah

Abstract: We propose a “soft greedy” learning algorithm for building small conjunctions of simple threshold functions, called rays, defined on single real-valued attributes. We also propose a PAC-Bayes risk bound which is minimized for classifiers achieving a non-trivial tradeoff between sparsity (the number of rays used) and the magnitude of the separating margin of each ray. Finally, we test the soft greedy algorithm on four DNA micro-array data sets.
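
To make the classifier family in the abstract concrete, here is a minimal Python sketch of a conjunction of rays. It shows only how such a classifier could be evaluated on one example; it is not the soft greedy learning algorithm and does not compute the PAC-Bayes bound. The names `Ray`, `attribute`, `threshold`, `direction`, and `conjunction`, as well as the strict-inequality convention, are assumptions made for illustration and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Ray:
    """Threshold function on a single real-valued attribute (hypothetical representation)."""
    attribute: int    # index of the attribute the ray inspects
    threshold: float  # cut point on that attribute
    direction: int    # +1: true when x[attribute] > threshold; -1: true when x[attribute] < threshold

    def __call__(self, x: Sequence[float]) -> bool:
        # Assumed convention: strict inequality, so a value exactly at the threshold is rejected.
        return self.direction * (x[self.attribute] - self.threshold) > 0


def conjunction(rays: Sequence[Ray], x: Sequence[float]) -> bool:
    """Return True only if every ray accepts x (an empty conjunction accepts everything)."""
    return all(ray(x) for ray in rays)


# Illustrative values only; they are not taken from any data set in the paper.
rays = [
    Ray(attribute=0, threshold=1.5, direction=+1),
    Ray(attribute=2, threshold=0.0, direction=-1),
]
print(conjunction(rays, [2.0, 7.0, -0.3]))
print(conjunction(rays, [2.0, 7.0, 0.4]))
```

In this sketch, sparsity corresponds to the length of `rays`, and each ray reads exactly one attribute, which is why a short conjunction can be inspected attribute by attribute.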

reference text

U. Alon, N. Barkai, D. A. Notterman, K. Gish, S. Ybarra, D. Mack, and A. J. Levine. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Natl. Acad. Sci. USA, 96:6745–6750, 1999.

C. Ambroise and G. J. McLachlan. Selection bias in gene extraction on the basis of microarray gene-expression data. Proc. Natl. Acad. Sci. USA, 99:6562–6566, 2002.

T. S. Furey, N. Cristianini, N. Duffy, D. W. Bednarski, M. Schummer, and D. Haussler. Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics, 16:906–914, 2000.

T. R. Golub, D. K. Slonim, and Many More Authors. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286:531–537, 1999.

I. Guyon, J. Weston, S. Barnhill, and V. Vapnik. Gene selection for cancer classification using support vector machines. Machine Learning, 46:389–422, 2002.

D. Haussler. Quantifying inductive bias: AI learning algorithms and Valiant’s learning framework. Artificial Intelligence, 36:177–221, 1988.

John Langford. Tutorial on practical prediction theory for classification. http://hunch.net/~jl/projects/prediction_bounds/tutorial/tutorial.ps, 2003.

N. Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2(4):285–318, 1988.

Mario Marchand and John Shawe-Taylor. The set covering machine. Journal of Machine Learning Research, 3:723–746, 2002.

David McAllester. Some PAC-Bayesian theorems. Machine Learning, 37:355–363, 1999.

David McAllester. PAC-Bayesian stochastic model selection. Machine Learning, 51:5–21, 2003. A preliminary version appeared in proceedings of COLT’99.

S. L. Pomeroy, P. Tamayo, and Many More Authors. Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature, 415:436–442, 2002.

Matthias Seeger. PAC-Bayesian generalization bounds for Gaussian processes. Journal of Machine Learning Research, 3:233–269, 2002.