nips nips2004 nips2004-143 nips2004-143-reference knowledge-graph by maker-knowledge-mining

143 nips-2004-PAC-Bayes Learning of Conjunctions and Classification of Gene-Expression Data


Source: pdf

Author: Mario Marchand, Mohak Shah

Abstract: We propose a “soft greedy” learning algorithm for building small conjunctions of simple threshold functions, called rays, defined on single real-valued attributes. We also propose a PAC-Bayes risk bound which is minimized for classifiers achieving a non-trivial tradeoff between sparsity (the number of rays used) and the magnitude of the separating margin of each ray. Finally, we test the soft greedy algorithm on four DNA micro-array data sets. 1


reference text

U. Alon, N. Barkai, D.A. Notterman, K. Gish, S. Ybarra, D. Mack, and A.J. Levine. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. PNAS USA, 96:6745–6750, 1999. C. Ambroise and G. J. McLachlan. Selection bias in gene extraction on the basis of microarray gene-expression data. Proc. Natl. Acad. Sci. USA, 99:6562–6566, 2002. T. S. Furey, N. Cristianini, N. Duffy, D. W. Bednarski, M. Schummer, and D. Haussler. Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics, 16:906–914, 2000. T.R. Golub, D.K. Slonim, and Many More Authors. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286:531– 537, 1999. I. Guyon, J. Weston, S. Barnhill, and V. Vapnik. Gene selection for cancer classification using support vector machines. Machine Learning, 46:389–422, 2002. D. Haussler. Quantifying inductive bias: AI learning algorithms and Valiant’s learning framework. Artificial Intelligence, 36:177–221, 1988. John Langford. Tutorial on practical prediction theory for classification. http://hunch.net/~jl/projects/prediction_bounds/tutorial/tutorial.ps, 2003. N. Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2(4):285–318, 1988. Mario Marchand and John Shawe-Taylor. The set covering machine. Journal of Machine Learning Reasearch, 3:723–746, 2002. David McAllester. Some PAC-Bayesian theorems. Machine Learning, 37:355–363, 1999. David McAllester. PAC-Bayesian stochastic model selection. Machine Learning, 51:5–21, 2003. A priliminary version appeared in proceedings of COLT’99. S. L. Pomeroy, P. Tamayo, and Many More Authors. Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature, 415:436–442, 2002. Matthias Seeger. PAC-Bayesian generalization bounds for gaussian processes. Journal of Machine Learning Research, 3:233–269, 2002.