nips nips2006 nips2006-140 nips2006-140-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Murat Dundar, Balaji Krishnapuram, R. B. Rao, Glenn M. Fung
Abstract: Many computer-aided diagnosis (CAD) problems can be best modelled as a multiple-instance learning (MIL) problem with unbalanced data: i.e., the training data typically consists of a few positive bags and a very large number of negative instances. Existing MIL algorithms are much too computationally expensive for these datasets. We describe CH, a framework for learning a Convex Hull representation of multiple instances that is significantly faster than existing MIL algorithms. Our CH framework applies to any standard hyperplane-based learning algorithm and, for some algorithms, is guaranteed to find the global optimal solution. Experimental studies on two different CAD applications further demonstrate that the proposed algorithm significantly improves diagnostic accuracy when compared to both MIL and traditional classifiers. Although not designed for standard MIL problems (which have both positive and negative bags and relatively balanced datasets), comparisons against other MIL methods on benchmark problems also indicate that the proposed method is competitive with the state-of-the-art.
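The abstract does not spell out the formulation, so the following is only a minimal sketch of the general idea, assuming a bag is summarized by one point in the convex hull of its instances (non-negative weights that sum to 1) and scored by a linear hyperplane. The names `hull_point` and `hyperplane_score`, the weights `lam`, and the toy data are hypothetical and are not taken from the paper.

```python
from math import isclose


def hull_point(bag, lam):
    """Return sum_j lam[j] * bag[j], a point in the convex hull of the bag.

    Assumption for illustration: `lam` holds one weight per instance,
    all weights are non-negative, and they sum to 1.
    """
    if len(bag) != len(lam):
        raise ValueError("one weight per instance is required")
    if any(l < 0 for l in lam) or not isclose(sum(lam), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    dim = len(bag[0])
    return [sum(l * x[d] for l, x in zip(lam, bag)) for d in range(dim)]


def hyperplane_score(x, w, b):
    """Signed score w.x + b of a point under a linear classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Toy bag of three 2-D instances and hypothetical convex weights.
bag = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
lam = [0.5, 0.25, 0.25]
w, b = [1.0, 1.0], -0.5

rep = hull_point(bag, lam)
rep_score = hyperplane_score(rep, w, b)
instance_scores = [hyperplane_score(x, w, b) for x in bag]
```

Because the score is linear in the point, the score of the representative equals the same convex combination of the instance scores, so it always lies between the lowest and highest instance score in the bag. How the weights are actually learned is not stated in the abstract and is left out of this sketch.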
[1] O. L. Mangasarian and E. W. Wild. Multiple instance classification via successive linear programming. Technical Report 05-02, Data Mining Institute, Univ of Wisconsin, Madison, 2005.
[2] V. N. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, 1995.
[3] O. L. Mangasarian. Generalized support vector machines. In A. Smola, P. Bartlett, B. Schölkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers, pages 135–146, Cambridge, MA, 2000. MIT Press. ftp://ftp.cs.wisc.edu/math-prog/tech-reports/98-14.ps.
[4] Sebastian Mika, Gunnar Rätsch, and Klaus-Robert Müller. A mathematical programming approach to the kernel Fisher algorithm. In NIPS, pages 591–597, 2000.
[5] J. Bezdek and R. Hathaway. Convergence of alternating optimization. Neural, Parallel Sci. Comput., 11(4):351–368, 2003.
[6] J. Warga. Minimizing certain convex functions. Journal of SIAM on Applied Mathematics, 11:588–593, 1963.
[7] Y.-J. Lee and O. L. Mangasarian. RSVM: Reduced support vector machines. Technical Report 00-07, Data Mining Institute, Computer Sciences Department, University of Wisconsin, Madison, Wisconsin, July 2000. Proceedings of the First SIAM International Conference on Data Mining, Chicago, April 5-7, 2001, CD-ROM Proceedings. ftp://ftp.cs.wisc.edu/pub/dmi/tech-reports/00-07.ps.
[8] Q. Zhang and S. Goldman. EM-DD: An improved multiple-instance learning technique. In Advances in Neural Information Processing Systems, volume 13. The MIT Press, 2001.
[9] Thomas G. Dietterich, Richard H. Lathrop, and Tomas Lozano-Perez. Solving the multiple instance problem with axis-parallel rectangles. Artificial Intelligence, 89(1-2):31–71, 1997.
[10] Z. Zhou and M. Zhang. Ensembles of multi-instance learners. In Proceedings of the 14th European Conference on Machine Learning, LNAI 2837, pages 492–502, Cavtat-Dubrovnik, Croatia, 2003. Springer.
[11] M. Quist, H. Bouma, C. Van Kuijk, O. Van Delden, and F. Gerritsen. Computer aided detection of pulmonary embolism on multi-detector CT, 2004.
[12] C. Zhou, L. M. Hadjiiski, B. Sahiner, H.-P. Chan, S. Patel, P. Cascade, E. A. Kazerooni, and J. Wei. Computerized detection of pulmonary embolism in 3D computed tomographic (CT) images: vessel tracking and segmentation techniques. In M. Sonka and J. M. Fitzpatrick, editors, Medical Imaging 2003: Image Processing, Proceedings of the SPIE, volume 5032, pages 1613–1620, May 2003.
[13] D. Jemal, R. Tiwari, T. Murray, A. Ghafoor, A. Samuels, E. Ward, E. Feuer, and M. Thun. Cancer statistics, 2004.
[14] L. Bogoni, P. Cathier, M. Dundar, A. Jerebko, S. Lakare, J. Liang, S. Periaswamy, M. Baker, and M. Macari. CAD for colonography: A tool to address a growing need. British Journal of Radiology, 78:57–62, 2005.
[15] S. Andrews, I. Tsochantaridis, and T. Hofmann. Support vector machines for multiple-instance learning. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15, pages 561–568. MIT Press, Cambridge, MA, 2003.
[16] Oded Maron and Tomás Lozano-Pérez. A framework for multiple-instance learning. In Michael I. Jordan, Michael J. Kearns, and Sara A. Solla, editors, Advances in Neural Information Processing Systems, volume 10. The MIT Press, 1998.
[17] J. Ramon and L. De Raedt. Multi instance neural networks, 2000.
[18] V. Vural, G. Fung, B. Krishnapuram, J. G. Dy, and R. B. Rao. Batch classification with applications in computer aided diagnosis. In Proceedings of the ECML’06, Berlin, Germany, 2006.