
27 nips-2011-Advice Refinement in Knowledge-Based SVMs



Author: Gautam Kunapuli, Richard Maclin, Jude W. Shavlik

Abstract: Knowledge-based support vector machines (KBSVMs) incorporate advice from domain experts, which can improve generalization significantly. A major limitation that has not been fully addressed occurs when the expert advice is imperfect, which can lead to poorer models. We propose a model that extends KBSVMs and is able not only to learn from data and advice, but also to simultaneously improve the advice. The proposed approach is particularly effective for knowledge discovery in domains with few labeled examples. The proposed model contains bilinear constraints, and is solved using two iterative approaches: successive linear programming and a constrained concave-convex approach. Experimental results demonstrate that these algorithms yield useful refinements to expert advice, as well as improve the overall performance of the learning algorithm.
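The abstract notes that the refinement model has bilinear constraints, which make the joint problem non-convex; both successive linear programming and the concave-convex procedure handle this by fixing one block of variables so the remaining subproblem becomes convex, then alternating. The following toy sketch (not the paper's actual formulation; the objective and variable names are illustrative assumptions) shows that alternating pattern on a simple objective with a bilinear coupling term, where each one-variable subproblem has a closed-form minimizer:

```python
# Toy alternating-minimization sketch. Bilinear couplings (like the
# advice-refinement terms in the paper's model) are non-convex jointly,
# but convex in each variable block with the other fixed. We illustrate
# with f(a, b) = (a*b - 1)^2 + (a - 2)^2 + (b - 2)^2, whose one-variable
# subproblems are quadratics with closed-form solutions.

def f(a, b):
    """Objective with a bilinear coupling a*b inside the first term."""
    return (a * b - 1) ** 2 + (a - 2) ** 2 + (b - 2) ** 2

def alternate(a=0.0, b=0.0, iters=200):
    """Alternate exact minimization: over a with b fixed, then over b.

    Setting df/da = 2b(a*b - 1) + 2(a - 2) = 0 gives
    a = (b + 2) / (b^2 + 1), and symmetrically for b.
    Each step solves its subproblem exactly, so f never increases.
    """
    for _ in range(iters):
        a = (b + 2) / (b * b + 1)   # argmin over a, b fixed
        b = (a + 2) / (a * a + 1)   # argmin over b, a fixed
    return a, b
```

By symmetry the iteration converges to a = b = 2^(1/3), a stationary point of f; the monotone decrease of the objective under exact block minimization is the same property that drives the convergence of the successive linear programming and constrained concave-convex schemes described in the abstract.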


References

[1] K. P. Bennett and E. J. Bredensteiner. A parametric optimization method for machine learning. INFORMS Journal on Computing, 9(3):311–318, 1997.

[2] K. P. Bennett and O. L. Mangasarian. Bilinear separation of two sets in n-space. Computational Optimization and Applications, 2:207–227, 1993.

[3] M. W. Craven and J. W. Shavlik. Extracting tree-structured representations of trained networks. In Advances in Neural Information Processing Systems, volume 8, pages 24–30, 1996.

[4] A. Frank and A. Asuncion. UCI machine learning repository, 2010.

[5] G. Fung, O. L. Mangasarian, and J. W. Shavlik. Knowledge-based nonlinear kernel classifiers. In Sixteenth Annual Conference on Learning Theory, pages 102–113, 2003.

[6] G. Fung, O. L. Mangasarian, and J. W. Shavlik. Knowledge-based support vector classifiers. In Advances in Neural Information Processing Systems, volume 15, pages 521–528, 2003.

[7] G. Fung, S. Sandilya, and R. B. Rao. Rule extraction from linear support vector machines. In Proc. Eleventh ACM SIGKDD Intl. Conference on Knowledge Discovery in Data Mining, pages 32–40, 2005.

[8] M. I. Harris, K. M. Flegal, C. C. Cowie, M. S. Eberhardt, D. E. Goldstein, R. R. Little, H. M. Wiedmeyer, and D. D. Byrd-Holt. Prevalence of diabetes, impaired fasting glucose, and impaired glucose tolerance in U.S. adults. Diabetes Care, 21(4):518–524, 1998.

[9] G. Kunapuli, K. P. Bennett, A. Shabbeer, R. Maclin, and J. W. Shavlik. Online knowledge-based support vector machines. In Proc. of the European Conference on Machine Learning, pages 145–161, 2010.

[10] F. Lauer and G. Bloch. Incorporating prior knowledge in support vector machines for classification: A review. Neurocomputing, 71(7–9):1578–1594, 2008.

[11] Q. V. Le, A. J. Smola, and T. Gärtner. Simpler knowledge-based support vector machines. In Proceedings of the Twenty-Third International Conference on Machine Learning, pages 521–528, 2006.

[12] R. Maclin, E. W. Wild, J. W. Shavlik, L. Torrey, and T. Walker. Refining rules incorporated into knowledge-based support vector learners via successive linear programming. In AAAI Twenty-Second Conference on Artificial Intelligence, pages 584–589, 2007.

[13] O. L. Mangasarian, J. W. Shavlik, and E. W. Wild. Knowledge-based kernel approximation. Journal of Machine Learning Research, 5:1127–1141, 2004.

[14] O. L. Mangasarian and E. W. Wild. Nonlinear knowledge-based classification. IEEE Transactions on Neural Networks, 19(10):1826–1832, 2008.

[15] M. E. Pavkov, R. L. Hanson, W. C. Knowler, P. H. Bennett, J. Krakoff, and R. G. Nelson. Changing patterns of Type 2 diabetes incidence among Pima Indians. Diabetes Care, 30(7):1758–1763, 2007.

[16] M. Pazzani and D. Kibler. The utility of knowledge in inductive learning. Mach. Learn., 9:57–94, 1992.

[17] B. Schölkopf, P. Simard, A. Smola, and V. Vapnik. Prior knowledge in support vector kernels. In Advances in Neural Information Processing Systems, volume 10, pages 640–646, 1998.

[18] J. W. Smith, J. E. Everhart, W. C. Dickson, W. C. Knowler, and R. S. Johannes. Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proc. of the Symposium on Comp. Apps. and Medical Care, pages 261–265. IEEE Computer Society Press, 1988.

[19] A. J. Smola and S. V. N. Vishwanathan. Kernel methods for missing variables. In Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, pages 325–332, 2005.

[20] S. Thrun. Extracting rules from artificial neural networks with distributed representations. In Advances in Neural Information Processing Systems, volume 8, 1995.

[21] G. G. Towell and J. W. Shavlik. Knowledge-based artificial neural networks. Artificial Intelligence, 70(1–2):119–165, 1994.

[22] R. H. Tütüncü, K. C. Toh, and M. J. Todd. Solving semidefinite-quadratic-linear programs using SDPT3. Mathematical Programming, 95(2), 2003.

[23] T. Walker, G. Kunapuli, N. Larsen, D. Page, and J. W. Shavlik. Integrating knowledge capture and supervised learning through a human-computer interface. In Proc. Fifth Intl. Conf. Knowl. Capture, 2011.

[24] A. L. Yuille and A. Rangarajan. The concave-convex procedure (CCCP). In Advances in Neural Information Processing Systems, volume 13, 2001.