
27 nips-2011-Advice Refinement in Knowledge-Based SVMs



Author: Gautam Kunapuli, Richard Maclin, Jude W. Shavlik

Abstract: Knowledge-based support vector machines (KBSVMs) incorporate advice from domain experts, which can improve generalization significantly. A major limitation that has not been fully addressed occurs when the expert advice is imperfect, which can lead to poorer models. We propose a model that extends KBSVMs and is able not only to learn from data and advice, but also to simultaneously improve the advice. The proposed approach is particularly effective for knowledge discovery in domains with few labeled examples. The proposed model contains bilinear constraints, and is solved using two iterative approaches: successive linear programming and a constrained concave-convex approach. Experimental results demonstrate that these algorithms yield useful refinements to expert advice, as well as improve the overall performance of the learning algorithm.
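The abstract notes that the refinement model has bilinear constraints, which make the joint problem non-convex; both successive linear programming and the concave-convex procedure handle this by fixing one block of variables so the remaining subproblem becomes convex, then alternating. The following toy sketch (not the paper's actual formulation; the objective and variable names are illustrative assumptions) shows that alternating pattern on a simple objective with a bilinear coupling term, where each one-variable subproblem has a closed-form minimizer:

```python
# Toy alternating-minimization sketch. Bilinear couplings (like the
# advice-refinement terms in the paper's model) are non-convex jointly,
# but convex in each variable block with the other fixed. We illustrate
# with f(a, b) = (a*b - 1)^2 + (a - 2)^2 + (b - 2)^2, whose one-variable
# subproblems are quadratics with closed-form solutions.

def f(a, b):
    """Objective with a bilinear coupling a*b inside the first term."""
    return (a * b - 1) ** 2 + (a - 2) ** 2 + (b - 2) ** 2

def alternate(a=0.0, b=0.0, iters=200):
    """Alternate exact minimization: over a with b fixed, then over b.

    Setting df/da = 2b(a*b - 1) + 2(a - 2) = 0 gives
    a = (b + 2) / (b^2 + 1), and symmetrically for b.
    Each step solves its subproblem exactly, so f never increases.
    """
    for _ in range(iters):
        a = (b + 2) / (b * b + 1)   # argmin over a, b fixed
        b = (a + 2) / (a * a + 1)   # argmin over b, a fixed
    return a, b
```

By symmetry the iteration converges to a = b = 2^(1/3), a stationary point of f; the monotone decrease of the objective under exact block minimization is the same property that drives the convergence of the successive linear programming and constrained concave-convex schemes described in the abstract.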


References

[1] K. P. Bennett and E. J. Bredensteiner. A parametric optimization method for machine learning. INFORMS Journal on Computing, 9(3):311–318, 1997.

[2] K. P. Bennett and O. L. Mangasarian. Bilinear separation of two sets in n-space. Computational Optimization and Applications, 2:207–227, 1993.

[3] M. W. Craven and J. W. Shavlik. Extracting tree-structured representations of trained networks. In Advances in Neural Information Processing Systems, volume 8, pages 24–30, 1996.

[4] A. Frank and A. Asuncion. UCI machine learning repository, 2010.

[5] G. Fung, O. L. Mangasarian, and J. W. Shavlik. Knowledge-based nonlinear kernel classifiers. In Sixteenth Annual Conference on Learning Theory, pages 102–113, 2003.

[6] G. Fung, O. L. Mangasarian, and J. W. Shavlik. Knowledge-based support vector classifiers. In Advances in Neural Information Processing Systems, volume 15, pages 521–528, 2003.

[7] G. Fung, S. Sandilya, and R. B. Rao. Rule extraction from linear support vector machines. In Proc. Eleventh ACM SIGKDD Intl. Conference on Knowledge Discovery in Data Mining, pages 32–40, 2005.

[8] M. I. Harris, K. M. Flegal, C. C. Cowie, M. S. Eberhardt, D. E. Goldstein, R. R. Little, H. M. Wiedmeyer, and D. D. Byrd-Holt. Prevalence of diabetes, impaired fasting glucose, and impaired glucose tolerance in U.S. adults. Diabetes Care, 21(4):518–524, 1998.

[9] G. Kunapuli, K. P. Bennett, A. Shabbeer, R. Maclin, and J. W. Shavlik. Online knowledge-based support vector machines. In Proc. of the European Conference on Machine Learning, pages 145–161, 2010.

[10] F. Lauer and G. Bloch. Incorporating prior knowledge in support vector machines for classification: A review. Neurocomputing, 71(7–9):1578–1594, 2008.

[11] Q. V. Le, A. J. Smola, and T. Gärtner. Simpler knowledge-based support vector machines. In Proceedings of the Twenty-Third International Conference on Machine Learning, pages 521–528, 2006.

[12] R. Maclin, E. W. Wild, J. W. Shavlik, L. Torrey, and T. Walker. Refining rules incorporated into knowledge-based support vector learners via successive linear programming. In AAAI Twenty-Second Conference on Artificial Intelligence, pages 584–589, 2007.

[13] O. L. Mangasarian, J. W. Shavlik, and E. W. Wild. Knowledge-based kernel approximation. Journal of Machine Learning Research, 5:1127–1141, 2004.

[14] O. L. Mangasarian and E. W. Wild. Nonlinear knowledge-based classification. IEEE Transactions on Neural Networks, 19(10):1826–1832, 2008.

[15] M. E. Pavkov, R. L. Hanson, W. C. Knowler, P. H. Bennett, J. Krakoff, and R. G. Nelson. Changing patterns of Type 2 diabetes incidence among Pima Indians. Diabetes Care, 30(7):1758–1763, 2007.

[16] M. Pazzani and D. Kibler. The utility of knowledge in inductive learning. Mach. Learn., 9:57–94, 1992.

[17] B. Schölkopf, P. Simard, A. Smola, and V. Vapnik. Prior knowledge in support vector kernels. In Advances in Neural Information Processing Systems, volume 10, pages 640–646, 1998.

[18] J. W. Smith, J. E. Everhart, W. C. Dickson, W. C. Knowler, and R. S. Johannes. Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proc. of the Symposium on Comp. Apps. and Medical Care, pages 261–265. IEEE Computer Society Press, 1988.

[19] A. J. Smola and S. V. N. Vishwanathan. Kernel methods for missing variables. In Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, pages 325–332, 2005.

[20] S. Thrun. Extracting rules from artificial neural networks with distributed representations. In Advances in Neural Information Processing Systems, volume 8, 1995.

[21] G. G. Towell and J. W. Shavlik. Knowledge-based artificial neural networks. Artificial Intelligence, 70(1–2):119–165, 1994.

[22] R. H. Tütüncü, K. C. Toh, and M. J. Todd. Solving semidefinite-quadratic-linear programs using SDPT3. Mathematical Programming, 95(2), 2003.

[23] T. Walker, G. Kunapuli, N. Larsen, D. Page, and J. W. Shavlik. Integrating knowledge capture and supervised learning through a human-computer interface. In Proc. Fifth Intl. Conf. Knowl. Capture, 2011.

[24] A. L. Yuille and A. Rangarajan. The concave-convex procedure (CCCP). In Advances in Neural Information Processing Systems, volume 13, 2001.