
Σ-Optimality for Active Learning on Gaussian Random Fields


Source: pdf

Author: Yifei Ma, Roman Garnett, Jeff Schneider

Abstract: A common classifier for unlabeled nodes on undirected graphs uses label propagation from the labeled nodes, equivalent to the harmonic predictor on Gaussian random fields (GRFs). For active learning on GRFs, the commonly used V-optimality criterion queries nodes that reduce the L2 (regression) loss. V-optimality satisfies a submodularity property, so greedy selection yields a (1 − 1/e)-approximation of the globally optimal reduction. However, L2 loss may not characterize the true nature of 0/1 loss in classification problems and thus may not be the best choice for active learning. We consider a new criterion we call Σ-optimality, which queries the node that minimizes the sum of the elements in the predictive covariance. Σ-optimality directly optimizes the risk of the surveying problem, which is to determine the proportion of nodes belonging to one class. In this paper we extend submodularity guarantees from V-optimality to Σ-optimality using properties specific to GRFs. We further show that GRFs satisfy the suppressor-free condition in addition to the conditional independence inherited from Markov random fields. We test Σ-optimality on real-world graphs with both synthetic and real data and show that it outperforms V-optimality and other related methods on classification.
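
A minimal sketch of the greedy selection rule the abstract describes, assuming a GRF prior whose covariance on the candidate nodes is C = (L + delta*I)^{-1} for the graph Laplacian L and a small regularizer delta (the function name, the toy graph, and delta are illustrative choices, not taken from the paper). Conditioning on an observation at node v updates the covariance by the rank-one formula C' = C - C[:,v] C[v,:] / C[v,v], so the one-step Σ-optimality gain of querying v is (sum_i C[i,v])^2 / C[v,v], whereas the corresponding V-optimality gain is sum_i C[i,v]^2 / C[v,v]:

    import numpy as np

    def sigma_optimal_greedy(W, n_queries, delta=1e-2):
        # Greedy Sigma-optimality on a graph with symmetric adjacency W.
        # Prior covariance of the GRF: C = (L + delta*I)^{-1}, L = D - W.
        n = W.shape[0]
        L = np.diag(W.sum(axis=1)) - W            # combinatorial Laplacian
        C = np.linalg.inv(L + delta * np.eye(n))  # prior covariance
        selected = []
        for _ in range(n_queries):
            col_sums = C.sum(axis=0)              # 1^T C for each candidate column
            gains = col_sums ** 2 / np.diag(C)    # Sigma-optimality reduction
            gains[selected] = -np.inf             # never re-query a node
            v = int(np.argmax(gains))
            selected.append(v)
            # condition the field on an observation at v (rank-one update)
            C = C - np.outer(C[:, v], C[v, :]) / C[v, v]
        return selected

    # toy usage on a small chain graph (illustrative)
    W = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    print(sigma_optimal_greedy(W, n_queries=2))

Swapping the gain line for np.sum(C ** 2, axis=0) / np.diag(C) recovers greedy V-optimality, which makes a side-by-side comparison of the two criteria easy to run on toy graphs.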


References

Das, Abhimanyu and Kempe, David. Algorithms for subset selection in linear regression. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing, pp. 45–54. ACM, 2008.

Friedland, S. and Gaubert, S. Submodular spectral functions of principal submatrices of a Hermitian matrix, extensions and applications. Linear Algebra and its Applications, 2011.

Garnett, Roman, Krishnamurthy, Yamuna, Xiong, Xuehan, Schneider, Jeff, and Mann, Richard. Bayesian optimal active search and surveying. In ICML, 2012.

Ji, Ming and Han, Jiawei. A variance minimization criterion to active learning on graphs. In AISTATS, 2012.

Krause, Andreas, Singh, Ajit, and Guestrin, Carlos. Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies. Journal of Machine Learning Research (JMLR), 9:235–284, February 2008.

Martin, Shawn, Brown, W. Michael, Klavans, Richard, and Boyack, Kevin W. OpenOrd: an open-source toolbox for large graph layout. In IS&T/SPIE Electronic Imaging, p. 786806. International Society for Optics and Photonics, 2011.

Nemhauser, George L., Wolsey, Laurence A., and Fisher, Marshall L. An analysis of approximations for maximizing submodular set functions I. Mathematical Programming, 14(1):265–294, 1978.

Rasmussen, Carl Edward and Williams, Christopher K. I. Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA, 2006.

Settles, Burr. Active learning literature survey. Technical report, University of Wisconsin–Madison, 2010.

Walker, David A. Suppressor variable(s) importance within a regression model: an example of salary compression from career services. Journal of College Student Development, 44(1):127–133, 2003.

Wu, Xiao-Ming, Li, Zhenguo, So, Anthony Man-Cho, Wright, John, and Chang, Shih-Fu. Learning with partially absorbing random walks. In Advances in Neural Information Processing Systems 25, pp. 3086–3094, 2012.

Zhu, Xiaojin and Ghahramani, Zoubin. Learning from labeled and unlabeled data with label propagation. Technical Report CMU-CALD-02-107, Carnegie Mellon University, 2002.

Zhu, Xiaojin, Lafferty, John, and Ghahramani, Zoubin. Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions. In ICML 2003 Workshop on The Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining, pp. 58–65, 2003.