nips nips2011 nips2011-150 nips2011-150-reference knowledge-graph by maker-knowledge-mining

150 nips-2011-Learning a Distance Metric from a Network

Source: pdf

Author: Blake Shaw, Bert Huang, Tony Jebara

Abstract: Many real-world networks are described by both connectivity information and features for every node. To better model and understand these networks, we present structure preserving metric learning (SPML), an algorithm for learning a Mahalanobis distance metric from a network such that the learned distances are tied to the inherent connectivity structure of the network. Like the graph embedding algorithm structure preserving embedding, SPML learns a metric which is structure preserving, meaning a connectivity algorithm such as k-nearest neighbors will yield the correct connectivity when applied using the distances from the learned metric. We show a variety of synthetic and real-world experiments where SPML predicts link patterns from node features more accurately than standard techniques. We further demonstrate a method for optimizing SPML based on stochastic gradient descent which removes the running-time dependency on the size of the network and allows the method to easily scale to networks of thousands of nodes and millions of edges. 1

reference text

[1] E. Airoldi, D. Blei, S. Fienberg, and E. Xing. Mixed membership stochastic blockmodels. JMLR, 9:1981– 2014, 2008.

[2] J. Chang and D. Blei. Hierarchical relational models for document networks. Annals of Applied Statistics, 4:124–150, 2010.

[3] G. Chechik, V. Sharma, U. Shalit, and S. Bengio. Large scale online learning of image similarity through ranking. J. Mach. Learn. Res., 11:1109–1135, March 2010.

[4] J. Chen, W. Geyer, C. Dugan, M. Muller, and I. Guy. Make new friends, but keep the old: recommending people on social networking sites. In CHI, pages 201–210. ACM, 2009.

[5] S. Dasgupta and A. Gupta. An elementary proof of a theorem of Johnson and Lindenstrauss. Random Struct. Algorithms, 22:60–65, January 2003.

[6] C. Fremuth-Paeger and D. Jungnickel. Balanced network ﬂows, a unifying framework for design and analysis of matching algorithms. Networks, 33(1):1–28, 1999.

[7] B. Huang and T. Jebara. Loopy belief propagation for bipartite maximum weight b-matching. In Proceedings of the Eleventh International Conference on Artiﬁcial Intelligence and Statistics, volume 2 of JMLR: W&CP;, pages 195–202, 2007.

[8] B. Huang and T. Jebara. Fast b-matching via sufﬁcient selection belief propagation. In Proceedings of the Fourteenth International Conference on Artiﬁcial Intelligence and Statistics, 2011.

[9] T. Joachims. Training linear SVMs in linear time. In ACM SIG International Conference On Knowledge Discovery and Data Mining (KDD), pages 217 – 226, 2006.

[10] T. Joachims, T. Finley, and C. Yu. Cutting-plane training of structural SVMs. Machine Learning, 77(1):27–59, 2009.

[11] W. Johnson and J. Lindenstrauss. Extensions of Lipschitz maps into a Hilbert space. Contemporary Mathematics, (26):189–206, 1984.

[12] J. Leskovec and E. Horvitz. Planetary-scale views on a large instant-messaging network. ACM WWW, 2008.

[13] J. Leskovec, J Kleinberg, and C. Faloutsos. Graphs over time: densiﬁcation laws, shrinking diameters and possible explanations. In Proc. of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, 2005.

[14] M. Middendorf, E. Ziv, C. Adams, J. Hom, R. Koytcheff, C. Levovitz, and G. Woods. Discriminative topological features reveal biological network mechanisms. BMC Bioinformatics, 5:1471–2105, 2004.

[15] G. Namata, H. Sharara, and L. Getoor. A survey of link mining tasks for analyzing noisy and incomplete networks. In Link Mining: Models, Algorithms, and Applications. Springer, 2010.

[16] M. Newman. The structure and function of complex networks. SIAM REVIEW, 45:167–256, 2003.

[17] M. Newman. Analysis of weighted networks. Phys. Rev. E, 70(5):056131, Nov 2004.

[18] J. Rennie and N. Srebro. Fast maximum margin matrix factorization for collaborative prediction. In Proceedings of the Twenty-Second International Conference, volume 119 of ACM International Conference Proceeding Series, pages 713–719. ACM, 2005.

[19] S. Shalev-Shwartz, Y. Singer, and N. Srebro. Pegasos: Primal estimated sub-gradient solver for SVM. In Proceedings of the 24th International Conference on Machine Learning, ICML ’07, pages 807–814, New York, NY, USA, 2007. ACM.

[20] S. Shalev-Shwartz, Y. Singer, N. Srebro, and A. Cotter. Pegasos: Primal estimated sub-gradient solver for SVM. Mathematical Programming, To appear.

[21] B. Shaw and T. Jebara. Structure preserving embedding. In Proc. of the 26th International Conference on Machine Learning, 2009.

[22] A. Traud, P. Mucha, and M. Porter. Social structure of Facebook networks. CoRR, abs/1102.2166, 2011.

[23] K. Weinberger and L. Saul. Distance metric learning for large margin nearest neighbor classiﬁcation. Journal of Machine Learning Research, 10:207–244, 2009.

[24] E. Xing, A. Ng, M. Jordan, and S. Russell. Distance metric learning with application to clustering with side-information. In S. Becker, S. Thrun, and K. Obermayer, editors, NIPS, pages 505–512. MIT Press, 2002.

[25] J. Xu and Y. Li. Discovering disease-genes by topological features in human protein-protein interaction network. Bioinformatics, 22(22):2800–2805, 2006.

[26] T. Yang, R. Jin, Y. Chi, and S. Zhu. Combining link and content for community detection: a discriminative approach. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’09, pages 927–936, New York, NY, USA, 2009. ACM. 9