nips nips2012 nips2012-337 nips2012-337-reference knowledge-graph by maker-knowledge-mining

337 nips-2012-The Lovász ϑ function, SVMs and finding large dense subgraphs


Source: pdf

Author: Vinay Jethava, Anders Martinsson, Chiranjib Bhattacharyya, Devdatt Dubhashi

Abstract: The Lov´ sz ϑ function of a graph, a fundamental tool in combinatorial optimizaa tion and approximation algorithms, is computed by solving a SDP. In this paper we establish that the Lov´ sz ϑ function is equivalent to a kernel learning problem a related to one class SVM. This interesting connection opens up many opportunities bridging graph theoretic algorithms and machine learning. We show that there exist graphs, which we call SVM − ϑ graphs, on which the Lov´ sz ϑ function a can be approximated well by a one-class SVM. This leads to novel use of SVM techniques for solving algorithmic problems in large graphs e.g. identifying a √ 1 planted clique of size Θ( n) in a random graph G(n, 2 ). A classic approach for this problem involves computing the ϑ function, however it is not scalable due to SDP computation. We show that the random graph with a planted clique is an example of SVM − ϑ graph. As a consequence a SVM based approach easily identifies the clique in large graphs and is competitive with the state-of-the-art. We introduce the notion of common orthogonal labelling and show that it can be computed by solving a Multiple Kernel learning problem. It is further shown that such a labelling is extremely useful in identifying a large common dense subgraph in multiple graphs, which is known to be a computationally difficult problem. The proposed algorithm achieves an order of magnitude scalability compared to state of the art methods. 1


reference text

[1] Louigi Addario-berry, Nicolas Broutin, Gbor Lugosi, and Luc Devroye. Combinatorial testing problems. Annals of Statistics, 38:3063–3092, 2010.

[2] Noga Alon, Michael Krivelevich, and Benny Sudakov. Finding a large hidden clique in a random graph. Random Structures and Algorithms, pages 457–466, 1998.

[3] B. Bollob´ s. Modern graph theory, volume 184. Springer Verlag, 1998. a

[4] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, New York, NY, USA, 2004.

[5] T.-H. Hubert Chan, Kevin L. Chang, and Rajiv Raman. An sdp primal-dual algorithm for approximating the lov´ sz-theta function. In ISIT, 2009. a

[6] Amin Coja-Oghlan and Anusch Taraz. Exact and approximative algorithms for coloring g(n, p). Random Struct. Algorithms, 24(3):259–278, 2004.

[7] U. Feige and D. Ron. Finding hidden cliques in linear time. In AofA10, 2010.

[8] Uriel Feige and Robert Krauthgamer. Finding and certifying a large hidden clique in a semirandom graph. Random Struct. Algorithms, 16:195–208, March 2000.

[9] Z. F¨ redi and J. Koml´ s. The eigenvalues of random symmetric matrices. Combinatorica, u o 1:233–241, 1981.

[10] Michel X. Goemans. Semidefinite programming in combinatorial optimization. Math. Program., 79:143–161, 1997.

[11] J. H˚ stad. Clique is hard to approximate within n1−ε . Acta Mathematica, 182(1):105–142, a 1999.

[12] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, 1990.

[13] Don R. Hush, Patrick Kelly, Clint Scovel, and Ingo Steinwart. Qp algorithms with guaranteed accuracy and run time for support vector machines. Journal of Machine Learning Research, 7:733–769, 2006.

[14] D. Jiang and J. Pei. Mining frequent cross-graph quasi-cliques. ACM Transactions on Knowledge Discovery from Data (TKDD), 2(4):16, 2009.

[15] D.S. Johnson and M.A. Trick. Cliques, coloring, and satisfiability: second DIMACS implementation challenge, October 11-13, 1993, volume 26. Amer Mathematical Society, 1996.

[16] Donald Knuth. The sandwich theorem. Electronic Journal of Combinatorics, 1(A1), 1994.

[17] Michael Krivelevich and Benny Sudakov. Pseudo-random graphs. In More Sets, Graphs and Numbers, volume 15 of Bolyai Society Mathematical Studies, pages 199–262. Springer Berlin Heidelberg, 2006.

[18] V.E. Lee, N. Ruan, R. Jin, and C. Aggarwal. A survey of algorithms for dense subgraph discovery. Managing and Mining Graph Data, pages 303–336, 2010.

[19] L. Lovasz. On the Shannon capacity of a graph. Information Theory, IEEE Transactions on, 25(1):1–7, 1979.

[20] C.J. Luz and A. Schrijver. A convex quadratic characterization of the lov´ sz theta number. a SIAM Journal on Discrete Mathematics, 19(2):382–387, 2006.

[21] Claire Mathieu and Warren Schudy. Correlation clustering with noisy input. In Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’10, pages 712–728, Philadelphia, PA, USA, 2010. Society for Industrial and Applied Mathematics.

[22] P. Pardalos and S. Rebennack. Computational challenges with cliques, quasi-cliques and clique partitions in graphs. Experimental Algorithms, pages 13–22, 2010.

[23] V. Spirin and L.A. Mirny. Protein complexes and functional modules in molecular networks. Proceedings of the National Academy of Sciences, 100(21):12123, 2003.

[24] Dekel Yael, Gurel-Gurevich Ori, and Peres Yuval. Finding hidden cliques in linear time with high probability. In ANALCO11, 2011. 9