nips nips2008 nips2008-55 nips2008-55-reference knowledge-graph by maker-knowledge-mining

55 nips-2008-Cyclizing Clusters via Zeta Function of a Graph


Source: pdf

Author: Deli Zhao, Xiaoou Tang

Abstract: Detecting underlying clusters from large-scale data plays a central role in machine learning research. In this paper, we tackle the problem of clustering complex data of multiple distributions and multiple scales. To this end, we develop an algorithm named Zeta l-links (Zell) which consists of two parts: Zeta merging with a similarity graph and an initial set of small clusters derived from local l-links of samples. More specifically, we propose to structurize a cluster using cycles in the associated subgraph. A new mathematical tool, Zeta function of a graph, is introduced for the integration of all cycles, leading to a structural descriptor of a cluster in determinantal form. The popularity character of a cluster is conceptualized as the global fusion of variations of such a structural descriptor by means of the leave-one-out strategy in the cluster. Zeta merging proceeds, in the hierarchical agglomerative fashion, according to the maximum incremental popularity among all pairwise clusters. Experiments on toy data clustering, imagery pattern clustering, and image segmentation show the competitive performance of Zell. The 98.1% accuracy, in the sense of the normalized mutual information (NMI), is obtained on the FRGC face data of 16028 samples and 466 facial clusters. 1


reference text

[1] Frey, B.J. & Dueck, D. (2007) Clustering by passing messages between data points. Science 315:972-976.

[2] Ma, Y. Derksen, H. Hong, W. & Wright, J. (2007) Segmentation of multivariate mixed data via lossy data coding and compression. IEEE Trans. on Pattern Recognition and Machine Intelligence 29:1546-1562.

[3] Shi, J.B. & Malik, J. (2000) Normalized cuts and image segmentation. IEEE Trans. on Pattern Recognition and Machine Intelligence 22(8):888-905.

[4] Ng, A.Y., Jordan, M.I. & Weiss, Y. (2001) On spectral clustering: analysis and an algorithm. Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press.

[5] Newman, M.E.J. (2006) Finding community structure in networks using the eigenvectors of matrices. Physical Review E 74(3).

[6] Destexhe, A. & Contreras, D. (2006) Neuronal computations with stochastic network states. Science, 314(6):85-90.

[7] Sporns, O. Tononi, G. & Edelman, G.M. (2000) Theoretical neuroanatomy: relating anatomical and functional connectivity in graphs and cortical connection matrices. Cerebral Cortex, 10:127-141.

[8] Bagrow, J. Bollt, E. & Costa, L.F. (2007) On short cycles and their role in network structure. http://arxiv.org/abs/cond-mat/0612502.

[9] Bianconi, G. & Marsili, M. (2005) Loops of any size and Hamilton cycles in random scale-free networks. Journal of Statistical Mechanics, P06005.

[10] Savchenko, S.V. (1993) The zeta-function and Gibbs measures. Russ. Math. Surv. 48(1):189-190.

[11] Li, S. Ahmed, S. Klimeck, G. & Darve, E. (2008) Computing entries of the inverse of a sparse matrix using the FIND algorithm. Journal of Computational Physics 227:9408-9427.

[12] Strehl, A. & Ghosh, J. (2002) Cluster ensembles — a knowledge reuse framework for combining multiple partitions. Journal of Machine Learning Research 3:583617.