nips nips2008 nips2008-48 nips2008-48-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Nikos Komodakis, Nikos Paragios, Georgios Tziritas
Abstract: A novel center-based clustering algorithm is proposed in this paper. We first formulate clustering as an NP-hard linear integer program and we then use linear programming and the duality theory to derive the solution of this optimization problem. This leads to an efficient and very general algorithm, which works in the dual domain, and can cluster data based on an arbitrary set of distances. Despite its generality, it is independent of initialization (unlike EM-like methods such as K-means), has guaranteed convergence, can automatically determine the number of clusters, and can also provide online optimality bounds about the quality of the estimated clustering solutions. To deal with the most critical issue in a centerbased clustering algorithm (selection of cluster centers), we also introduce the notion of stability of a cluster center, which is a well defined LP-based quantity that plays a key role to our algorithm’s success. Furthermore, we also introduce, what we call, the margins (another key ingredient in our algorithm), which can be roughly thought of as dual counterparts to stabilities and allow us to obtain computationally efficient approximations to the latter. Promising experimental results demonstrate the potentials of our method.
[1] A. Ng, M. Jordan, and Y. Weiss, “On spectral clustering: Analysis and an algorithm,” in NIPS, 2001.
[2] D. Verma and M. Meila, “A comparison of spectral clustering algorithms,” Tech. Rep., 2001.
[3] A. Banerjee, S. Merugu, I. S. Dhillon, and J. Ghosh, “Clustering with bregman divergences,” J. Mach. Learn. Res., vol. 6, pp. 1705–1749, 2005.
[4] B. Fischer, V. Roth, and J. Buhmann, “Clustering with the connectivity kernel,” in NIPS, 2004.
[5] B. J. Frey and D. Dueck, “Clustering by passing messages between data points,” Science, vol. 315, 2007. ´
[6] M. Charikar, S. Guha, E. Tardos, and D. B. Shmoys, “A constant-factor approximation algorithm for the k-median problem,” J. Comput. Syst. Sci., vol. 65, no. 1, pp. 129–149, 2002.
[7] N. Komodakis, N. Paragios, and G. Tziritas, “Clustering via LP-based Stabilities,” Tech. Report, 2009.
[8] D. Lashkari and P. Golland, “Convex clustering with exemplar-based models,” in NIPS, 2008.
[9] R. Tron and R. Vidal, “A benchmark for the comparison of 3-d motion segmentation algorithms,” in CVPR, 2007.
[10] M. Leone, Sumedha, and M. Weigt, “Clustering by soft-constraint affinity propagation: applications to gene-expression data,” Bioinformatics, vol. 23, no. 20, pp. 2708–2715, 2007. 8