nips nips2010 nips2010-261 nips2010-261-reference knowledge-graph by maker-knowledge-mining

261 nips-2010-Supervised Clustering


Source: pdf

Author: Pranjal Awasthi, Reza B. Zadeh

Abstract: Despite the ubiquity of clustering as a tool in unsupervised learning, there is not yet a consensus on a formal theory, and the vast majority of work in this direction has focused on unsupervised clustering. We study a recently proposed framework for supervised clustering where there is access to a teacher. We give an improved generic algorithm to cluster any concept class in that model. Our algorithm is query-efficient in the sense that it involves only a small amount of interaction with the teacher. We also present and study two natural generalizations of the model. The model assumes that the teacher response to the algorithm is perfect. We eliminate this limitation by proposing a noisy model and give an algorithm for clustering the class of intervals in this noisy model. We also propose a dynamic model where the teacher sees a random subset of the points. Finally, for datasets satisfying a spectrum of weak to strong properties, we give query bounds, and show that a class of clustering functions containing Single-Linkage will find the target clustering under the strongest property.


reference text

[ABD09] M. Ackerman and S. Ben-David. Clusterability: A theoretical study. Proceedings of AISTATS-09, JMLR: W&CP;, 5:1–8, 2009. [AL10] Ben-David S. Ackerman, M. and D. Loker. Characterization of Linkage-based Clustering. COLT 2010, 2010. [Ang98] [BB08] D. Angluin. Queries and concept learning. Machine Learning, 2:319–342, 1998. Maria-Florina Balcan and Avrim Blum. Clustering with interactive feedback. In ALT, 2008. M.-F. Balcan, A. Blum, and S. Vempala. A discriminative framework for clustering via similarity functions. In Proceedings of the 40th ACM Symposium on Theory of Computing, 2008. Avrim Blum. Thoughts on clustering. In NIPS Workshop on Clustering Theory, 2009. [BBV08] [Blu09] [CGTS99] M. Charikar, S. Guha, E. Tardos, and D. B. Shmoy. A constant-factor approximation algorithm for the k-median problem. In ACM Symposium on Theory of Computing, 1999. [Das99] S. Dasgupta. Learning mixtures of gaussians. In Proceedings of the 40th Annual Symposium on Foundations of Computer Science, 1999. [GvLW09] I. Guyon, U. von Luxburg, and R.C. Williamson. Clustering: Science or Art? In NIPS Workshop on Clustering Theory, 2009. [JS71] [Kle03] N. Jardine and R. Sibson. Mathematical taxonomy. New York, 1971. J. Kleinberg. An impossibility theorem for clustering. In Advances in Neural Information Processing Systems 15: Proceedings of the 2002 Conference, page 463. The MIT Press, 2003. [KVV00] R. Kannan, S. Vempala, and A. Veta. On clusterings-good, bad and spectral. In FOCS ’00: Proceedings of the 41st Annual Symposium on Foundations of Computer Science, 2000. [Lit87] Nick Littlestone. Learning quickly when irrelevant attributes abound: A new linearthreshold algorithm. Machine Learning, 2(4), 1987. [Vap98] [ZBD] V. N. Vapnik. Statistical Learning Theory. John Wiley and Sons Inc., 1998. Reza Bosagh Zadeh and Shai Ben-David. Axiomatic Characterizations of SingleLinkage. In In Submission. [ZBD09] Reza Bosagh Zadeh and Shai Ben-David. A Uniqueness Theorem for Clustering. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, 2009. 9