nips nips2013 nips2013-158 nips2013-158-reference knowledge-graph by maker-knowledge-mining

158 nips-2013-Learning Multiple Models via Regularized Weighting


Source: pdf

Author: Daniel Vainsencher, Shie Mannor, Huan Xu

Abstract: We consider the general problem of Multiple Model Learning (MML) from data, from the statistical and algorithmic perspectives; this problem includes clustering, multiple regression and subspace clustering as special cases. A common approach to solving new MML problems is to generalize Lloyd’s algorithm for clustering (or Expectation-Maximization for soft clustering). However this approach is unfortunately sensitive to outliers and large noise: a single exceptional point may take over one of the models. We propose a different general formulation that seeks for each model a distribution over data points; the weights are regularized to be sufficiently spread out. This enhances robustness by making assumptions on class balance. We further provide generalization bounds and explain how the new iterations may be computed efficiently. We demonstrate the robustness benefits of our approach with some experimental results and prove for the important case of clustering that our approach has a non-trivial breakdown point, i.e., is guaranteed to be robust to a fixed percentage of adversarial unbounded outliers. 1


reference text

[1] S. Lloyd. Least squares quantization in PCM. Information Theory, IEEE Transactions on, 28(2):129–137, 1982.

[2] P. J. Huber. Robust Statistics. John Wiley & Sons, New York, 1981.

[3] J.A. Hartigan and M.A. Wong. Algorithm AS 136: A k-means clustering algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics), 28(1):100–108, 1979.

[4] R. Ostrovsky, Y. Rabani, L.J. Schulman, and C. Swamy. The effectiveness of Lloyd-type methods for the k-means problem. In Foundations of Computer Science, 2006. FOCS’06. 47th Annual IEEE Symposium on, pages 165–176. IEEE, 2006.

[5] P. Hansen, E. Ngai, B.K. Cheung, and N. Mladenovic. Analysis of global k-means, an incremental heuristic for minimum sum-of-squares clustering. Journal of classification, 22(2):287– 310, 2005.

[6] G. J. McLachlan and K. E. Basford. Mixture Models: Inference and Applications to Clustering. Marcel Dekker, New York, 1998.

[7] Mikhail Belkin and Kaushik Sinha. Polynomial learning of distribution families. In FOCS 2010: Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science, pages 103–112. IEEE Computer Society, 2010.

[8] G. Chen and M. Maggioni. Multiscale geometric and spectral analysis of plane arrangements. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 2825– 2832. IEEE, 2011.

[9] Yaoliang Yu and Dale Schuurmans. Rank/norm regularization with closed-form solutions: Application to subspace clustering. In Fabio Gagliardi Cozman and Avi Pfeffer, editors, UAI, pages 778–785. AUAI Press, 2011.

[10] M. Soltanolkotabi and E.J. Cand` s. A geometric analysis of subspace clustering with outliers. e Arxiv preprint arXiv:1112.4258, 2011.

[11] A. Maurer and M. Pontil. k-dimensional coding schemes in hilbert spaces. Information Theory, IEEE Transactions on, 56(11):5839–5846, 2010.

[12] A.J. Smola, S. Mika, B. Sch¨ lkopf, and R.C. Williamson. Regularized principal manifolds. o The Journal of Machine Learning Research, 1:179–209, 2001.

[13] A. Dempster, N. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B, 39(1):1–38, 1977.

[14] R.N. Dav´ and R. Krishnapuram. Robust clustering methods: a unified view. Fuzzy Systems, e IEEE Transactions on, 5(2):270–293, 1997.

[15] Huan Xu, Constantine Caramanis, and Shie Mannor. Outlier-robust PCA: The highdimensional case. IEEE transactions on information theory, 59(1):546–572, 2013.

[16] B. Zhang. Regression clustering. In Data Mining, 2003. ICDM 2003. Third IEEE International Conference on, pages 451–458. IEEE, 2003.

[17] P. J. Rousseeuw and A. M. Leroy. Robust Regression and Outlier Detection. John Wiley & Sons, New York, 1987.

[18] R. A. Maronna, R. D. Martin, and V. J. Yohai. Robust Statistics: Theory and Methods. John Wiley & Sons, New York, 2006.

[19] Olivier Bousquet and Andr´ Elisseeff. Stability and generalization. The Journal of Machine e Learning Research, 2:499–526, 2002.

[20] M. Mahajan, P. Nimbhorkar, and K. Varadarajan. The planar k-means problem is np-hard. WALCOM: Algorithms and Computation, pages 274–285, 2009.

[21] A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1):183–202, 2009.

[22] Roberto Tron and Ren´ Vidal. A benchmark for the comparison of 3-d motion segmentation e algorithms. In CVPR. IEEE Computer Society, 2007.

[23] J. Duchi, S. Shalev-Shwartz, Y. Singer, and T. Chandra. Efficient projections onto the l1ball for learning in high dimensions. In Proceedings of the 25th international conference on Machine learning, pages 272–279, 2008. 9