nips nips2009 nips2009-61 nips2009-61-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Novi Quadrianto, John Lim, Dale Schuurmans, Tibério S. Caetano
Abstract: We develop a convex relaxation of maximum a posteriori estimation of a mixture of regression models. Although our relaxation involves a semidefinite matrix variable, we reformulate the problem to eliminate the need for general semidefinite programming. In particular, we provide two reformulations that admit fast algorithms. The first is a max-min spectral reformulation exploiting quasi-Newton descent. The second is a min-min reformulation consisting of fast alternating steps of closed-form updates. We evaluate the methods against Expectation-Maximization in a real problem of motion segmentation from video data. 1
[1] S. Gaffney and P. Smyth. Trajectory clustering with mixtures of regression models. In ACM SIGKDD, volume 62, pages 63–72, 1999.
[2] S. Finney, L. Kaelbling, and T. Lozano-Perez. Predicting partial paths from planning problem parameters. In Proceedings of Robotics: Science and Systems, Atlanta, GA, USA, June 2007.
[3] A. J. Storkey and M. Sugiyama. Mixture regression for covariate shift. In Sch¨ lkopf, editor, o Advances in Neural Information Processing Systems 19, pages 1337–1344, 2007.
[4] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the em algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1–38, 1977.
[5] T. De Bie, N. Cristianini, P. Bennett, and E. Parrado-hern¨ ndez. Fast sdp relaxations of graph a cut clustering, transduction, and other combinatorial problems. JMLR, 7:1409–1436, 2006.
[6] Y. Guo and D. Schuurmans. Convex relaxations for latent variable training. In Platt et al., editor, Advances in Neural Information Processing Systems 20, pages 601–608, 2008.
[7] S. M. Goldfeld and R.E. Quandt. Nonlinear methods in econometrics. Amsterdam: NorthHolland Publishing Co., 1972.
[8] D. W. Hosmer. Maximum likelihood estimates of the parameters of a mixture of two regression lines. Communications in Statistics, 3(10):995–1006, 1974.
[9] H. Sp¨ th. Algorithm 39: clusterwise linear regression. Computing, 22:367–373, 1979. a
[10] W.S. DeSarbo and W.L. Cron. A maximum likelihood methodology for clusterwise linear regression. Journal of Classification, 5(1):249–282, 1988.
[11] P.N. Jones and G.J. McLachlan. Fitting finite mixtures models in a regression context. Austral. J. Statistics, 34(2):233–240, 1992.
[12] S. Gaffney and P. Smyth. Curve clustering with random effects regression mixtures. In AISTATS, 2003.
[13] M.I. Jordan and R.A. Jacobs. Hierarchical mixtures of experts and the em algorithm. Neural computation, 6:181–214, 1994.
[14] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[15] M. Overton and R. Womersley. Optimality conditions and duality theory for minimizing sums of the largest eigenvalues of symmetric matrices. Mathematical Programming, 62:321–357, 1993.
[16] J. Yu, S.V.N. Vishwanathan, S. G¨ nter, and N. Schraudolph. A quasi-Newton approach to u nonsmooth convex optimization. In ICML, 2008.
[17] A. Argyriou, T. Evgeniou, and M. Pontil. Convex multi-task feature learning. Machine Learning, 73:243–272, 2008.
[18] J. Chen, L. Tang, J. Liu, and J. Ye. A convex formulation for learning shared structures from multiple tasks. In ICML, 2009.
[19] N.Vasconcelos and A. Lippman. Empirical bayesian em-based motion segmentation. In CVPR, 1997.
[20] P. Torr. Geometric motion segmentation and model selection. Philosophical Trans. of the Royal Society of London, 356(1740):1321–1340, 1998.
[21] R. Vidal, Y. Ma, S. Soatto, and S. Sastry. Two-view multibody structure from motion. IJCV, 68(1):7–25, 2006.
[22] H. Li. Two-view motion segmentation from linear programming relaxation. In CVPR, 2007.
[23] http://www.vision.jhu.edu/data/hopkins155/. 9