nips nips2007 nips2007-66 nips2007-66-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Tony Jebara, Yingbo Song, Kapil Thadani
Abstract: A method is proposed for semiparametric estimation in which parametric and nonparametric criteria are both exploited for density estimation and unsupervised learning. This is accomplished by making sampling assumptions on a dataset that smoothly interpolate between the extreme of independently distributed (or id) sample data, as in nonparametric kernel density estimators, and the extreme of independent identically distributed (or iid) sample data. This article makes independent similarly distributed (or isd) sampling assumptions and interpolates between these two extremes using a scalar parameter. The parameter controls a Bhattacharyya affinity penalty between pairs of distributions on samples. Surprisingly, the isd method maintains certain consistency and unimodality properties akin to those of maximum likelihood estimation. The proposed isd scheme is an alternative for handling nonstationarity in data without making drastic hidden-variable assumptions, which often make estimation difficult and laden with local optima. Experiments in density estimation on a variety of datasets confirm the value of isd over iid estimation, id estimation, and mixture modeling.
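The Bhattacharyya affinity mentioned in the abstract is the standard quantity B(p, q) = ∫ sqrt(p(x) q(x)) dx, which equals 1 when p = q and shrinks toward 0 as the two distributions separate. The sketch below illustrates only that affinity for a pair of univariate Gaussians, comparing the known closed form with a simple numerical integral. It is not the isd estimator itself: how the penalty enters the objective is not stated in this text, and the function names, the Gaussian choice, and the integration grid are all assumptions made for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def bhattacharyya_affinity_closed_form(mean1, std1, mean2, std2):
    """Closed-form affinity between two univariate Gaussians (hypothetical helper)."""
    var_sum = std1 ** 2 + std2 ** 2
    scale = math.sqrt(2.0 * std1 * std2 / var_sum)
    return scale * math.exp(-((mean1 - mean2) ** 2) / (4.0 * var_sum))


def bhattacharyya_affinity_numeric(mean1, std1, mean2, std2,
                                   lo=-30.0, hi=30.0, steps=60000):
    """Trapezoid-rule estimate of the integral of sqrt(p * q) over [lo, hi].

    The interval and step count are illustrative assumptions; they must
    cover the bulk of both densities for the estimate to be accurate.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # endpoints get half weight
        p = gaussian_pdf(x, mean1, std1)
        q = gaussian_pdf(x, mean2, std2)
        total += weight * math.sqrt(p * q)
    return total * h


if __name__ == "__main__":
    # Identical distributions give the maximal affinity of 1.
    print(bhattacharyya_affinity_closed_form(0.0, 1.0, 0.0, 1.0))
    # The affinity drops as the means move apart.
    for gap in (0.5, 1.0, 2.0, 4.0):
        print(gap, bhattacharyya_affinity_closed_form(0.0, 1.0, gap, 1.0))
```

Read this way, a penalty that rewards high pairwise affinity pulls the per-sample distributions toward one another, which matches the abstract's description of a scalar parameter moving the estimate from the id extreme toward the iid extreme.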
References:
Bengio, Y., Larochelle, H., & Vincent, P. (2005). Non-local manifold Parzen windows. Neural Information Processing Systems.
Bhattacharyya, A. (1943). On a measure of divergence between two statistical populations defined by their probability distributions. Bull. Calcutta Math. Soc.
Collins, M., Dasgupta, S., & Schapire, R. (2002). A generalization of principal components analysis to the exponential family. Neural Information Processing Systems.
Devroye, L., & Gyorfi, L. (1985). Nonparametric density estimation: The L1 view. John Wiley.
Efron, B., & Tibshirani, R. (1996). Using specially designed exponential families for density estimation. The Annals of Statistics, 24, 2431–2461.
Hjort, N., & Glad, I. (1995). Nonparametric density estimation with a parametric start. The Annals of Statistics, 23, 882–904.
Jebara, T., Kondor, R., & Howard, A. (2004). Probability product kernels. Journal of Machine Learning Research, 5, 819–844.
Naito, K. (2004). Semiparametric density estimation by local L2-fitting. The Annals of Statistics, 32, 1162–1192.
Olkin, I., & Spiegelman, C. (1987). A semiparametric approach to density estimation. Journal of the American Statistical Association, 82, 858–865.
Prekopa, A. (1973). On logarithmic concave measures and functions. Acta Sci. Math., 34, 335–343.
Rasmussen, C. (1999). The infinite Gaussian mixture model. Neural Information Processing Systems.
Silverman, B. (1986). Density estimation for statistics and data analysis. Chapman and Hall: London.
Teh, Y., Jordan, M., Beal, M., & Blei, D. (2004). Hierarchical Dirichlet processes. Neural Information Processing Systems.
Topsoe, F. (1999). Some inequalities for information divergence and related measures of discrimination. Journal of Inequalities in Pure and Applied Mathematics, 2.
Wand, M., & Jones, M. (1995). Kernel smoothing. CRC Press.

Work supported in part by NSF Award IIS-0347499 and ONR Award N000140710507.