
jmlr-2013-16: Bayesian Nonparametric Hidden Semi-Markov Models


Source: pdf

Authors: Matthew J. Johnson, Alan S. Willsky

Abstract: There is much interest in the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM) as a natural Bayesian nonparametric extension of the ubiquitous Hidden Markov Model for learning from sequential and time-series data. However, in many settings the HDP-HMM’s strict Markovian constraints are undesirable, particularly if we wish to learn or encode non-geometric state durations. We can extend the HDP-HMM to capture such structure by drawing upon explicit-duration semi-Markov modeling, which has been developed mainly in the parametric non-Bayesian setting, to allow construction of highly interpretable models that admit natural prior information on state durations. In this paper we introduce the explicit-duration Hierarchical Dirichlet Process Hidden semi-Markov Model (HDP-HSMM) and develop sampling algorithms for efficient posterior inference. These algorithms also yield new sampling-based inference methods for the finite Bayesian HSMM. Our modular Gibbs sampling methods can be embedded in samplers for larger hierarchical Bayesian models, adding semi-Markov chain modeling as another tool in the Bayesian inference toolbox. We demonstrate the utility of the HDP-HSMM and our inference methods in both synthetic and real-data experiments.

Keywords: Bayesian nonparametrics, time series, semi-Markov, sampling algorithms, Hierarchical Dirichlet Process Hidden Markov Model
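Since the abstract describes replacing the HMM's implicit geometric state durations with explicit, state-specific duration distributions, the following is a minimal sketch (in Python/NumPy, not the authors' implementation) of the explicit-duration semi-Markov generative process that the HDP-HSMM builds on; the negative-binomial duration and Gaussian emission families, and the two-state parameter values, are illustrative assumptions.

import numpy as np

def sample_hsmm(pi0, A, dur_params, obs_params, T, rng=None):
    """Sample a length-T observation sequence from an explicit-duration HSMM.

    pi0        : initial state distribution
    A          : transition matrix with zero diagonal (self-transitions are
                 replaced by explicit segment durations)
    dur_params : per-state (r, p) negative-binomial duration parameters
    obs_params : per-state (mean, std) Gaussian emission parameters
    """
    rng = np.random.default_rng(0) if rng is None else rng
    states, obs = [], []
    z = rng.choice(len(pi0), p=pi0)
    while len(obs) < T:
        r, p = dur_params[z]
        d = rng.negative_binomial(r, p) + 1        # segment duration (>= 1)
        mu, sigma = obs_params[z]
        obs.extend(rng.normal(mu, sigma, size=d))  # emissions for this segment
        states.extend([z] * d)
        z = rng.choice(len(A), p=A[z])             # move to a different state
    return np.array(states[:T]), np.array(obs[:T])

# Illustrative two-state example: an "off" state and a roughly 300 W "on" state.
pi0 = np.array([0.5, 0.5])
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
dur_params = [(5, 0.1), (20, 0.5)]
obs_params = [(0.0, 5.0), (300.0, 20.0)]
states, obs = sample_hsmm(pi0, A, dur_params, obs_params, T=500)

The paper's HDP-HSMM places a hierarchical Dirichlet process prior over an unbounded version of the transition structure sketched here, so the number of states need not be fixed in advance.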


Reference text

M. J. Beal, Z. Ghahramani, and C. E. Rasmussen. The infinite hidden Markov model. Advances in Neural Information Processing Systems, 14:577–584, 2002.

S. P. Brooks and A. Gelman. General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, pages 434–455, 1998.

E. Çınlar. Probability and Stochastics. Springer Verlag, 2010.

M. Dewar, C. Wiggins, and F. Wood. Inference in hidden Markov models with explicit state duration distributions. IEEE Signal Processing Letters, (99):1–1, 2012.

E. B. Fox. Bayesian Nonparametric Learning of Complex Dynamical Phenomena. PhD thesis, MIT, Cambridge, MA, 2009.

E. B. Fox, E. B. Sudderth, M. I. Jordan, and A. S. Willsky. An HDP-HMM for systems with state persistence. In Proceedings of the International Conference on Machine Learning, July 2008.

Z. Ghahramani and M. I. Jordan. Factorial hidden Markov models. Machine Learning, 29(2):245–273, 1997.

Y. Guédon. Exploring the state sequence space for hidden Markov and semi-Markov chains. Computational Statistics and Data Analysis, 51(5):2379–2409, 2007. ISSN 0167-9473. doi: http://dx.doi.org/10.1016/j.csda.2006.03.015.

K. Hashimoto, Y. Nankaku, and K. Tokuda. A Bayesian approach to hidden semi-Markov model based speech synthesis. In Tenth Annual Conference of the International Speech Communication Association, 2009.

[Figure 20: Example real data observation sequences for the power disaggregation experiments. Panels show power (watts) against time (sample index) for the lighting, dishwasher, furnace, microwave, and refrigerator devices and for the total power signal.]

[Table 2: Power disaggregation prior parameters for each device, giving Gaussian observation priors and negative binomial duration priors for the base measures and for device-specific states. Observation priors encode the rough power levels expected from each device; duration priors encode the duration statistics expected from each device.]

K. A. Heller, Y. W. Teh, and D. Görür. Infinite hierarchical hidden Markov models. In Proceedings of the International Conference on Artificial Intelligence and Statistics, volume 12, 2009.

H. Ishwaran and M. Zarepour. Markov chain Monte Carlo in approximate Dirichlet and beta two-parameter process hierarchical models. Biometrika, 87(2):371–390, 2000.

M. J. Johnson and A. S. Willsky. The Hierarchical Dirichlet Process Hidden Semi-Markov Model. In Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence, Corvallis, Oregon, USA, 2010. AUAI Press.

M. J. Johnson and A. S. Willsky. Dirichlet posterior sampling with truncated multinomial likelihoods, 2012. arXiv:1208.6537v2.

H. Kim, M. Marwah, M. Arlitt, G. Lyon, and J. Han. Unsupervised disaggregation of low frequency power measurements. Technical report, HP Labs, 2010.

J. Z. Kolter and M. J. Johnson. REDD: A Public Data Set for Energy Disaggregation Research. In SustKDD Workshop on Data Mining Applications in Sustainability, 2011.

K. Murphy. Hidden semi-Markov models (segment models). Technical report, November 2002. URL http://www.cs.ubc.ca/~murphyk/Papers/segment.pdf.

K. P. Murphy. Conjugate Bayesian analysis of the Gaussian distribution. Technical report, 2007.

P. Orbanz. Construction of nonparametric Bayesian models from parametric Bayes equations. Advances in Neural Information Processing Systems, 2009.

J. Sethuraman. A constructive definition of Dirichlet priors. Statistica Sinica, 4:639–650, 1994.

Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476):1566–1581, 2006.

S. E. Tranter and D. A. Reynolds. An overview of automatic speaker diarization systems. IEEE Transactions on Audio, Speech, and Language Processing, 14(5):1557–1565, 2006.

D. A. Van Dyk and X. L. Meng. The art of data augmentation. Journal of Computational and Graphical Statistics, 10(1):1–50, 2001.

J. Van Gael, Y. Saatci, Y. W. Teh, and Z. Ghahramani. Beam sampling for the infinite hidden Markov model. In Proceedings of the 25th International Conference on Machine Learning, pages 1088–1095. ACM, 2008.

S. Z. Yu. Hidden semi-Markov models. Artificial Intelligence, 174(2):215–243, 2010.

M. Zeifman and K. Roth. Nonintrusive appliance load monitoring: Review and outlook. IEEE Transactions on Consumer Electronics, 57(1):76–84, 2011.