116 nips-2011-Hierarchically Supervised Latent Dirichlet Allocation

Source: pdf

Author: Adler J. Perotte, Frank Wood, Noemie Elhadad, Nicholas Bartlett

Abstract: We introduce hierarchically supervised latent Dirichlet allocation (HSLDA), a model for hierarchically and multiply labeled bag-of-word data. Examples of such data include web pages and their placement in directories, product descriptions and associated categories from product hierarchies, and free-text clinical records and their assigned diagnosis codes. Out-of-sample label prediction is the primary goal of this work, but improved lower-dimensional representations of the bag-of-word data are also of interest. We demonstrate HSLDA on large-scale data from clinical document labeling and retail product categorization tasks. We show that leveraging the structure from hierarchical labels improves out-of-sample label prediction substantially when compared to models that do not.

reference text

[1] DMOZ open directory project. http://www.dmoz.org/, 2002.

[2] Stanford network analysis platform. http://snap.stanford.edu/, 2004.

[3] The computational medicine center’s 2007 medical natural language processing challenge. http://www.computationalmedicine.org/challenge/previous, 2007.

[4] J. Albert and S. Chib. Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association, 88(422):669, 1993.

[5] E. Birman-Deych, A. D. Waterman, Y. Yan, D. S. Nilasena, M. J. Radford, and B. F. Gage. Accuracy of ICD-9-CM codes for identifying cardiovascular and stroke risk factors. Medical Care, 43(5):480–5, 2005.

[6] D. Blei and J. McAuliffe. Supervised topic models. Advances in Neural Information Processing Systems, 20:121–128, 2008.

[7] D. Blei, A.Y. Ng, and M.I. Jordan. Latent Dirichlet allocation. J. Mach. Learn. Res., 3:993–1022, March 2003. ISSN 1532-4435.

[8] S. Chakrabarti, B. Dom, R. Agrawal, and P. Raghavan. Scalable feature selection, classification and signature generation for organizing large text databases into hierarchical topic taxonomies. The VLDB Journal, 7:163–178, August 1998. ISSN 1066-8888.

[9] J. Chang and D. M. Blei. Hierarchical relational models for document networks. Annals of Applied Statistics, 4:124–150, 2010. doi: 10.1214/09-AOAS309.

[10] K. Crammer, M. Dredze, K. Ganchev, P.P. Talukdar, and S. Carroll. Automatic code assignment to medical text. Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing, pages 129–136, 2007.

[11] S. Dumais and H. Chen. Hierarchical classification of web content. In Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’00, pages 256–263, New York, NY, USA, 2000. ACM.

[12] R. Farkas and G. Szarvas. Automatic construction of rule-based ICD-9-CM coding systems. BMC bioinformatics, 9(Suppl 3):S10, 2008.

[13] M. Farzandipour, A. Sheikhtaheri, and F. Sadoughi. Effective factors on accuracy of principal diagnosis coding based on international classification of diseases, the 10th revision. International Journal of Information Management, 30:78–84, 2010.

[14] A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin. Bayesian Data Analysis. Chapman and Hall/CRC, 2nd edition, 2004.

[15] I. Goldstein, A. Arzumtsyan, and O. Uzuner. Three approaches to automatic assignment of ICD-9-CM codes to radiology reports. AMIA Annual Symposium Proceedings, 2007:279, 2007.

[16] T. L. Griffiths and M. Steyvers. Finding scientific topics. PNAS, 101(suppl. 1):5228–5235, 2004.

[17] D. Koller and M. Sahami. Hierarchically classifying documents using very few words. Technical Report 1997-75, Stanford InfoLab, February 1997. Previous number = SIDL-WP-1997-0059.

[18] S. Lacoste-Julien, F. Sha, and M. I. Jordan. DiscLDA: Discriminative learning for dimensionality reduction and classification. In Neural Information Processing Systems, pages 897–904.

[19] L. Larkey and B. Croft. Automatic assignment of ICD9 codes to discharge summaries. Technical report, University of Massachusetts, 1995.

[20] L. V. Lita, S. Yu, S. Niculescu, and J. Bi. Large scale diagnostic code classification for medical patient records. In Proceedings of the 3rd International Joint Conference on Natural Language Processing (IJCNLP’08), 2008.

[21] A. McCallum, K. Nigam, J. Rennie, and K. Seymore. Building domain-specific search engines with machine learning techniques. In Proc. AAAI-99 Spring Symposium on Intelligent Agents in Cyberspace, 1999.

[22] S. Pakhomov, J. Buntrock, and C. Chute. Automating the assignment of diagnosis codes to patient encounters using example-based and machine learning techniques. Journal of the American Medical Informatics Association (JAMIA), 13(5):516–525, 2006.

[23] D. Ramage, D. Hall, R. Nallapati, and C. D. Manning. Labeled LDA: a supervised topic model for credit attribution in multi-labeled corpora. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 248–256, 2009.

[24] B. Ribeiro-Neto, A. Laender, and L. De Lima. An experimental study in automatically categorizing medical documents. Journal of the American Society for Information Science and Technology, 52(5):391–401, 2001.

[25] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476):1566–1581, 2006.

[26] H. Wallach, D. Mimno, and A. McCallum. Rethinking LDA: Why priors matter. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, editors, Advances in Neural Information Processing Systems 22, pages 1973–1981. 2009.

[27] C. Wang, D. Blei, and L. Fei-Fei. Simultaneous image classification and annotation. In CVPR, pages 1903–1910, 2009.