nips nips2012 nips2012-359 nips2012-359-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Qiang Liu, Jian Peng, Alex Ihler
Abstract: Crowdsourcing has become a popular paradigm for labeling large datasets. However, it has given rise to the computational task of aggregating the crowdsourced labels provided by a collection of unreliable annotators. We approach this problem by transforming it into a standard inference problem in graphical models, and applying approximate variational methods, including belief propagation (BP) and mean field (MF). We show that our BP algorithm generalizes both majority voting and a recent algorithm by Karger et al. [1], while our MF method is closely related to a commonly used EM algorithm. In both cases, we find that the performance of the algorithms critically depends on the choice of a prior distribution on the workers’ reliability; by choosing the prior properly, both BP and MF (and EM) perform surprisingly well on both simulated and real-world datasets, competitive with state-of-the-art algorithms based on more complicated modeling assumptions.
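To make the aggregation task in the abstract concrete, the following is a minimal sketch of two of the baselines it names: majority voting, and an EM procedure for a one-coin worker model with a Beta prior on each worker's reliability. It is not the paper's BP or MF algorithm. The function names `majority_vote` and `one_coin_em`, the binary ±1 label encoding, the uniform prior on true labels, and the default hyperparameters `alpha=2.0, beta=1.0` are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def majority_vote(labels):
    """labels: iterable of (task, worker, label) with label in {+1, -1}.
    Returns {task: +1 or -1}; ties are broken toward +1 (an arbitrary choice)."""
    totals = defaultdict(int)
    for task, _worker, label in labels:
        totals[task] += label
    return {task: (1 if s >= 0 else -1) for task, s in totals.items()}


def one_coin_em(labels, alpha=2.0, beta=1.0, iters=50):
    """EM sketch for a one-coin model: worker j answers correctly with
    probability q_j, and q_j has a Beta(alpha, beta) prior (assumed here,
    with alpha, beta >= 1). Returns ({task: P(z = +1)}, {worker: q_j})."""
    labels = list(labels)
    tasks = sorted({t for t, _, _ in labels})
    workers = sorted({w for _, w, _ in labels})

    # Initialise task posteriors with the fraction of +1 votes.
    pos, cnt = defaultdict(int), defaultdict(int)
    for t, _w, l in labels:
        cnt[t] += 1
        pos[t] += 1 if l == 1 else 0
    mu = {t: pos[t] / cnt[t] for t in tasks}

    q = {}
    for _ in range(iters):
        # M-step: MAP estimate of each q_j (mode of the Beta posterior).
        correct, n = defaultdict(float), defaultdict(int)
        for t, w, l in labels:
            correct[w] += mu[t] if l == 1 else 1.0 - mu[t]
            n[w] += 1
        q = {w: (correct[w] + alpha - 1.0) / (n[w] + alpha + beta - 2.0)
             for w in workers}

        # E-step: posterior of each true label under a uniform label prior.
        log_odds = defaultdict(float)
        for t, w, l in labels:
            qw = min(max(q[w], 1e-6), 1.0 - 1e-6)  # keep the log finite
            log_odds[t] += l * math.log(qw / (1.0 - qw))
        mu = {t: 1.0 / (1.0 + math.exp(-max(min(log_odds[t], 500.0), -500.0)))
              for t in tasks}
    return mu, q


# Toy data: workers a, b, c always agree with the truth; worker d always flips it.
truth = {0: 1, 1: -1, 2: 1, 3: -1}
data = [(t, w, z) for t, z in truth.items() for w in "abc"]
data += [(t, "d", -z) for t, z in truth.items()]

votes = majority_vote(data)
posteriors, reliability = one_coin_em(data)
```

In this toy setup the EM estimate assigns worker `d` a reliability below 0.5, so that worker's votes end up counting against the label they report, which plain majority voting cannot do. Changing `alpha` and `beta` changes how strongly sparse workers are pulled toward the prior.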
[1] D.R. Karger, S. Oh, and D. Shah. Iterative learning for reliable crowdsourcing systems. In Neural Information Processing Systems (NIPS), 2011.
[2] A.P. Dawid and A.M. Skene. Maximum likelihood estimation of observer error-rates using the EM algorithm. Applied Statistics, pages 20–28, 1979.
[3] P. Smyth, U. Fayyad, M. Burl, P. Perona, and P. Baldi. Inferring ground truth from subjective labelling of Venus images. Advances in Neural Information Processing Systems, pages 1085–1092, 1995.
[4] V.C. Raykar, S. Yu, L.H. Zhao, G.H. Valadez, C. Florin, L. Bogoni, and L. Moy. Learning from crowds. The Journal of Machine Learning Research, 11:1297–1322, 2010.
[5] J. Whitehill, P. Ruvolo, T. Wu, J. Bergsma, and J. Movellan. Whose vote should count more: Optimal integration of labels from labelers of unknown expertise. In Advances in Neural Information Processing Systems, pages 2035–2043, 2009.
[6] P. Welinder, S. Branson, S. Belongie, and P. Perona. The multidimensional wisdom of crowds. In Neural Information Processing Systems Conference (NIPS), 2010.
[7] V.C. Raykar and S. Yu. Eliminating spammers and ranking annotators for crowdsourced labeling tasks. Journal of Machine Learning Research, 13:491–518, 2012.
[8] F.L. Wauthier and M.I. Jordan. Bayesian bias mitigation for crowdsourcing. In Advances in Neural Information Processing Systems 24, pages 1800–1808, 2011.
[9] B. Carpenter. Multilevel Bayesian models of categorical data annotation. Unpublished manuscript, 2008.
[10] D. Koller and N. Friedman. Probabilistic graphical models: principles and techniques. MIT press, 2009.
[11] M. Wainwright and M. Jordan. Graphical models, exponential families, and variational inference. Found. Trends Mach. Learn., 1(1-2):1–305, 2008.
[12] Y. Weiss and W.T. Freeman. On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs. Information Theory, IEEE Transactions on, 47(2):736–744, Feb 2001.
[13] F.R. Kschischang, B.J. Frey, and H.A. Loeliger. Factor graphs and the sum-product algorithm. Information Theory, IEEE Transactions on, 47(2):498–519, 2001.
[14] D. Tarlow, I.E. Givoni, and R.S. Zemel. HOP-MAP: Efficient message passing with high order potentials. In Proc. of AISTATS, 2010.
[15] A. Zellner. An introduction to Bayesian inference in econometrics, volume 17. John Wiley and sons, 1971.
[16] R.E. Kass and L. Wasserman. The selection of prior distributions by formal rules. Journal of the American Statistical Association, pages 1343–1370, 1996.
[17] F. Tuyl, R. Gerlach, and K. Mengersen. A comparison of Bayes-Laplace, Jeffreys, and other priors. The American Statistician, 62(1):40–44, 2008.
[18] R. Neal and G.E. Hinton. A view of the EM algorithm that justifies incremental, sparse, and other variants. In M. Jordan, editor, Learning in Graphical Models, pages 355–368. Kluwer, 1998.
[19] A. Asuncion. Approximate mean field for Dirichlet-based models. In ICML Workshop on Topic Models, 2010.
[20] B. Bollobás. Random graphs, volume 73. Cambridge University Press, 2001.
[21] R. Snow, B. O’Connor, D. Jurafsky, and A.Y. Ng. Cheap and fast—but is it good?: evaluating non-expert annotations for natural language tasks. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 254–263. Association for Computational Linguistics, 2008.