nips nips2011 nips2011-293 nips2011-293-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Phillip Isola, Devi Parikh, Antonio Torralba, Aude Oliva
Abstract: Artists, advertisers, and photographers are routinely presented with the task of creating an image that a viewer will remember. While it may seem like image memorability is purely subjective, recent work shows that it is not an inexplicable phenomenon: variation in memorability of images is consistent across subjects, suggesting that some images are intrinsically more memorable than others, independent of a subject’s contexts and biases. In this paper, we used the publicly available memorability dataset of Isola et al. [13], and augmented the object and scene annotations with interpretable spatial, content, and aesthetic image properties. We used a feature-selection scheme with desirable explaining-away properties to determine a compact set of attributes that characterizes the memorability of any individual image. We find that images of enclosed spaces containing people with visible faces are memorable, while images of vistas and peaceful scenes are not. Contrary to popular belief, unusual or aesthetically pleasing scenes do not tend to be highly memorable. This work represents one of the first attempts at understanding intrinsic image memorability, and opens a new domain of investigation at the interface between human cognition and computer vision.
[1] T. F. Brady, T. Konkle, G. A. Alvarez, and A. Oliva. Visual long-term memory has a massive storage capacity for object details. In Proceedings of the National Academy of Sciences, 2008.
[2] L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and regression trees. Boca Raton, FL: CRC Press, 1984.
[3] G. D. A. Brown, I. Neath, and N. Chater. A temporal ratio model of memory. Psych. Review, 2007.
[4] C.-C. Chang and C.-J. Lin. LIBSVM: a library for support vector machines, 2001.
[5] D. Cohen-Or, O. Sorkine, R. Gal, T. Leyvand, and Y.-Q. Xu. Color harmonization. ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH), 2006.
[6] A. Das and D. Kempe. Submodular meets spectral: Greedy algorithms for subset selection, sparse approximation and dictionary selection. In arXiv:1102.3975v2 [stat.ML], 2011.
[7] S. Dhar, V. Ordonez, and T. L. Berg. High level describable attributes for predicting aesthetics and interestingness. In IEEE Computer Vision and Pattern Recognition, 2011.
[8] A. Farhadi, I. Endres, D. Hoiem, and D. Forsyth. Describing objects by their attributes. In IEEE Computer Vision and Pattern Recognition, 2009.
[9] C. Fellbaum. WordNet: an electronic lexical database. MIT Press, 1998.
[10] B. Gooch, E. Reinhard, C. Moulding, and P. Shirley. Artistic composition for image creation. In Eurographics Workshop on Rendering, 2001.
[11] M. W. Howard and M. J. Kahana. A distributed representation of temporal context. In Journal of Mathematical Psychology, 2001.
[12] R. R. Hunt and J. B. Worthen. Distinctiveness and memory. New York: Oxford University Press, 2006.
[13] P. Isola, J. Xiao, A. Torralba, and A. Oliva. What makes an image memorable? In IEEE Computer Vision and Pattern Recognition, 2011.
[14] L. Itti, C. Koch, and E. Niebur. A model of saliency-based visual attention for rapid scene analysis. In Pattern Analysis and Machine Intelligence, 1998.
[15] T. Konkle, T. F. Brady, G. A. Alvarez, and A. Oliva. Conceptual distinctiveness supports detailed visual long-term memory for real-world objects. In Journal of Experimental Psychology: General, 2010.
[16] T. Konkle, T. F. Brady, G. A. Alvarez, and A. Oliva. Scene memory is more detailed than you think: the role of categories in visual long-term memory. In Psychological Science, 2010.
[17] A. Krause and C. Guestrin. Near-optimal nonmyopic value of information in graphical models. In Conference on Uncertainty in Artificial Intelligence, 2005.
[18] C. H. Lampert, H. Nickisch, and S. Harmeling. Learning to detect unseen object classes by between class attribute transfer. In IEEE Computer Vision and Pattern Recognition, 2009.
[19] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance. Cost-effective outbreak detection in networks. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007.
[20] T. Leyvand, D. Cohen-Or, G. Dror, and D. Lischinski. Data-driven enhancement of facial attractiveness. ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH 2008), 2008.
[21] Y. Luo and X. Tang. Photo and video quality evaluation: Focusing on the subject. In European Conference on Computer Vision, 2008.
[22] J. L. McClelland, B. L. McNaughton, and R. C. O’Reilly. Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. In Psychological Review, 1995.
[23] A. Oliva and A. Torralba. Modeling the shape of the scene: a holistic representation of the spatial envelope. In International Journal of Computer Vision, 2001.
[24] L. Renjie, C. L. Wolf, and D. Cohen-Or. Optimizing photo composition. In Technical report, Tel-Aviv University, 2010.
[25] I. Rock and P. Engelstein. A study of memory for visual form. The American Journal of Psychology, 1959.
[26] B. C. Russell, A. Torralba, K. Murphy, and W. T. Freeman. LabelMe: A database and web-based tool for image annotation. In International Journal of Computer Vision, 2008.
[27] R. M. Shiffrin and M. Steyvers. A model for recognition memory: REM - retrieving effectively from memory. In Psychonomic Bulletin and Review, 1997.
[28] A. J. Smola and B. Schölkopf. A tutorial on support vector regression. Statistics and Computing, 14:199–222, 2004.
[29] M. Spain and P. Perona. Some objects are more equal than others: measuring and predicting importance. In Proceedings of the European Conference on Computer Vision, 2008.
[30] L. Standing. Learning 10,000 pictures. In Quarterly Journal of Experimental Psychology, 1973.
[31] S. Ullman, M. Vidal-Naquet, and E. Sali. Visual features of intermediate complexity and their use in classification. In Nature Neuroscience, 2002.
[32] J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba. SUN database: Large-scale scene recognition from abbey to zoo. In IEEE Conference on Computer Vision and Pattern Recognition, 2010.