210 nips-2012-Memorability of Image Regions


Source: pdf

Author: Aditya Khosla, Jianxiong Xiao, Antonio Torralba, Aude Oliva

Abstract: While long-term human visual memory can store a remarkable amount of visual information, it tends to degrade over time. Recent work has shown that image memorability is an intrinsic property of an image that can be reliably estimated using state-of-the-art image features and machine learning algorithms. However, the class of features and image information that is forgotten has not yet been explored. In this work, we propose a probabilistic framework that models how and which local regions of an image may be forgotten, using a data-driven approach that combines local and global image features. The model automatically discovers memorability maps of individual images without any human annotation. We incorporate multiple image region attributes in our algorithm, leading to improved memorability prediction of images compared to previous work.


reference text

[1] T. F. Brady, T. Konkle, G. A. Alvarez, and A. Oliva. Visual long-term memory has a massive storage capacity for object details. PNAS, pages 14325–14329, 2008.

[2] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, volume 1, pages 886–893. IEEE, 2005.

[3] S. Dhar, V. Ordonez, and T.L. Berg. High level describable attributes for predicting aesthetics and interestingness. In CVPR, pages 1657–1664. IEEE, 2011.

[4] P.F. Felzenszwalb, R.B. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. TPAMI, 2010.

[5] B. Gooch, E. Reinhard, C. Moulding, and P. Shirley. Artistic composition for image creation. In Rendering Techniques 2001: Proceedings of the Eurographics Workshop in London, United Kingdom, June 25-27, 2001, page 83. Springer Verlag Wien, 2001.

[6] P. Isola, D. Parikh, A. Torralba, and A. Oliva. Understanding the intrinsic memorability of images. In Advances in Neural Information Processing Systems (NIPS), 2011.

[7] P. Isola, J. Xiao, A. Torralba, and A. Oliva. What makes an image memorable? In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 145–152, 2011.

[8] L. Itti and C. Koch. A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research, 40:1489–1506, 2000.

[9] T. Joachims. Training linear SVMs in linear time. In ACM SIGKDD, pages 217–226, 2006.

[10] C. Kanan, M.H. Tong, L. Zhang, and G.W. Cottrell. SUN: Top-down saliency using natural statistics. Visual Cognition, 17(6-7):979–1003, 2009.

[11] F. S. Khan, J. van de Weijer, A. D. Bagdanov, and M. Vanrell. Portmanteau vocabularies for multi-cue image representation. In NIPS, Granada, Spain, 2011.

[12] A. Khosla∗, J. Xiao∗, P. Isola, A. Torralba, and A. Oliva. Image memorability and visual inception. In SIGGRAPH Asia, 2012. (∗ indicates equal contribution.)

[13] T. Konkle, T.F. Brady, G.A. Alvarez, and A. Oliva. Conceptual distinctiveness supports detailed visual long-term memory for real-world objects. Journal of Experimental Psychology, 139:558–578, March 2010.

[14] T. Konkle, T.F. Brady, G.A. Alvarez, and A. Oliva. Scene memory is more detailed than you think: the role of categories in visual long-term memory. Psychological Science, 21:1551–1556, November 2010.

[15] S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In CVPR, volume 2, pages 2169–2178. IEEE, 2006.

[16] T. Leyvand, D. Cohen-Or, G. Dror, and D. Lischinski. Data-driven enhancement of facial attractiveness. In ACM Transactions on Graphics (TOG), volume 27, page 38. ACM, 2008.

[17] L.-J. Li, H. Su, E. P. Xing, and L. Fei-Fei. Object bank: A high-level image representation for scene classification & semantic feature sparsification. In NIPS, Vancouver, Canada, December 2010.

[18] L. Liu, R. Chen, L. Wolf, and D. Cohen-Or. Optimizing photo composition. In Computer Graphics Forum, volume 29, pages 469–478. Wiley Online Library, 2010.

[19] Y. Luo and X. Tang. Photo and video quality evaluation: Focusing on the subject. In Proceedings of the 10th European Conference on Computer Vision: Part III, pages 386–399. Springer-Verlag, 2008.

[20] L. Marchesotti, F. Perronnin, D. Larlus, and G. Csurka. Assessing the aesthetic quality of photographs using generic image descriptors. In Computer Vision (ICCV), 2011 IEEE International Conference on, pages 1784–1791. IEEE, 2011.

[21] T. Ojala, M. Pietikainen, and T. Maenpaa. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. Pattern Analysis and Machine Intelligence, 24(7):971–987, 2002.

[22] R. A. Rensink, J. K. O’Regan, and J. J. Clark. To See or not to See: The Need for Attention to Perceive Changes in Scenes. Psychological Science, 8(5):368–373, September 1997.

[23] E. Shechtman and M. Irani. Matching local self-similarities across images and videos. In Computer Vision and Pattern Recognition, 2007. CVPR’07. IEEE Conference on, pages 1–8. IEEE, 2007.

[24] M. Spain and P. Perona. Some objects are more equal than others: Measuring and predicting importance. Computer Vision–ECCV 2008, pages 523–536, 2008.

[25] L. Standing. Learning 10000 pictures. The Quarterly Journal of Experimental Psychology, 25(2):207–222, 1973.

[26] L. Standing, J. Conezio, and R.N. Haber. Perception and memory for pictures: Single-trial learning of 2500 visual stimuli. Psychonomic Science, 1970.

[27] J. Van De Weijer, C. Schmid, and J. Verbeek. Learning color names from real-world images. In Computer Vision and Pattern Recognition, 2007. CVPR’07. IEEE Conference on, pages 1–8. IEEE, 2007.

[28] S. Vogt and S. Magnussen. Long-term memory for 400 pictures on a common theme. Experimental Psychology (formerly Zeitschrift für Experimentelle Psychologie), 54(4):298–303, 2007.

[29] J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, and Y. Gong. Locality-constrained linear coding for image classification. In CVPR, pages 3360–3367. IEEE, 2010.

[30] J. T. Wixted. The Psychology and Neuroscience of Forgetting. Annual Review of Psychology, 55(1), 2004.

[31] J. T. Wixted and S. K. Carpenter. The Wickelgren Power Law and the Ebbinghaus Savings Function. Psychological Science, 18(2):133–134, February 2007.

[32] J. Xiao, J. Hays, K.A. Ehinger, A. Oliva, and A. Torralba. SUN database: Large-scale scene recognition from abbey to zoo. In CVPR, pages 3485–3492. IEEE, 2010.