78 emnlp-2013-Exploiting Language Models for Visual Recognition

Source: pdf

Author: Dieu-Thu Le ; Jasper Uijlings ; Raffaella Bernardi

Abstract: The problem of learning language models from large text corpora has been widely studied within the computational linguistics community. However, little is known about the performance of these language models when applied to the computer vision domain. In this work, we compare four representative models: a window-based model, a topic model, a distributional memory, and a commonsense knowledge database (ConceptNet), in two visual recognition scenarios: human action recognition and object prediction. We examine whether the knowledge extracted from texts through these models is compatible with the knowledge represented in images. We determine the usefulness of different language models in aiding the two visual recognition tasks. The study shows that language models built from general text corpora can be used instead of expensive annotated images and can even outperform the image model when tested on a large general dataset.

reference text