Comparing Data Sources and Architectures for Deep Visual Representation Learning in Semantics
Kiela, Douwe and Vero, Anita Lilla and Clark, Stephen
Empirical Methods in Natural Language Processing (EMNLP), 2016


Summary by Marek Rei 7 years ago

The authors compare different image recognition models and image data sources for multimodal word representation learning.

Figure: Image recognition models used for vector generation.

Experiments are performed on SimLex-999 (similarity) and MEN (relatedness). The different models (AlexNet, GoogLeNet, VGGNet) perform quite similarly, with VGGNet slightly better at the cost of more computation. Among image sources, search engines give good coverage, ImageNet performs quite well with VGGNet, and the ESP Game dataset gives the lowest performance. Combining visual and linguistic vectors is beneficial for both English and Italian.
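
To make the evaluation setup concrete, here is a minimal sketch of how a combined representation could be scored on a dataset such as SimLex-999 or MEN. The summary does not say how the vectors are fused or scored, so two assumptions are made here: fusion is a concatenation of L2-normalized linguistic and visual vectors, and the score is the Spearman correlation between cosine similarities and human ratings. All function names and the toy vectors and ratings are hypothetical and are not taken from the paper.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse(linguistic, visual):
    """Assumed fusion: concatenate the two L2-normalized vectors."""
    return l2_normalize(linguistic) + l2_normalize(visual)

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def evaluate(pairs, gold, vectors):
    """Correlate model cosine similarities with human ratings for word pairs."""
    predicted = [cosine(vectors[w1], vectors[w2]) for w1, w2 in pairs]
    return spearman(predicted, gold)

# Toy, made-up data purely for illustration (not from the paper).
linguistic = {"cat": [1.0, 0.2], "dog": [0.9, 0.3], "car": [0.1, 1.0]}
visual = {"cat": [0.8, 0.1, 0.3], "dog": [0.7, 0.2, 0.3], "car": [0.1, 0.9, 0.2]}
multimodal = {w: fuse(linguistic[w], visual[w]) for w in linguistic}

pairs = [("cat", "dog"), ("cat", "car"), ("dog", "car")]
gold = [9.0, 2.0, 2.5]  # hypothetical human similarity ratings
score = evaluate(pairs, gold, multimodal)
```

Normalizing each modality before concatenation keeps either one from dominating the cosine similarity just because its raw vectors have larger magnitude. Comparing `evaluate` on `linguistic`, `visual`, and `multimodal` would give the kind of comparison the summary describes.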
