Separating Self-Expression and Visual Content in Hashtag Supervision
Research output: Contribution to journal › Conference article › Research › peer-review
Standard
Separating Self-Expression and Visual Content in Hashtag Supervision. / Veit, Andreas; Nickel, Maximilian; Belongie, Serge; van der Maaten, Laurens.
In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 14.12.2018, p. 5919-5927.
RIS
TY - GEN
T1 - Separating Self-Expression and Visual Content in Hashtag Supervision
AU - Veit, Andreas
AU - Nickel, Maximilian
AU - Belongie, Serge
AU - van der Maaten, Laurens
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/12/14
Y1 - 2018/12/14
AB - The variety, abundance, and structured nature of hashtags make them an interesting data source for training vision models. For instance, hashtags have the potential to significantly reduce the problem of manual supervision and annotation when learning vision models for a large number of concepts. However, a key challenge when learning from hashtags is that they are inherently subjective because they are provided by users as a form of self-expression. As a consequence, hashtags may have synonyms (different hashtags referring to the same visual content) and may be polysemous (the same hashtag referring to different visual content). These challenges limit the effectiveness of approaches that simply treat hashtags as image-label pairs. This paper presents an approach that extends upon modeling simple image-label pairs with a joint model of images, hashtags, and users. We demonstrate the efficacy of such approaches in image tagging and retrieval experiments, and show how the joint model can be used to perform user-conditional retrieval and tagging.
UR - http://www.scopus.com/inward/record.url?scp=85062871459&partnerID=8YFLogxK
DO - 10.1109/CVPR.2018.00620
M3 - Conference article
AN - SCOPUS:85062871459
SP - 5919
EP - 5927
JO - IEEE Conference on Computer Vision and Pattern Recognition. Proceedings
JF - IEEE Conference on Computer Vision and Pattern Recognition. Proceedings
SN - 1063-6919
T2 - 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
Y2 - 18 June 2018 through 22 June 2018
ER -
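The abstract describes a joint model of images, hashtags, and users that supports user-conditional tagging and retrieval. As a rough illustration of what user-conditional hashtag scoring can look like, here is a minimal PyTorch sketch. It is an assumption for illustration only, not the paper's actual architecture; the class name, the parameter names, and the embedding-modulation scheme are all hypothetical.

import torch
import torch.nn as nn

class UserConditionalHashtagModel(nn.Module):
    """Hypothetical sketch: score hashtags conditioned on both an image
    and the posting user, so the same image can receive different tag
    rankings for different users (user "self-expression")."""

    def __init__(self, feat_dim, num_users, num_hashtags, embed_dim):
        super().__init__()
        self.image_proj = nn.Linear(feat_dim, embed_dim)        # image features -> shared space
        self.user_embed = nn.Embedding(num_users, embed_dim)    # per-user embedding
        self.tag_embed = nn.Embedding(num_hashtags, embed_dim)  # shared hashtag embeddings

    def forward(self, image_feats, user_ids):
        img = self.image_proj(image_feats)   # (B, D)
        usr = self.user_embed(user_ids)      # (B, D)
        # User-conditional hashtag embeddings via elementwise modulation: (B, T, D)
        tags = self.tag_embed.weight.unsqueeze(0) * usr.unsqueeze(1)
        # Per-image, per-user hashtag logits: (B, T)
        return torch.einsum('bd,btd->bt', img, tags)

# Example usage with arbitrary sizes:
model = UserConditionalHashtagModel(feat_dim=2048, num_users=1000,
                                    num_hashtags=5000, embed_dim=256)
logits = model(torch.randn(4, 2048), torch.randint(0, 1000, (4,)))

In a setup like this, the logits would be trained with a multi-label loss (e.g. nn.BCEWithLogitsLoss) on observed (image, user, hashtag) triples; at test time, conditioning on different user ids yields user-specific tag rankings, which is the behavior the abstract calls user-conditional tagging and retrieval.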