Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
Publication: Contribution to book/anthology/report › Article in proceedings › Research › peer-reviewed
Standard
Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions. / Gatt, Albert; Tanti, Marc; Muscat, Adrian; Paggio, Patrizia; Farrugia, Reuben; Borg, Claudia; Camilleri, Kenneth; Rosner, Mike; van der Plas, Lonneke.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Miyazaki: European Language Resources Association, 2018.
RIS
TY - GEN
T1 - Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions
AU - Gatt, Albert
AU - Tanti, Marc
AU - Muscat, Adrian
AU - Paggio, Patrizia
AU - Farrugia, Reuben
AU - Borg, Claudia
AU - Camilleri, Kenneth
AU - Rosner, Mike
AU - van der Plas, Lonneke
PY - 2018
Y1 - 2018
N2 - The past few years have witnessed renewed interest in NLP tasks at the interface between vision and language. One intensively-studied problem is that of automatically generating text from images. In this paper, we extend this problem to the more specific domain of face description. Unlike scene descriptions, face descriptions are more fine-grained and rely on attributes extracted from the image, rather than objects and relations. Given that no data exists for this task, we present an ongoing crowdsourcing study to collect a corpus of descriptions of face images taken ‘in the wild’. To gain a better understanding of the variation we find in face description and the possible issues that this may raise, we also conducted an annotation study on a subset of the corpus. Primarily, we found descriptions to refer to a mixture of attributes, not only physical, but also emotional and inferential, which is bound to create further challenges for current image-to-text methods.
M3 - Article in proceedings
BT - Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
PB - European Language Resources Association
CY - Miyazaki
ER -
ID: 209459343