PAELLA: Parameter-Efficient Lightweight Language-Agnostic Captioning Model

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › Peer-reviewed

Documents

  • Fulltext

    Publisher's published version, 989 KB, PDF document

We introduce PAELLA, a Parameter-Efficient Lightweight Language-Agnostic image captioning model designed to be both parameter- and data-efficient through retrieval augmentation. The model is trained by learning a small mapping network with 34M parameters between a pre-trained visual model and a multilingual language model that is conditioned on two types of input: (i) the image itself, and (ii) a set of retrieved captions in the target language. The retrieved examples play a key role in guiding the model to generate captions across languages. Through retrieval, the model can be lightweight both in the number of trainable parameters, which exist only in its mapping network, and in the amount of multilingual training data that is required. Experiments on the XM3600 dataset, covering 36 languages, show that PAELLA can outperform or compete with models that have 3-77× more learned parameters and 35-863× more training data, particularly in low-resource languages. We also find that PAELLA can be trained on monolingual data alone and still show strong zero-shot abilities in other languages.
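The architecture described in the abstract can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the visual encoder, language model, and retrieval index are stood in by random NumPy arrays, all dimensions are made up, and the cosine-similarity retrieval in a shared image-text embedding space is an assumption about how retrieval might work. Only the small mapping network between the frozen components would be trainable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper):
# D_IMG: visual-encoder output, D_TXT: caption-embedding size,
# D_LM: frozen multilingual LM hidden size, K: retrieved captions.
D_IMG, D_TXT, D_LM, K = 64, 32, 48, 4

def retrieve_captions(image_emb, caption_embs, k=K):
    """Return indices of the k captions most similar to the image,
    using cosine similarity in a shared embedding space (an assumption)."""
    img = image_emb / np.linalg.norm(image_emb)
    caps = caption_embs / np.linalg.norm(caption_embs, axis=1, keepdims=True)
    sims = caps @ img
    return np.argsort(-sims)[:k]

class MappingNetwork:
    """The only trainable part: a small bridge that maps the image
    embedding plus the K retrieved caption embeddings into prefix
    vectors consumed by the frozen multilingual language model."""
    def __init__(self, rng):
        self.W_img = rng.normal(0.0, 0.02, (D_IMG, D_LM))
        self.W_txt = rng.normal(0.0, 0.02, (D_TXT, D_LM))

    def __call__(self, image_emb, retrieved_embs):
        img_prefix = image_emb @ self.W_img        # shape (D_LM,)
        txt_prefix = retrieved_embs @ self.W_txt   # shape (K, D_LM)
        return np.vstack([img_prefix, txt_prefix]) # shape (K+1, D_LM)

# Toy datastore of target-language caption embeddings, plus
# shared-space embeddings of the same captions used for retrieval.
datastore = rng.normal(size=(100, D_TXT))
shared_caps = rng.normal(size=(100, D_IMG))
image_emb = rng.normal(size=D_IMG)

idx = retrieve_captions(image_emb, shared_caps)
prefix = MappingNetwork(rng)(image_emb, datastore[idx])
print(prefix.shape)  # → (5, 48)
```

The prefix vectors would be prepended to the frozen LM's input, so that generation is conditioned on both the image and target-language examples; since the retrieved captions are already in the target language, swapping the datastore is what lets a single small mapping network serve many languages.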

Original language: English
Title: Findings of the Association for Computational Linguistics: NAACL 2024
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Number of pages: 16
Publisher: Association for Computational Linguistics (ACL)
Publication date: 2024
Pages: 3549-3564
ISBN (electronic): 9798891761193
Status: Published - 2024
Event: 2024 Findings of the Association for Computational Linguistics: NAACL 2024 - Mexico City, Mexico
Duration: 16 Jun 2024 - 21 Jun 2024

Conference

Conference: 2024 Findings of the Association for Computational Linguistics: NAACL 2024
Country: Mexico
City: Mexico City
Period: 16/06/2024 - 21/06/2024
Sponsors: Baidu, CapitalOne, et al., Grammarly, Megagon Labs, Otter.ai

Bibliographical note

Funding Information:
This research was supported by the Portuguese Recovery and Resilience Plan through project C645008882-00000055 (i.e., the Center For Responsible AI), and also by Fundação para a Ciência e Tecnologia (FCT), through the project with reference UIDB/50021/2020 (DOI: 10.54499/UIDB/50021/2020) and the Ph.D. scholarship with reference 2020.06106.BD.

Publisher Copyright:
© 2024 Association for Computational Linguistics.
