Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze?

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

Documents

  • Full text

    Publisher's published version, 1.06 MB, PDF document

Learned self-attention functions in state-of-the-art NLP models often correlate with human attention. We investigate whether self-attention in large-scale pre-trained language models is as predictive of human eye fixation patterns during task-reading as classical cognitive models of human attention. We compare attention functions across two task-specific reading datasets for sentiment analysis and relation extraction. We find the predictiveness of large-scale pre-trained self-attention for human attention depends on 'what is in the tail', e.g., the syntactic nature of rare contexts. Further, we observe that task-specific fine-tuning does not increase the correlation with human task-specific reading. Through an input reduction experiment, we give complementary insights on the sparsity and fidelity trade-off, showing that lower-entropy attention vectors are more faithful.
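A minimal sketch of the kind of comparison the abstract describes: extracting a pre-trained transformer's self-attention over tokens and correlating it with human fixation durations, along with the attention entropy relevant to the sparsity/fidelity discussion. The model name, example sentence, and fixation values below are illustrative assumptions, not the paper's data or exact pipeline; a real setup would also merge subword attention into word-level scores and use task-specific eye-tracking corpora.

# Illustrative sketch (assumed setup, not the paper's exact method):
# correlate transformer self-attention with human fixation durations.
import numpy as np
import torch
from scipy.stats import entropy, spearmanr
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

sentence = "the movie was really good"                        # hypothetical example
human_fixations = np.array([0.10, 0.35, 0.05, 0.20, 0.30])    # hypothetical relative fixation durations per word

inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Average over heads in the last layer and take attention flowing from the
# [CLS] token to each token as a simple word-importance estimate.
last_layer = outputs.attentions[-1][0]        # (heads, seq_len, seq_len)
cls_attention = last_layer.mean(dim=0)[0]     # (seq_len,)

# Drop [CLS] and [SEP]; the example words map one-to-one to word pieces here,
# so no subword aggregation is needed (a real pipeline would merge subwords).
word_attention = cls_attention[1:-1].numpy()
word_attention = word_attention / word_attention.sum()

rho, _ = spearmanr(word_attention, human_fixations)
print(f"Spearman correlation with human fixations: {rho:.3f}")
print(f"Attention entropy (lower = sparser): {entropy(word_attention):.3f}")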

Original language: English
Title: ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Publisher: Association for Computational Linguistics (ACL)
Publication date: 2022
Pages: 4295-4309
ISBN (electronic): 9781955917216
DOI
Status: Published - 2022
Event: 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022 - Dublin, Ireland
Duration: 22 May 2022 – 27 May 2022

Conference

Conference: 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022
Country: Ireland
City: Dublin
Period: 22/05/2022 – 27/05/2022
Sponsors: Amazon Science, Bloomberg Engineering, et al., Google Research, Liveperson, Meta
Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume: 1
ISSN: 0736-587X

Bibliographic note

Publisher Copyright:
© 2022 Association for Computational Linguistics.


ID: 341489931