Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze?

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

Documents

  • Full text

    Publisher's published version, 1.06 MB, PDF document

Learned self-attention functions in state-of-the-art NLP models often correlate with human attention. We investigate whether self-attention in large-scale pre-trained language models is as predictive of human eye fixation patterns during task-reading as classical cognitive models of human attention. We compare attention functions across two task-specific reading datasets for sentiment analysis and relation extraction. We find the predictiveness of large-scale pre-trained self-attention for human attention depends on 'what is in the tail', e.g., the syntactic nature of rare contexts. Further, we observe that task-specific fine-tuning does not increase the correlation with human task-specific reading. Through an input reduction experiment, we give complementary insights on the sparsity and fidelity trade-off, showing that lower-entropy attention vectors are more faithful.
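A minimal sketch of the kind of comparison the abstract describes: extracting a pre-trained transformer's self-attention over tokens and correlating it with human fixation durations, along with the attention entropy relevant to the sparsity/fidelity discussion. The model name, example sentence, and fixation values below are illustrative assumptions, not the paper's data or exact pipeline; a real setup would also merge subword attention into word-level scores and use task-specific eye-tracking corpora.

# Illustrative sketch (assumed setup, not the paper's exact method):
# correlate transformer self-attention with human fixation durations.
import numpy as np
import torch
from scipy.stats import entropy, spearmanr
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

sentence = "the movie was really good"                        # hypothetical example
human_fixations = np.array([0.10, 0.35, 0.05, 0.20, 0.30])    # hypothetical relative fixation durations per word

inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Average over heads in the last layer and take attention flowing from the
# [CLS] token to each token as a simple word-importance estimate.
last_layer = outputs.attentions[-1][0]        # (heads, seq_len, seq_len)
cls_attention = last_layer.mean(dim=0)[0]     # (seq_len,)

# Drop [CLS] and [SEP]; the example words map one-to-one to word pieces here,
# so no subword aggregation is needed (a real pipeline would merge subwords).
word_attention = cls_attention[1:-1].numpy()
word_attention = word_attention / word_attention.sum()

rho, _ = spearmanr(word_attention, human_fixations)
print(f"Spearman correlation with human fixations: {rho:.3f}")
print(f"Attention entropy (lower = sparser): {entropy(word_attention):.3f}")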

Original language: English
Title: ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Publisher: Association for Computational Linguistics (ACL)
Publication date: 2022
Pages: 4295-4309
ISBN (electronic): 9781955917216
DOI
Status: Published - 2022
Event: 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022 - Dublin, Ireland
Duration: 22 May 2022 – 27 May 2022

Conference

Conference: 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022
Country: Ireland
City: Dublin
Period: 22/05/2022 – 27/05/2022
Sponsors: Amazon Science, Bloomberg Engineering, et al., Google Research, Liveperson, Meta
Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume: 1
ISSN: 0736-587X

Bibliographic note

Publisher Copyright:
© 2022 Association for Computational Linguistics.


ID: 341489931