BERT Busters: Outlier Dimensions That Disrupt Transformers

Publication: Contribution to book/anthology/report · Conference contribution in proceedings · Research · Peer-reviewed

Multiple studies have shown that Transformers are remarkably robust to pruning. Contrary to this received wisdom, we demonstrate that pre-trained Transformer encoders are surprisingly fragile to the removal of a very small number of features in the layer outputs (…).
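
A minimal sketch of the kind of ablation the abstract describes: zeroing out a single dimension of the output LayerNorm parameters in every encoder layer of a pre-trained BERT and then querying the model. It assumes the HuggingFace transformers library and bert-base-uncased; the dimension index used here is a hypothetical placeholder, not one of the outlier dimensions identified in the paper.

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Hypothetical dimension index, for illustration only.
dim = 308

with torch.no_grad():
    for layer in model.bert.encoder.layer:
        # Zero the scaling factor and bias of the output LayerNorm at one
        # dimension, i.e. remove a single feature from every layer's output.
        layer.output.LayerNorm.weight[dim] = 0.0
        layer.output.LayerNorm.bias[dim] = 0.0

# Inspect masked-token prediction after the ablation.
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
print(tokenizer.decode([logits[0, mask_pos].argmax().item()]))

Repeating the check before and after the parameter edit gives a quick, informal sense of how much a single feature's removal changes the model's predictions.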
Original language: English
Title: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
Number of pages: 14
Place of publication: Online
Publisher: Association for Computational Linguistics (ACL)
Publication date: 1 Aug 2021
Pages: 3392-3405
DOI
Status: Published - 1 Aug 2021

ID: 285387504