BERT Busters: Outlier Dimensions That Disrupt Transformers

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Multiple studies have shown that Transformers are remarkably robust to pruning. Contrary to this received wisdom, we demonstrate that pre-trained Transformer encoders are surprisingly fragile to the removal of a very small number of features in the layer outputs.
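A minimal sketch of the kind of ablation the abstract describes, assuming the HuggingFace transformers library and BERT: selected hidden-state features are zeroed in every layer's output via forward hooks, and the masked-LM prediction is inspected before and after. The dimension index used here is an arbitrary placeholder, not an outlier reported in the paper, and zeroing is only one way to "remove" a feature.

```python
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

DIMS_TO_DISABLE = [42]  # placeholder feature index, a tiny fraction of the hidden size

def zero_dims_hook(module, inputs, output):
    # Zero the selected hidden-state features in this layer's output.
    output = output.clone()
    output[..., DIMS_TO_DISABLE] = 0.0
    return output

# Hook each encoder layer's output LayerNorm so the chosen features are
# removed from every layer's output representation.
handles = [
    layer.output.LayerNorm.register_forward_hook(zero_dims_hook)
    for layer in model.bert.encoder.layer
]

text = "The capital of France is [MASK]."
batch = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**batch).logits

# Decode the model's top prediction for the masked position.
mask_pos = (batch["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
print(tokenizer.decode(logits[0, mask_pos].argmax().item()))

# Remove the hooks to restore the unmodified model.
for h in handles:
    h.remove()
```

Running the same snippet with and without the hooks registered gives a quick before/after comparison of masked-LM behaviour under this kind of feature removal.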
Original language: English
Title of host publication: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
Number of pages: 14
Place of Publication: Online
Publisher: Association for Computational Linguistics (ACL)
Publication date: 1 Aug 2021
Pages: 3392-3405
DOIs
Publication status: Published - 1 Aug 2021
