BERT Busters: Outlier Dimensions That Disrupt Transformers
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Multiple studies have shown that Transformers are remarkably robust to pruning. Contrary to this received wisdom, we demonstrate that pre-trained Transformer encoders are surprisingly fragile to the removal of a very small number of features in the layer outputs (less than 0.0001% of model weights).
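To make the kind of ablation described above concrete, here is a minimal sketch (not the authors' released code) that zeroes the LayerNorm scaling factor and bias at a single feature position of a Hugging Face `bert-base-uncased` model and then runs a masked-token prediction. The dimension index below is a placeholder assumption; in practice the outlier positions would be identified empirically, e.g. by ranking LayerNorm scaling factors by magnitude.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Placeholder outlier index: the real positions are model-specific and would be
# found empirically (e.g. by ranking LayerNorm scaling factors by magnitude).
OUTLIER_DIM = 308

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Disable one output feature in every encoder layer by zeroing the LayerNorm
# scaling factor and bias at the chosen dimension (both the attention-output
# and feed-forward-output LayerNorms in the Hugging Face BERT implementation).
with torch.no_grad():
    for layer in model.bert.encoder.layer:
        for ln in (layer.attention.output.LayerNorm, layer.output.LayerNorm):
            ln.weight[OUTLIER_DIM] = 0.0
            ln.bias[OUTLIER_DIM] = 0.0

# Quick qualitative check: masked-token prediction after the ablation.
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0].item()
print(tokenizer.decode([logits[0, mask_pos].argmax().item()]))
```

A fuller evaluation would compare masked-language-modelling loss or downstream task scores before and after the ablation rather than inspecting a single prediction.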
| Original language | English |
| --- | --- |
| Title of host publication | Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 |
| Number of pages | 14 |
| Place of Publication | Online |
| Publisher | Association for Computational Linguistics (ACL) |
| Publication date | 1 Aug 2021 |
| Pages | 3392-3405 |
| DOIs | |
| Publication status | Published - 1 Aug 2021 |