Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Much of the recent progress in NLU has been shown to stem from models learning dataset-specific heuristics. We conduct a case study of generalization in NLI (from MNLI to the adversarially constructed HANS dataset) across a range of BERT-based architectures (adapters, Siamese Transformers, HEX debiasing), as well as with subsampling the data and increasing the model size. We report two successful and three unsuccessful strategies, all providing insights into how Transformer-based models learn to generalize.
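
The setup summarized above fine-tunes BERT variants on MNLI and tests whether they transfer to HANS, whose examples are built to break shallow heuristics. Below is a minimal sketch of that evaluation loop, assuming the HuggingFace transformers and datasets libraries; the checkpoint name and its label ordering are assumptions for illustration, not the authors' actual models.

    # Minimal sketch of MNLI -> HANS evaluation (not the authors' code).
    # Assumes HuggingFace `transformers`/`datasets`; checkpoint name and
    # label ordering are assumptions.
    import torch
    from datasets import load_dataset
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    MODEL = "textattack/bert-base-uncased-MNLI"  # assumed MNLI-finetuned checkpoint
    tokenizer = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL).eval()

    hans = load_dataset("hans", split="validation")  # adversarial NLI pairs

    correct = 0
    for ex in hans:
        enc = tokenizer(ex["premise"], ex["hypothesis"],
                        return_tensors="pt", truncation=True)
        with torch.no_grad():
            pred = model(**enc).logits.argmax(dim=-1).item()
        # HANS is binary (0 = entailment, 1 = non-entailment), so MNLI's
        # neutral and contradiction classes are collapsed into non-entailment.
        # We assume label index 0 = entailment for this checkpoint.
        pred_binary = 0 if pred == 0 else 1
        correct += int(pred_binary == ex["label"])

    print(f"HANS accuracy: {correct / len(hans):.3f}")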
Original language: English
Title of host publication: Proceedings of the Second Workshop on Insights from Negative Results in NLP
Number of pages: 11
Place of publication: Online and Punta Cana, Dominican Republic
Publisher: Association for Computational Linguistics (ACL)
Publication date: 1 Nov 2021
Pages: 125-135
Publication status: Published - 1 Nov 2021

Research areas

• t/generalization, task/NLI
