Probing neural network comprehension of natural language arguments

Timothy Niven, Hung Yu Kao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Citations (Scopus)

Abstract

We are surprised to find that BERT's peak performance of 77% on the Argument Reasoning Comprehension Task reaches just three points below the average untrained human baseline. However, we show that this result is entirely accounted for by exploitation of spurious statistical cues in the dataset. We analyze the nature of these cues and demonstrate that a range of models all exploit them. This analysis informs the construction of an adversarial dataset on which all models achieve random accuracy. Our adversarial dataset provides a more robust assessment of argument comprehension and should be adopted as the standard in future work.
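The abstract refers to measuring spurious statistical cues (such as the unigram "not") that models exploit in the Argument Reasoning Comprehension Task. Below is a minimal, hypothetical Python sketch of the kind of cue statistics described in the paper: how often a cue appears in exactly one of the two candidate warrants, and how often that occurrence coincides with the correct answer. The field names ("warrant0", "warrant1", "label") and the exact formulas are assumptions for illustration, not the authors' released code.

```python
from typing import Dict, List, Tuple


def cue_statistics(examples: List[Dict], cue: str = "not") -> Tuple[int, float, float]:
    """Estimate applicability, productivity, and coverage of a unigram cue.

    Each example is assumed to be a dict with:
      "warrant0", "warrant1": the two candidate warrants (strings)
      "label": 0 or 1, the index of the correct warrant
    """
    applicable = 0   # cue occurs in exactly one of the two warrants
    productive = 0   # ...and that warrant happens to be the correct one
    for ex in examples:
        in_w0 = cue in ex["warrant0"].lower().split()
        in_w1 = cue in ex["warrant1"].lower().split()
        if in_w0 != in_w1:  # cue discriminates between the two warrants
            applicable += 1
            cue_side = 0 if in_w0 else 1
            if cue_side == ex["label"]:
                productive += 1
    productivity = productive / applicable if applicable else 0.0
    coverage = applicable / len(examples) if examples else 0.0
    return applicable, productivity, coverage


# Toy usage with two made-up examples:
toy = [
    {"warrant0": "people do not change", "warrant1": "people change", "label": 0},
    {"warrant0": "it is useful", "warrant1": "it is not useful", "label": 0},
]
print(cue_statistics(toy, cue="not"))
```

A productivity well above chance with high coverage would indicate a cue a model can exploit without any argument comprehension, which is the effect the adversarial dataset in the paper is designed to neutralize.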

Original language: English
Title of host publication: ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 4658-4664
Number of pages: 7
ISBN (Electronic): 9781950737482
Publication status: Published - 2020 Jan 1
Event: 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019 - Florence, Italy
Duration: 2019 Jul 28 - 2019 Aug 2

Publication series

Name: ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

Conference

Conference: 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019
Country: Italy
City: Florence
Period: 19-07-28 - 19-08-02

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Computer Science(all)
  • Linguistics and Language
