TY - GEN
T1 - Probing neural network comprehension of natural language arguments
AU - Niven, Timothy
AU - Kao, Hung Yu
N1 - Publisher Copyright:
© 2019 Association for Computational Linguistics
PY - 2020
Y1 - 2020
N2 - We are surprised to find that BERT's peak performance of 77% on the Argument Reasoning Comprehension Task reaches just three points below the average untrained human baseline. However, we show that this result is entirely accounted for by exploitation of spurious statistical cues in the dataset. We analyze the nature of these cues and demonstrate that a range of models all exploit them. This analysis informs the construction of an adversarial dataset on which all models achieve random accuracy. Our adversarial dataset provides a more robust assessment of argument comprehension and should be adopted as the standard in future work.
AB - We are surprised to find that BERT's peak performance of 77% on the Argument Reasoning Comprehension Task reaches just three points below the average untrained human baseline. However, we show that this result is entirely accounted for by exploitation of spurious statistical cues in the dataset. We analyze the nature of these cues and demonstrate that a range of models all exploit them. This analysis informs the construction of an adversarial dataset on which all models achieve random accuracy. Our adversarial dataset provides a more robust assessment of argument comprehension and should be adopted as the standard in future work.
UR - http://www.scopus.com/inward/record.url?scp=85084006698&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85084006698&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85084006698
T3 - ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
SP - 4658
EP - 4664
BT - ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
PB - Association for Computational Linguistics (ACL)
T2 - 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019
Y2 - 28 July 2019 through 2 August 2019
ER -