Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses

Article Status
Published
Authors/contributors
Ryan Lowe, Michael Noseworthy, Iulian V. Serban, Nicolas Angelard-Gontier, Yoshua Bengio, Joelle Pineau
Title
Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses
Date
2017
Proceedings Title
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Conference Name
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Place
Vancouver, Canada
Publisher
Association for Computational Linguistics
Pages
1116–1126
Language
en
Short Title
Towards an Automatic Turing Test
Accessed
24/04/2024, 19:37
Library Catalogue
DOI.org (Crossref)
Extra
<AI Smry>: Presents an evaluation model (ADEM) that learns to predict human-like quality scores for input responses, trained on a new dataset of human response scores. The model's predictions correlate significantly with human judgements, at both the utterance and system level, and at a much higher level than word-overlap metrics such as BLEU.
Citation
Lowe, R., Noseworthy, M., Serban, I. V., Angelard-Gontier, N., Bengio, Y., & Pineau, J. (2017). Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1116–1126. https://doi.org/10.18653/v1/P17-1103