Bot-Adversarial Dialogue for Safe Conversational Agents
Article Status
Published
Authors/contributors
- Xu, Jing (Author)
- Ju, Da (Author)
- Li, Margaret (Author)
- Boureau, Y-Lan (Author)
- Weston, Jason (Author)
- Dinan, Emily (Author)
Title
Bot-Adversarial Dialogue for Safe Conversational Agents
Date
2021
Proceedings Title
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Conference Name
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Place
Online
Publisher
Association for Computational Linguistics
Pages
2950-2968
Language
en
Accessed
22/04/2024, 18:24
Library Catalogue
DOI.org (Crossref)
Extra
AI Summary: This work introduces a new human-and-model-in-the-loop framework for evaluating the toxicity of generative models, and proposes two methods for making conversational agents safer: training on data from the framework in a two-stage system, or "baking in" safety to the generative model itself.
Citation
Xu, J., Ju, D., Li, M., Boureau, Y.-L., Weston, J., & Dinan, E. (2021). Bot-Adversarial Dialogue for Safe Conversational Agents. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2950–2968. https://doi.org/10.18653/v1/2021.naacl-main.235
Technical methods
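The summary above describes a two-stage system in which a safety layer, trained on adversarially collected dialogue data, filters a generative model's responses. The sketch below is a minimal illustration of that pipeline shape only; the function names, the toy keyword-based classifier, and the topic-changing fallback are assumptions made for illustration, not the authors' implementation.

```python
# Minimal illustrative sketch of a two-stage safe dialogue pipeline.
# All names (generate_reply, classify_safety, NON_SEQUITURS) are
# hypothetical placeholders, not the paper's actual components.

import random

NON_SEQUITURS = [
    "Hey, do you want to talk about something else? How about cooking?",
    "Let's change the subject. Have you seen any good movies lately?",
]


def generate_reply(context: str) -> str:
    """Stage 1: stand-in for a generative dialogue model."""
    return "This is a placeholder response to: " + context


def classify_safety(utterance: str) -> bool:
    """Stage 2: stand-in for a safety classifier trained on
    adversarial dialogue data. Returns True if the utterance is safe."""
    blocked_terms = {"hate", "violence"}  # toy keyword check, for illustration only
    return not any(term in utterance.lower() for term in blocked_terms)


def safe_reply(context: str) -> str:
    """Two-stage pipeline: generate a reply, then filter it;
    fall back to a topic-changing response if it is flagged."""
    reply = generate_reply(context)
    if classify_safety(reply):
        return reply
    return random.choice(NON_SEQUITURS)


if __name__ == "__main__":
    print(safe_reply("Tell me a joke."))
```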