Bot-Adversarial Dialogue for Safe Conversational Agents

Xu, Jing; Ju, Da; Li, Margaret; Boureau, Y-Lan; Weston, Jason; Dinan, Emily

doi:10.18653/v1/2021.naacl-main.235

Article Status

Published

Authors/contributors

Xu, Jing (Author)
Ju, Da (Author)
Li, Margaret (Author)
Boureau, Y-Lan (Author)
Weston, Jason (Author)
Dinan, Emily (Author)

Title

Bot-Adversarial Dialogue for Safe Conversational Agents

Date

2021

Proceedings Title

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Conference Name

2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Place

Online

Publisher

Association for Computational Linguistics

Pages

2950-2968

Language

en

DOI

10.18653/v1/2021.naacl-main.235

URL

https://aclanthology.org/2021.naacl-main.235

Accessed

22/04/2024, 18:24

Library Catalogue

DOI.org (Crossref)

Extra

<AI Smry>: This work introduces a new human-and-model-in-the-loop framework for evaluating the toxicity of generative models, and proposes two novel methods for safe conversational agents by either training on data from the framework in a two-stage system, or “baking-in” safety to the generative model itself.

Citation

Xu, J., Ju, D., Li, M., Boureau, Y.-L., Weston, J., & Dinan, E. (2021). Bot-Adversarial Dialogue for Safe Conversational Agents. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2950–2968. https://doi.org/10.18653/v1/2021.naacl-main.235

Technical methods

model evaluation subgroup

Link to this record

https://aievidencehub.org/lib/QWTAS5AT