InfoLM: A New Metric to Evaluate Summarization & Data2Text Generation
Article Status
Published
Authors/contributors
- Colombo, Pierre Jean A. (Author)
- Clavel, Chloé (Author)
- Piantanida, Pablo (Author)
Title
InfoLM: A New Metric to Evaluate Summarization & Data2Text Generation
Abstract
Assessing the quality of natural language generation (NLG) systems through human annotation is very expensive. Additionally, human annotation campaigns are time-consuming and include non-reusable human labour. In practice, researchers rely on automatic metrics as a proxy of quality. In the last decade, many string-based metrics (e.g., BLEU or ROUGE) have been introduced. However, such metrics usually rely on exact matches and thus do not robustly handle synonyms. In this paper, we introduce InfoLM, a family of untrained metrics that can be viewed as string-based metrics that address the aforementioned flaws thanks to a pre-trained masked language model. This family of metrics also makes use of information measures, making it possible to adapt InfoLM to different evaluation criteria. Using direct assessment, we demonstrate that InfoLM achieves statistically significant improvements and two-figure correlation gains in many configurations compared to existing metrics on both summarization and data2text generation tasks.
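The abstract describes the core idea: replace exact string matching with a comparison of the fill-in probability distributions produced by a masked language model, using an information measure between the candidate's and reference's distributions. The sketch below is a minimal, hypothetical illustration of that idea only: the smoothed one-hot `toy_mlm_distribution` stands in for a real pretrained masked LM (e.g., BERT queried per masked position), and KL divergence stands in for the several information measures the paper considers. All names here are illustrative, not the authors' implementation.

```python
import math

# Toy vocabulary; a real masked LM would use its own subword vocabulary.
VOCAB = ["the", "cat", "sat", "mat", "dog", "rug"]

def toy_mlm_distribution(token, smoothing=0.1):
    """Stand-in for a masked LM's fill-in distribution at one position.
    A real InfoLM implementation would mask the position and query a
    pretrained model; here we return a smoothed one-hot distribution."""
    n = len(VOCAB)
    return [smoothing / n + (1.0 - smoothing) * (w == token) for w in VOCAB]

def bag_of_distributions(tokens):
    """Aggregate per-position distributions into one summary
    distribution over the vocabulary for the whole sentence."""
    dists = [toy_mlm_distribution(t) for t in tokens]
    return [sum(col) / len(dists) for col in zip(*dists)]

def kl_divergence(p, q):
    """KL(p || q) -- one example of an information measure; the paper
    explores a family of such measures, not only KL."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def infolm_score(candidate, reference):
    """Lower score = candidate's distribution is closer to the reference's."""
    p = bag_of_distributions(reference.split())
    q = bag_of_distributions(candidate.split())
    return kl_divergence(p, q)
```

Because the comparison happens in distribution space, a synonym that the masked LM assigns high probability to would incur a small penalty, whereas an exact-match metric like BLEU would score it as a complete miss.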
Publication
Proceedings of the AAAI Conference on Artificial Intelligence
Volume
36
Issue
10
Pages
10554-10562
Date
2022-06-28
Journal Abbr
AAAI
ISSN
2374-3468
Short Title
InfoLM
Accessed
12/06/2024, 18:18
Library Catalogue
DOI.org (Crossref)
Citation
Colombo, P. J. A., Clavel, C., & Piantanida, P. (2022). InfoLM: A New Metric to Evaluate Summarization & Data2Text Generation. Proceedings of the AAAI Conference on Artificial Intelligence, 36(10), 10554–10562. https://doi.org/10.1609/aaai.v36i10.21299