2 resources

  • Yang Liu, Dan Iter, Yichong Xu | May 23rd, 2023 | preprint

    The quality of texts generated by natural language generation (NLG) systems is hard to measure automatically. Conventional reference-based metrics, such as BLEU and ROUGE, have been shown to have relatively low correlation with human judgments, especially for tasks that require creativity and diversity. Recent studies suggest using large language models (LLMs) as reference-free metrics for NLG evaluation, which have the benefit of being applicable to new tasks that lack human references....
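
    The reference-free, LLM-as-judge setup this abstract describes can be pictured as prompting a model with a scoring rubric and reading off a score. Below is a minimal sketch of one common variant that weights each candidate score by the probability the model assigns to its token; the prompt wording and the score_token_probs input are hypothetical placeholders, not the paper's actual prompts or API.

        # Minimal sketch of probability-weighted scoring for reference-free,
        # LLM-based NLG evaluation. `score_token_probs` stands in for the
        # probabilities an LLM assigns to each score token after seeing a
        # rubric-style prompt; the prompt and probabilities here are made up.

        PROMPT = (
            "Rate the coherence of the following summary on a scale of 1-5.\n"
            "Source: {source}\nSummary: {summary}\nScore:"
        )

        def expected_score(score_token_probs):
            """Weight each candidate score by the model's probability for
            it, giving a finer-grained score than one sampled integer."""
            total = sum(score_token_probs.values())
            return sum(s * p for s, p in score_token_probs.items()) / total

        # Example: a (made-up) model that puts most of its mass on 4 and 5.
        probs = {1: 0.02, 2: 0.03, 3: 0.10, 4: 0.45, 5: 0.40}
        print(round(expected_score(probs), 2))  # -> 4.18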

  • Ming Zhong, Yang Liu, Da Yin | Oct 13th, 2022 | preprint

    Multi-dimensional evaluation is the dominant paradigm for human evaluation in Natural Language Generation (NLG), i.e., evaluating the generated text from multiple explainable dimensions, such as coherence and fluency. However, automatic evaluation in NLG is still dominated by similarity-based metrics, and we lack a reliable framework for a more comprehensive evaluation of advanced models. In this paper, we propose a unified multi-dimensional evaluator UniEval for NLG. We re-frame NLG...
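
    The multi-dimensional paradigm this abstract describes returns one score per explainable dimension (coherence, fluency, and so on) instead of a single similarity number. The sketch below illustrates only that interface shape; the per-dimension scoring functions are crude hypothetical stand-ins, not UniEval's learned evaluator.

        # Sketch of a multi-dimensional evaluator interface: one score per
        # explainable dimension rather than one similarity-based number.
        # Both scoring functions are hypothetical placeholder heuristics.

        def fluency_stub(text):
            # Placeholder proxy: longer sentences score as more fluent.
            sents = [s for s in text.split(".") if s.strip()]
            avg = sum(len(s.split()) for s in sents) / max(len(sents), 1)
            return min(avg / 20.0, 1.0)

        def coherence_stub(text):
            # Placeholder proxy: lexical overlap between adjacent sentences.
            sents = [set(s.lower().split()) for s in text.split(".") if s.strip()]
            if len(sents) < 2:
                return 1.0
            pairs = list(zip(sents, sents[1:]))
            return sum(len(a & b) / max(len(a | b), 1) for a, b in pairs) / len(pairs)

        DIMENSIONS = {"fluency": fluency_stub, "coherence": coherence_stub}

        def evaluate(text):
            """Return one score in [0, 1] per evaluation dimension."""
            return {name: fn(text) for name, fn in DIMENSIONS.items()}

        print(evaluate("The cat sat on the mat. The mat was warm and the cat slept."))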
