In authors or contributors

2 resources

  • Ziang Xiao, Susu Zhang, Vivian Lai
    |
    Oct 22nd, 2023
    |
    preprint
    Ziang Xiao, Susu Zhang, Vivian Lai
    Oct 22nd, 2023

    We address a fundamental challenge in Natural Language Generation (NLG) model evaluation -- the design and evaluation of evaluation metrics. Recognizing the limitations of existing automatic metrics and noises from how current human evaluation was conducted, we propose MetricEval, a framework informed by measurement theory, the foundation of educational test design, for conceptualizing and evaluating the reliability and validity of NLG evaluation metrics. The framework formalizes the source...

  • Ziang Xiao, Susu Zhang, Vivian Lai
    |
    Oct 22nd, 2023
    |
    preprint
    Ziang Xiao, Susu Zhang, Vivian Lai
    Oct 22nd, 2023

    We address a fundamental challenge in Natural Language Generation (NLG) model evaluation -- the design and evaluation of evaluation metrics. Recognizing the limitations of existing automatic metrics and noises from how current human evaluation was conducted, we propose MetricEval, a framework informed by measurement theory, the foundation of educational test design, for conceptualizing and evaluating the reliability and validity of NLG evaluation metrics. The framework formalizes the source...

Last update from database: 28/12/2024, 23:15 (UTC)
Powered by Zotero and Kerko.