4 resources

  • Yang Liu, Dan Iter, Yichong Xu
    |
    May 23rd, 2023
    |
    preprint
    Yang Liu, Dan Iter, Yichong Xu
    May 23rd, 2023

    The quality of texts generated by natural language generation (NLG) systems is hard to measure automatically. Conventional reference-based metrics, such as BLEU and ROUGE, have been shown to have relatively low correlation with human judgments, especially for tasks that require creativity and diversity. Recent studies suggest using large language models (LLMs) as reference-free metrics for NLG evaluation, which have the benefit of being applicable to new tasks that lack human references....

  • Jiaan Wang, Yunlong Liang, Fandong Meng,...
    |
    Apr 4th, 2023
    |
    journalArticle
    Jiaan Wang, Yunlong Liang, Fandong Meng,...
    Apr 4th, 2023

    Recently, the emergence of ChatGPT has attracted wide attention from the computational linguistics community. Many prior studies have shown that ChatGPT achieves remarkable performance on various NLP tasks in terms of automatic evaluation metrics. However, the ability of ChatGPT to serve as an evaluation metric is still underexplored. Considering assessing the quality of natural language generation (NLG) models is an arduous task and NLG metrics notoriously show their poor correlation with...

  • Jason Wei, Xuezhi Wang, Dale Schuurmans,...
    |
    Jan 10th, 2023
    |
    preprint
    Jason Wei, Xuezhi Wang, Dale Schuurmans,...
    Jan 10th, 2023

    We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning. In particular, we show how such reasoning abilities emerge naturally in sufficiently large language models via a simple method called chain of thought prompting, where a few chain of thought demonstrations are provided as exemplars in prompting. Experiments on three large language models show that chain of...

  • Lei Huang, Weijiang Yu, Weitao Ma
    |
    Nov 9th, 2023
    |
    preprint
    Lei Huang, Weijiang Yu, Weitao Ma
    Nov 9th, 2023

    The emergence of large language models (LLMs) has marked a significant breakthrough in natural language processing (NLP), leading to remarkable advancements in text understanding and generation. Nevertheless, alongside these strides, LLMs exhibit a critical tendency to produce hallucinations, resulting in content that is inconsistent with real-world facts or user inputs. This phenomenon poses substantial challenges to their practical deployment and raises concerns over the reliability of...

Last update from database: 04/04/2025, 19:15 (UTC)
Powered by Zotero and Kerko.