3 resources

  • Jason Wei, Xuezhi Wang, Dale Schuurmans,... | Jan 10th, 2023 | preprint

    We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning. In particular, we show how such reasoning abilities emerge naturally in sufficiently large language models via a simple method called chain of thought prompting, where a few chain of thought demonstrations are provided as exemplars in prompting. Experiments on three large language models show that chain of...

  • Jason Wei, Xuezhi Wang, Dale Schuurmans,... | Jan 10th, 2023 | preprint

    We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning. In particular, we show how such reasoning abilities emerge naturally in sufficiently large language models via a simple method called chain of thought prompting, where a few chain of thought demonstrations are provided as exemplars in prompting. Experiments on three large language models show that chain of...

  • Mirac Suzgun, Nathan Scales, Nathanael S... | Oct 28th, 2023 | preprint

    BIG-Bench (Srivastava et al., 2022) is a diverse evaluation suite that focuses on tasks believed to be beyond the capabilities of current language models. Language models have already made good progress on this benchmark, with the best model in the BIG-Bench paper outperforming average reported human-rater results on 65% of the BIG-Bench tasks via few-shot prompting. But on what tasks do language models fall short of average human-rater performance, and are those tasks actually unsolvable by...
