
1 resource

  • Mirac Suzgun, Nathan Scales, Nathanael S... | Oct 17th, 2022 | preprint

    BIG-Bench (Srivastava et al., 2022) is a diverse evaluation suite that focuses on tasks believed to be beyond the capabilities of current language models. Language models have already made good progress on this benchmark, with the best model in the BIG-Bench paper outperforming average reported human-rater results on 65% of the BIG-Bench tasks via few-shot prompting. But on what tasks do language models fall short of average human-rater performance, and are those tasks actually unsolvable by...
