2 resources

  • Mirac Suzgun, Nathan Scales, Nathanael S...
    |
    Oct 17th, 2022
    |
    preprint
    Mirac Suzgun, Nathan Scales, Nathanael S...
    Oct 17th, 2022

    BIG-Bench (Srivastava et al., 2022) is a diverse evaluation suite that focuses on tasks believed to be beyond the capabilities of current language models. Language models have already made good progress on this benchmark, with the best model in the BIG-Bench paper outperforming average reported human-rater results on 65% of the BIG-Bench tasks via few-shot prompting. But on what tasks do language models fall short of average human-rater performance, and are those tasks actually unsolvable by...

  • Rohan Anil, Andrew M. Dai, Orhan Firat
    |
    May 17th, 2023
    |
    preprint
    Rohan Anil, Andrew M. Dai, Orhan Firat
    May 17th, 2023

    We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is a Transformer-based model trained using a mixture of objectives. Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more...

Last update from database: 22/10/2025, 00:15 (UTC)
Powered by Zotero and Kerko.