1 resource

  • Tejal Patwardhan, Rachel Dias, Elizabeth... | Oct 5th, 2025 | preprint

    We introduce GDPval, a benchmark evaluating AI model capabilities on real-world economically valuable tasks. GDPval covers the majority of U.S. Bureau of Labor Statistics Work Activities for 44 occupations across the top 9 sectors contributing to U.S. GDP (Gross Domestic Product). Tasks are constructed from the representative work of industry professionals with an average of 14 years of experience. We find that frontier model performance on GDPval is improving roughly linearly over time, and...

Last update from database: 27/10/2025, 18:15 (UTC)