  • Imran Chamieh, Torsten Zesch, Klaus Gieb... | Jun 22nd, 2024 | conferencePaper

    In this work, we investigate the potential of Large Language Models (LLMs) for automated short answer scoring. We test zero-shot and few-shot settings and compare them with fine-tuned models and a supervised upper bound across three diverse datasets. Our results show that LLMs perform poorly in zero-shot and few-shot settings: they have difficulty with tasks that require complex reasoning or domain-specific knowledge. While the models show promise on general knowledge tasks....
