2 resources

  • Jill Burstein, Geoffrey T. LaFlair, Kevi... | Aug 28th, 2024 | preprint

    Artificial intelligence (AI) creates opportunities for assessments, such as efficiencies for item generation and scoring of spoken and written responses. At the same time, it poses risks (such as bias in AI-generated item content). Responsible AI (RAI) practices aim to mitigate risks associated with AI. This chapter addresses the critical role of RAI practices in achieving test quality (appropriateness of test score inferences), and test equity (fairness to all test takers). To illustrate,...

  • Imran Chamieh, Torsten Zesch, Klaus Gieb... | Jun 16th, 2024 | conference paper

    In this work, we investigate the potential of Large Language Models (LLMs) for automated short answer scoring. We test zero-shot and few-shot settings and compare them with fine-tuned models and a supervised upper bound across three diverse datasets. Our results show that LLMs perform poorly in zero-shot and few-shot settings: they have difficulty with tasks that require complex reasoning or domain-specific knowledge. While the models show promise on general knowledge tasks...
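
    As a rough illustration of the zero-shot setting described in this abstract (not the authors' actual prompts, models, or datasets), the sketch below asks an LLM to grade a single short answer against a reference answer. It assumes an OpenAI-style chat completion API; the model name, rubric, and prompt wording are placeholders.

        # Illustrative sketch only: zero-shot short answer scoring with an LLM.
        # The model name, rubric, and prompt wording are assumptions, not taken
        # from the paper.
        from openai import OpenAI

        client = OpenAI()  # reads OPENAI_API_KEY from the environment

        def score_short_answer(question: str, reference: str, answer: str) -> str:
            """Zero-shot: no graded examples are included in the prompt."""
            prompt = (
                "You are grading a short answer.\n"
                f"Question: {question}\n"
                f"Reference answer: {reference}\n"
                f"Student answer: {answer}\n"
                "Reply with a single integer score: 0 (wrong), 1 (partial), or 2 (correct)."
            )
            response = client.chat.completions.create(
                model="gpt-4o-mini",  # placeholder model name
                messages=[{"role": "user", "content": prompt}],
                temperature=0,
            )
            return response.choices[0].message.content.strip()

        print(score_short_answer(
            "What does photosynthesis produce?",
            "Glucose and oxygen.",
            "It makes oxygen and sugar for the plant.",
        ))

    A few-shot variant would simply prepend a handful of already-graded question/answer pairs to the prompt; the paper compares both settings against fine-tuned models and a supervised upper bound.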
