Can AI Provide Useful Holistic Essay Scoring?
Article Status
Published
Authors/contributors
- Tate, Tamara (Author)
- Steiss, Jacob (Author)
- Bailey, Drew (Author)
- Graham, Steve (Author)
- Ritchie, Daniel (Author)
- Tseng, Waverly (Author)
- Moon, Youngsun (Author)
- Warschauer, Mark (Author)
Title
Can AI Provide Useful Holistic Essay Scoring?
Abstract
Researchers have sought for decades to automate holistic essay scoring. Over the years, these programs have improved significantly. However, accuracy requires significant amounts of training on human-scored texts—reducing the expediency and usefulness of such programs for routine use by teachers across the nation on non-standardized prompts. This study analyzes the output of multiple versions of ChatGPT scoring secondary student essays from three extant corpora and compares it to high-quality human ratings. We find that the current iteration of ChatGPT scoring is not statistically significantly different from human scoring, but exact agreement with humans remains difficult. Consistency and agreement within one point, however, are achievable and may be sufficient for low-stakes, formative assessment purposes.
Repository
OSF
Date
2023-12-05
Accessed
05/06/2024, 09:43
Language
en-us
Library Catalogue
OSF Preprints
Citation
Tate, T., Steiss, J., Bailey, D., Graham, S., Ritchie, D., Tseng, W., Moon, Y., & Warschauer, M. (2023). Can AI Provide Useful Holistic Essay Scoring? OSF. https://doi.org/10.31219/osf.io/7xpre