Using convolutional neural networks to automatically score eight TIMSS 2019 graphical response items

Article Status
Published
Authors/contributors
Tyack, L.; Khorramdel, L.; von Davier, M.
Title
Using convolutional neural networks to automatically score eight TIMSS 2019 graphical response items
Abstract
International large-scale assessments (ILSAs) have used graphical response-based items to measure student ability for decades, but they have yet to implement automated scoring of these responses and instead rely on human scoring alone. To investigate how scores provided by machine algorithms compare to those provided by human raters, we applied convolutional neural networks (CNNs) to classify image-based responses from eight TIMSS 2019 items. Our results show that the most accurate CNN models classified over 99% of the image responses into the appropriate scoring category for dichotomous items and almost 98% for one trichotomous item. Additionally, during the modeling process, the CNNs correctly classified numerous image responses that human raters had scored incorrectly. For most items, the number of incorrectly human-scored responses exceeded the average number of responses misclassified by the most accurate models. These results suggest that automated scoring using CNNs is comparable to, and in many cases more accurate than, scoring by human raters, even across a wide variety of graphing tasks. This paper argues that the machine learning procedure explored could be implemented in ILSAs as a verification method to improve the accuracy and consistency of graphical response item scores. In lieu of additional human raters, ILSAs could implement CNN-based automated scoring to provide a second set of scores, thus reducing the workload and costs associated with human scoring.
Publication
Computers and Education: Artificial Intelligence
Volume
6
Pages
100249
Date
2024-06
Journal Abbr
Comput. Educ.: Artif. Intell.
Language
en
ISSN
2666-920X
Accessed
19/06/2024, 08:20
Library Catalogue
ScienceDirect
Citation
Tyack, L., Khorramdel, L., & von Davier, M. (2024). Using convolutional neural networks to automatically score eight TIMSS 2019 graphical response items. Computers and Education: Artificial Intelligence, 6, 100249. https://doi.org/10.1016/j.caeai.2024.100249