NLP-Based Management of Large Multiple-Choice Test Item Repositories
Article Status
Published
Authors/contributors
- Albano, Valentina (Author)
- Firmani, Donatella (Author)
- Laura, Luigi (Author)
- Mathew, Jerin George (Author)
- Paoletti, Anna Lucia (Author)
- Torrente, Irene (Author)
Title
NLP-Based Management of Large Multiple-Choice Test Item Repositories
Abstract
Multiple-choice questions (MCQs) are widely used in educational assessments and professional certification exams. Managing large repositories of MCQs, however, poses several challenges due to the high volume of questions and the need to maintain their quality and relevance over time. One of these challenges is the presence of questions that duplicate concepts but are formulated differently. Such questions can indeed elude syntactic controls but provide no added value to the repository.
In this paper, we focus on this specific challenge and propose a workflow for the discovery and management of potential duplicate questions in large MCQ repositories. The workflow comprises three main steps: MCQ preprocessing, similarity computation, and a graph-based exploration and analysis of the resulting similarity values. For the preprocessing phase, we consider three main strategies: (i) removing the list of candidate answers from each question, (ii) augmenting each question with the correct answer, or (iii) augmenting each question with all candidate answers. Then, we use deep learning natural language processing (NLP) techniques built on the Transformer architecture to compute semantic similarities between MCQs. Finally, we propose a new approach to graph exploration, based on graph communities, to analyze the similarities and relationships between MCQs in the graph. We illustrate the approach with a case study of the Competenze Digitali program, a large-scale assessment project by the Italian government.
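The three-step workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `MCQ` class, the strategy names, and the `duplicate_groups` helper are hypothetical, a simple token-overlap (Jaccard) score stands in for the Transformer-based semantic similarity, and connected components of the thresholded similarity graph stand in for the graph-community analysis.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class MCQ:
    """Hypothetical container for one multiple-choice question."""
    stem: str
    options: list[str]
    correct: int  # index of the correct option


def preprocess(q: MCQ, strategy: str) -> str:
    """Build the text to compare, following one of the three strategies."""
    if strategy == "stem_only":          # (i) drop the candidate answers
        return q.stem
    if strategy == "stem_plus_correct":  # (ii) append the correct answer
        return f"{q.stem} {q.options[q.correct]}"
    if strategy == "stem_plus_all":      # (iii) append every candidate answer
        return " ".join([q.stem, *q.options])
    raise ValueError(f"unknown strategy: {strategy}")


def similarity(a: str, b: str) -> float:
    """Stand-in score: Jaccard overlap of lowercase tokens.

    Assumption: a real pipeline would compare Transformer sentence
    embeddings (e.g. by cosine similarity) instead.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def duplicate_groups(questions: list[MCQ], strategy: str, threshold: float) -> list[list[int]]:
    """Group question indices that are linked in the similarity graph."""
    texts = [preprocess(q, strategy) for q in questions]

    # Similarity graph: connect i and j when their score reaches the threshold.
    adjacency = {i: set() for i in range(len(texts))}
    for i, j in combinations(range(len(texts)), 2):
        if similarity(texts[i], texts[j]) >= threshold:
            adjacency[i].add(j)
            adjacency[j].add(i)

    # Connected components via depth-first search (a simplification of
    # community detection, used here only to keep the sketch dependency-free).
    seen, groups = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(component))
    return groups
```

Groups with more than one index are candidate duplicates for manual review. The choice of preprocessing strategy changes which questions end up linked, since two items with the same stem but different answer options score lower once the options are appended.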
Publication
Journal of Learning Analytics
Volume
10
Issue
3
Pages
28-44
Date
2023-12-15
Journal Abbr
Learning Analytics
ISSN
1929-7750
Accessed
22/01/2024, 18:59
Library Catalogue
DOI.org (Crossref)
Extra
Citation Key: albano2023
<Title (translated from Chinese)>: Management of Large Multiple-Choice Question Banks Based on Natural Language Processing
<AI Smry>: This paper proposes a workflow for the discovery and management of potential duplicate questions in large MCQ repositories, and uses deep learning–based natural language processing (NLP) techniques, based on the Transformers architecture, to compute similarities between MCQs based on semantics.
Citation
Albano, V., Firmani, D., Laura, L., Mathew, J. G., Paoletti, A. L., & Torrente, I. (2023). NLP-Based Management of Large Multiple-Choice Test Item Repositories. Journal of Learning Analytics, 10(3), 28–44. https://doi.org/10.18608/jla.2023.7897