Finding Words Associated with DIF: Predicting Differential Item Functioning using LLMs and Explainable AI
Article Status
Published
Authors/contributors
- Maeda, Hotaka (Author)
- Lu, Yikai (Author)
Title
Finding Words Associated with DIF: Predicting Differential Item Functioning using LLMs and Explainable AI
Abstract
We fine-tuned and compared several encoder-based Transformer large language models (LLMs) to predict differential item functioning (DIF) from the item text. We then applied explainable artificial intelligence (XAI) methods to these models to identify specific words associated with DIF. The data included 42,180 items designed for English language arts and mathematics summative state assessments among students in grades 3 to 11. Prediction $R^2$ ranged from .04 to .32 among eight focal and reference group pairs. Our findings suggest that many words associated with DIF reflect minor sub-domains included in the test blueprint by design, rather than construct-irrelevant item content that should be removed from assessments. This may explain why qualitative reviews of DIF items often yield confusing or inconclusive results. Our approach can be used to screen words associated with DIF during the item-writing process for immediate revision, or help review traditional DIF analysis results by highlighting key words in the text. Extensions of this research can enhance the fairness of assessment programs, especially those that lack resources to build high-quality items, and among smaller subpopulations where we do not have sufficient sample sizes for traditional DIF analyses.
Repository
arXiv
Archive ID
arXiv:2502.07017
Date
2025-02-10
Citation Key
maeda2025
Accessed
24/02/2025, 17:42
Short Title
Finding Words Associated with DIF
Extra
arXiv:2502.07017 [cs]
<Title>: Finding Words Associated with DIF: Predicting Differential Item Functioning Using Large Language Models and Explainable AI
<AI Smry>: The findings suggest that many words associated with DIF reflect minor sub-domains included in the test blueprint by design, rather than construct-irrelevant item content that should be removed from assessments, which may explain why qualitative reviews of DIF items often yield confusing or inconclusive results.
Citation
Maeda, H., & Lu, Y. (2025). Finding Words Associated with DIF: Predicting Differential Item Functioning using LLMs and Explainable AI (arXiv:2502.07017). arXiv. https://doi.org/10.48550/arXiv.2502.07017