In authors or contributors

6 resources

  • Ehsan Latif, Xiaoming Zhai | Oct 28th, 2023 | journalArticle

    This study highlights the potential of fine-tuned ChatGPT (GPT-3.5) for automatically scoring student-written constructed responses using example assessment tasks in science education. Recent studies on OpenAI's generative model GPT-3.5 proved its superiority in predicting natural language with high accuracy and human-like responses. GPT-3.5 has been trained over enormous online language materials such as journals and Wikipedia; therefore, more than direct usage of pre-trained GPT-3.5 is...

  • Gyeong-Geon Lee, Ehsan Latif, Xuansheng ... | Oct 28th, 2023 | journalArticle

    This study investigates the application of large language models (LLMs), specifically GPT-3.5 and GPT-4, with Chain-of-Thought (CoT) in the automatic scoring of student-written responses to science assessments. We focused on overcoming the challenges of accessibility, technical complexity, and lack of explainability that have previously limited the use of artificial intelligence-based automatic scoring tools among researchers and educators. With a testing dataset comprising six assessment...

  • Gyeong-Geon Lee, Ehsan Latif, Xuansheng ... | Oct 28th, 2023 | journalArticle

    This study investigates the application of large language models (LLMs), specifically GPT-3.5 and GPT-4, with Chain-of-Thought (CoT) in the automatic scoring of student-written responses to science assessments. We focused on overcoming the challenges of accessibility, technical complexity, and lack of explainability that have previously limited the use of artificial intelligence-based automatic scoring tools among researchers and educators. With a testing dataset comprising six assessment...

  • Gyeong-Geon Lee, Ehsan Latif, Xuansheng ... | Jun 28th, 2024 | journalArticle

  • Gyeong-Geon Lee, Ehsan Latif, Xuansheng ... | Jun 28th, 2024 | preprint

    This study investigates the application of large language models (LLMs), specifically GPT-3.5 and GPT-4, with Chain-of-Thought (CoT) in the automatic scoring of student-written responses to science assessments. We focused on overcoming the challenges of accessibility, technical complexity, and lack of explainability that have previously limited the use of artificial intelligence-based automatic scoring tools among researchers and educators. With a testing dataset comprising six assessment...

  • Xuansheng Wu, Padmaja Pravin Saraf, Gyeo... | Feb 21st, 2025 | preprint

    Large language models (LLMs) have demonstrated strong potential in performing automatic scoring for constructed response assessments. While constructed responses graded by humans are usually based on given grading rubrics, the methods by which LLMs assign scores remain largely unclear. It is also uncertain how closely AI's scoring process mirrors that of humans or if it adheres to the same grading criteria. To address this gap, this paper uncovers the grading rubrics that LLMs used to score...
