Search

In authors or contributors

"Latif, Ehsan"

6 resources

Fine-tuning ChatGPT for Automatic Scoring

Ehsan Latif, Xiaoming Zhai | Mar 17th, 2023 | journalArticle

This study highlights the potential of fine-tuned ChatGPT (GPT-3.5) for automatically scoring student written constructed responses using example assessment tasks in science education. Recent studies on OpenAI's generative model GPT-3.5 proved its superiority in predicting the natural language with high accuracy and human-like responses. GPT-3.5 has been trained over enormous online language materials such as journals and Wikipedia; therefore, more than direct usage of pre-trained GPT-3.5 is...
Applying Large Language Models and Chain-of-Thought for Automatic Scoring

Gyeong-Geon Lee, Ehsan Latif, Xuansheng ... | Mar 17th, 2023 | journalArticle

This study investigates the application of large language models (LLMs), specifically GPT-3.5 and GPT-4, with Chain-of-Thought (CoT) in the automatic scoring of student-written responses to science assessments. We focused on overcoming the challenges of accessibility, technical complexity, and lack of explainability that have previously limited the use of artificial intelligence-based automatic scoring tools among researchers and educators. With a testing dataset comprising six assessment...
Applying Large Language Models and Chain-of-Thought for Automatic Scoring

Gyeong-Geon Lee, Ehsan Latif, Xuansheng ... | Jun 17th, 2024 | preprint

This study investigates the application of large language models (LLMs), specifically GPT-3.5 and GPT-4, with Chain-of-Thought (CoT) in the automatic scoring of student-written responses to science assessments. We focused on overcoming the challenges of accessibility, technical complexity, and lack of explainability that have previously limited the use of artificial intelligence-based automatic scoring tools among researchers and educators. With a testing dataset comprising six assessment...
Applying Large Language Models and Chain-of-Thought for Automatic Scoring

Gyeong-Geon Lee, Ehsan Latif, Xuansheng ... | Mar 17th, 2023 | journalArticle

This study investigates the application of large language models (LLMs), specifically GPT-3.5 and GPT-4, with Chain-of-Thought (CoT) in the automatic scoring of student-written responses to science assessments. We focused on overcoming the challenges of accessibility, technical complexity, and lack of explainability that have previously limited the use of artificial intelligence-based automatic scoring tools among researchers and educators. With a testing dataset comprising six assessment...
Applying large language models and chain-of-thought for automatic scoring

Gyeong-Geon Lee, Ehsan Latif, Xuansheng ... | Jun 17th, 2024 | journalArticle

Unveiling Scoring Processes: Dissecting the Differences between LLMs and Human Graders in Automatic Scoring

Xuansheng Wu, Padmaja Pravin Saraf, Gyeo... | Feb 21st, 2025 | preprint

Large language models (LLMs) have demonstrated strong potential in performing automatic scoring for constructed response assessments. While constructed responses graded by humans are usually based on given grading rubrics, the methods by which LLMs assign scores remain largely unclear. It is also uncertain how closely AI's scoring process mirrors that of humans or if it adheres to the same grading criteria. To address this gap, this paper uncovers the grading rubrics that LLMs used to score...

Last update from database: 17/03/2026, 19:15 (UTC)