28 resources
- Shashank Sonkar, Naiming Liu, Debshila M... | Jan 22nd, 2023 | conference paper
- Jinlan Fu, See-Kiong Ng, Zhengbao Jiang,... | Jan 22nd, 2023 | journal article
Generative Artificial Intelligence (AI) has enabled the development of sophisticated models that are capable of producing high-caliber text, images, and other outputs through the utilization of large pre-trained models. Nevertheless, assessing the quality of the generation is an even more arduous task than the generation itself, and this issue has not been given adequate consideration recently. This paper proposes a novel evaluation framework, GPTScore, which utilizes the emergent abilities...
- Gyeong-Geon Lee, Ehsan Latif, Xuansheng ... | Jan 22nd, 2023 | journal article
This study investigates the application of large language models (LLMs), specifically GPT-3.5 and GPT-4, with Chain-of-Thought (CoT) in the automatic scoring of student-written responses to science assessments. We focused on overcoming the challenges of accessibility, technical complexity, and lack of explainability that have previously limited the use of artificial intelligence-based automatic scoring tools among researchers and educators. With a testing dataset comprising six assessment...
- Annenberg Institute at Brown... | Jun 3rd, 2023 | report
Providing consistent, individualized feedback to teachers is essential for improving instruction but can be prohibitively resource-intensive in most educational contexts. We develop M-Powering Teachers, an automated tool based on natural language processing to give teachers feedback on their uptake of student contributions, a high-leverage dialogic teaching practice that makes students feel heard. We conduct a randomized controlled trial in an online computer science course (n=1,136...
- Yang Liu, Dan Iter, Yichong Xu | Jan 22nd, 2023 | preprint
The quality of texts generated by natural language generation (NLG) systems is hard to measure automatically. Conventional reference-based metrics, such as BLEU and ROUGE, have been shown to have relatively low correlation with human judgments, especially for tasks that require creativity and diversity. Recent studies suggest using large language models (LLMs) as reference-free metrics for NLG evaluation, which have the benefit of being applicable to new tasks that lack human references....
- Chi-Min Chan, Weize Chen, Yusheng Su | Aug 14th, 2023 | preprint
Text evaluation has historically posed significant challenges, often demanding substantial labor and time cost. With the emergence of large language models (LLMs), researchers have explored LLMs' potential as alternatives for human evaluation. While these single-agent-based approaches show promise, experimental results suggest that further advancements are needed to bridge the gap between their current effectiveness and human-level evaluation quality. Recognizing that best practices of human...
- Steven Moore, John Stamper, Richard Tong... | Jul 7th, 2023 | conference paper
- Andrew M. Olney, Steven Moore, John Stam... | Jul 7th, 2023 | conference paper
- Matyáš Boháček, Steven Moore, John Stamp... | Jul 7th, 2023 | conference paper
- Shashank Sonkar, Richard G. Baraniuk, St... | Jul 7th, 2023 | conference paper
- Md Rayhan Kabir, Fuhua Lin, Steven Moore... | Jul 7th, 2023 | conference paper
- Gautam Yadav, Ying-Jui Tseng, Xiaolin Ni... | Jul 7th, 2023 | conference paper
- Qianou Christina Ma, Sherry Tongshuang W... | Jul 7th, 2023 | conference paper
- Benjamin D. Nye, Dillon Mee, Mark G. Cor... | Jul 7th, 2023 | conference paper
- Shouvik Ahmed Antu, Haiyan Chen, Cindy K... | Jul 7th, 2023 | conference paper
- Bor-Chen Kuo, Frederic T. Y. Chang, Zong... | Jul 7th, 2023 | conference paper
- Daniel Leiker, Sara Finnigan, Ashley Ric... | Jul 7th, 2023 | conference paper