161 resources
-
Guher Gorgun, Okan Bulut|Dec 19th, 2024|journalArticle
Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet, the evaluation of item quality persists to be a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large‐language models, specifically Llama 3‐8B, for evaluating automatically generated cloze items. The trained large‐language model was able to filter out majority of good and bad items accurately....
-
Yizhou Fan, Luzhen Tang, Huixiao Le|Dec 10th, 2024|journalArticle
With the continuous development of technological and educational innovation, learners nowadays can obtain a variety of supports from agents such as teachers, peers, education technologies, and recently, generative artificial intelligence such as ChatGPT. In particular, there has been a surge of academic interest in human‐AI collaboration and hybrid intelligence in learning. The concept of hybrid intelligence is still at a nascent stage, and how learners can benefit from a symbiotic...
-
Rose E Wang, Ana T Ribeiro, Carly D Robi...|Nov 25th, 2024|journalArticle
-
Andrew Runge, Yigal Attali, Geoffrey T. ...|Nov 4th, 2024|journalArticle
Assessments of interactional competence have traditionally been limited in large-scale language assessments. The listening portion suffers from construct underrepresentation, whereas the speaking portion suffers from limited task formats such as in-person interviews or role plays. Human-delivered tasks are challenging to administer at large scales, while automated assessments are typically very narrow in their assessment of the construct because they have carried...
-
Renzhe Yu, Zhen Xu, Sky CH-Wang|Nov 2nd, 2024|preprint
The universal availability of ChatGPT and other similar tools since late 2022 has prompted tremendous public excitement and experimental effort about the potential of large language models (LLMs) to improve learning experience and outcomes, especially for learners from disadvantaged backgrounds. However, little research has systematically examined the real-world impacts of LLM availability on educational equity beyond theoretical projections and controlled studies of innovative LLM...
-
Junyi Li, Tianyi Tang, Wayne Xin Zhao|Oct 31st, 2024|journalArticle
Text Generation aims to produce plausible and readable text in human language from input data. The resurgence of deep learning has greatly advanced this field, in particular, with the help of neural generation models based on pre-trained language models (PLMs). Text generation based on PLMs is viewed as a promising approach in both academia and industry. In this article, we provide a survey on the utilization of PLMs in text generation. We begin with introducing two key aspects of applying...
-
Oct 26th, 2024|journalArticle
-
Apollo - University of Cambr...|Oct 23rd, 2024|report
We present a new annotated corpus of written learner English, derived from essays submitted to the learning platform Write & Improve (W&I). Users of W&I are presented with automated scoring and feedback on grammatical errors, and are encouraged to act on their error feedback, submitting multiple versions of their essays for any given prompt. We build the corpus on this interplay between users and prompts, collecting sets of essays submitted by users for a selected list of 50...
-
Yavuz Selim Kıyak, Emre Emekli|Oct 18th, 2024|journalArticle
ChatGPT’s role in creating multiple-choice questions (MCQs) is growing but the validity of these artificial-intelligence-generated questions is unclear. This literature review was conducted to address the urgent need for understanding the application of ChatGPT in generating MCQs for medical education. Following the database search and screening of 1920 studies, we found 23 relevant studies. We extracted the prompts for MCQ generation and assessed the validity evidence...
-
Burcu Arslan, Blair Lehman, Caitlin Teni...|Oct 7th, 2024|journalArticle
In line with the positive effects of personalized learning, personalized assessments are expected to maximize learner motivation and engagement, allowing learners to show what they truly know and can do. Considering the advances in Generative Artificial Intelligence (GenAI), in this perspective article, we elaborate on the opportunities of integrating GenAI into personalized educational assessments to maximize learner engagement, performance, and access. We also draw attention to the...
-
Hotaka Maeda|Oct 3rd, 2024|journalArticle
Field-testing is an essential yet often resource-intensive step in the development of high-quality educational assessments. I introduce an innovative method for field-testing newly written exam items by substituting human examinees with artificially intelligent (AI) examinees. The proposed approach is demonstrated using 466 four-option multiple-choice English grammar questions. Pre-trained transformer language models are fine-tuned based on the 2-parameter logistic (2PL) item response model...
-
Rose E. Wang, Ana T. Ribeiro, Carly D. R...|Oct 3rd, 2024|preprint
Generative AI, particularly Language Models (LMs), has the potential to transform real-world domains with societal impact, particularly where access to experts is limited. For example, in education, training novice educators with expert guidance is important for effectiveness but expensive, creating significant barriers to improving education quality at scale. This challenge disproportionately harms students from under-served communities, who stand to gain the most from high-quality...
-
Chung Kwan Lo, Khe Foon Hew, Morris Siu-...|Oct 1st, 2024|journalArticle
ChatGPT, a state-of-the-art artificial intelligence (AI) chatbot, has gained considerable attention as a transformative yet controversial tool for enhancing teaching and learning experiences. Several reviews and numerous articles have been written about harnessing ChatGPT in education since its release on November 30, 2022. Besides summarising its strengths, weaknesses, opportunities, and threats (SWOT) as identified in previous systematic reviews of ChatGPT research, this systematic review...
-
Loc Nguyen, Jessie S. Barrot|Oct 1st, 2024|journalArticle
-
Yujie Sun, Dongfang Sheng, Zihan Zhou|Sep 27th, 2024|journalArticle
-
Lara Lee Russell-Lasalandra, Alexander P...|Sep 12th, 2024|preprint
The rapid advancement of artificial intelligence (AI), particularly large language models (LLMs), has introduced powerful tools for various research domains, including psychological scale development. This study presents a fully automated method to efficiently generate and select high-quality, non-redundant items for psychological assessments using LLMs and network psychometrics. Our approach, called Automatic Item Generation and Validation via Network-Integrated Evaluation (AI-GENIE),...
-
Sep 3rd, 2024|webpage
A collection of Ai2’s evaluation frameworks and benchmarks, open and accessible to compare like-for-like outcomes.
-
Aug 29th, 2024|webpage
-
Aug 29th, 2024|webpage
The UK government announced a new project today that will enhance AI's ability to assist teachers in marking work and planning lessons.
-
Jill Burstein, Geoffrey T. LaFlair, Kevi...|Aug 28th, 2024|preprint
Artificial intelligence (AI) creates opportunities for assessments, such as efficiencies for item generation and scoring of spoken and written responses. At the same time, it poses risks (such as bias in AI-generated item content). Responsible AI (RAI) practices aim to mitigate risks associated with AI. This chapter addresses the critical role of RAI practices in achieving test quality (appropriateness of test score inferences), and test equity (fairness to all test takers). To illustrate,...