161 resources
-
Ben Williamson|Feb 22nd, 2024|blogPost
Over the past year or so, a narrative that AI will inevitably transform education has become widespread. You can find it in the pronouncements of investors, tech ind…
-
Vinu Sankar Sadasivan, Aounon Kumar, Sri...|Feb 19th, 2024|preprint
The unregulated use of LLMs can potentially lead to malicious consequences such as plagiarism, generating fake news, spamming, etc. Therefore, reliable detection of AI-generated text can be critical to ensure the responsible use of LLMs. Recent works attempt to tackle this problem either using certain model signatures present in the generated text outputs or by applying watermarking techniques that imprint specific patterns onto them. In this paper, we show that these detectors are not...
-
Peiyi Wang, Lei Li, Zhihong Shao|Feb 19th, 2024|preprint
In this paper, we present an innovative process-oriented math process reward model called Math-Shepherd, which assigns a reward score to each step of math problem solutions. The training of Math-Shepherd is achieved using automatically constructed process-wise supervision data, breaking the bottleneck of heavy reliance on manual annotation in existing work. We explore the effectiveness of Math-Shepherd in two scenarios: 1) Verification: Math-Shepherd is utilized for...
-
Siyuan Wang, Zhuohan Long, Zhihao Fan|Feb 17th, 2024|preprint
This paper presents a benchmark self-evolving framework to dynamically evaluate rapidly advancing Large Language Models (LLMs), aiming for a more accurate assessment of their capabilities and limitations. We utilize a multi-agent system to manipulate the context or question of original instances, reframing new evolving instances with high confidence that dynamically extend existing benchmarks. Towards a more scalable, robust and fine-grained evaluation, we implement six reframing operations...
-
Nischal Ashok Kumar, Andrew Lan|Feb 10th, 2024|preprint
In computer science education, test cases are an integral part of programming assignments since they can be used as assessment items to test students' programming knowledge and provide personalized feedback on student-written code. The goal of our work is to propose a fully automated approach for test case generation that can accurately measure student knowledge, which is important for two reasons. First, manually constructing test cases requires expert knowledge and is a labor-intensive...
-
Jan 29th, 2024|webpage
-
Jacob Doughty, Zipiao Wan, Anishka Bompe...|Jan 29th, 2024|conferencePaper
-
Masahiro Kaneko, Danushka Bollegala, Nao...|Jan 28th, 2024|preprint
There exist both scalable tasks, like reading comprehension and fact-checking, where model performance improves with model size, and unscalable tasks, like arithmetic reasoning and symbolic reasoning, where model performance does not necessarily improve with model size. Large language models (LLMs) equipped with Chain-of-Thought (CoT) prompting are able to make accurate incremental predictions even on unscalable tasks. Unfortunately, despite their exceptional reasoning abilities, LLMs tend...
-
Agariadne Dwinggo Samala, Xiaoming Zhai,...|Jan 25th, 2024|journalArticle
As technology progresses, there has been an increasing interest in using Chatbot GPT (Generative Pre-trained Transformer) in education. Chatbot GPT, or ChatGPT, gained one million users within the first week of launching in November 2022 and had amassed over 100 million active users by February 2023. This type of artificial intelligence uses natural language processing to converse with a user. This paper presents a comprehensive analysis and review of 34 articles published on ChatGPT and...
-
Hunter McNichols, Wanyong Feng, Jaewook ...|Jan 11th, 2024|preprint
Multiple-choice questions (MCQs) are ubiquitous in almost all levels of education since they are easy to administer, grade, and are a reliable form of assessment. An important aspect of MCQs is the distractors, i.e., incorrect options that are designed to target specific misconceptions or insufficient knowledge among students. To date, the task of crafting high-quality distractors has largely remained a labor-intensive process for teachers and learning content designers, which has limited...
-
S. M. Towhidul Islam Tonmoy, S. M. Mehed...|Jan 8th, 2024|preprint
As Large Language Models (LLMs) continue to advance in their ability to write human-like text, a key challenge remains around their tendency to hallucinate, generating content that appears factual but is ungrounded. This issue of hallucination is arguably the biggest hindrance to safely deploying these powerful LLMs into real-world production systems that impact people's lives. The journey toward widespread adoption of LLMs in practical settings heavily relies on addressing and mitigating...
-
Hassan Abuhassna, Fareed Awae, Mahyudin ...|Dec 1st, 2024|journalArticle
The integration of Artificial Intelligence (AI) and Machine Learning (ML) in education is a rapidly evolving field, yet the long-term implications and actual impacts on student learning outcomes require more in-depth study. To address this gap, our study offers a novel approach combining bibliometric analysis and a Systematic Literature Review (SLR), guided by the PRISMA methodology. The first phase, a comprehensive bibliometric analysis, identified key nations, educational institutions,...
-
Samah AlKhuzaey, Floriana Grasso, Terry ...|Sep 1st, 2024|journalArticle
Designing and constructing pedagogical tests that contain items (i.e. questions) which measure various types of skills for different levels of students equitably is a challenging task. Teachers and item writers alike need to ensure that the quality of assessment materials is consistent, if student evaluations are to be objective and effective. Assessment quality and validity are therefore heavily reliant on the quality of the...
-
Simone Balloccu, Patrícia Schmidtová, Ma...|Dec 1st, 2024|preprint
Natural Language Processing (NLP) research is increasingly focusing on the use of Large Language Models (LLMs), with some of the most popular ones being either fully or partially closed-source. The lack of access to model details, especially regarding training data, has repeatedly raised concerns about data contamination among researchers. Several attempts have been made to address this issue, but they are limited to anecdotal evidence and trial and error. Additionally, they overlook the...
-
Hamsa Bastani, Osbert Bastani, Alp Sungu...|Dec 1st, 2024|preprint
Generative artificial intelligence (AI) is poised to revolutionize how humans work, and has already demonstrated promise in significantly improving human productivity. However, a key remaining question is how generative AI affects learning, namely, how humans acquire new skills as they perform tasks. This kind of skill learning is critical to long-term productivity gains, especially in domains where generative AI is fallible and human experts must check its outputs. We study the impact of...
-
The Rise of Artificial Intelligence in Educational Measurement: Opportunities and Ethical Challenges
Okan Bulut, Maggie Beiting-Parrish, Jodi...|Dec 1st, 2024|preprint
The integration of artificial intelligence (AI) in educational measurement has revolutionized assessment methods, enabling automated scoring, rapid content analysis, and personalized feedback through machine learning and natural language processing. These advancements provide timely, consistent feedback and valuable insights into student performance, thereby enhancing the assessment experience. However, the deployment of AI in education also raises significant ethical concerns regarding...
-
Zheng Chu, Jingchang Chen, Qianglong Che...|Dec 1st, 2024|preprint
Reasoning, a fundamental cognitive process integral to human intelligence, has garnered substantial interest within artificial intelligence. Notably, recent studies have revealed that chain-of-thought prompting significantly enhances LLMs' reasoning capabilities, which has attracted widespread attention from both academia and industry. In this paper, we systematically investigate relevant research, summarizing advanced methods through a meticulous taxonomy that offers novel perspectives....
-
Scott Andrew Crossley, Perpetual Baffour...|Dec 1st, 2024|preprint
-
Hamza Fakhar, Mohammed Lamrabet, Nouredd...|Dec 1st, 2024|journalArticle
In recent years, the Artificial Intelligence (AI) field has witnessed rapid growth, affecting diverse sectors, including education. In this systematic review of literature, we aimed to analyze studies concerning the integration of AI in the continuous professional development (CPD) of teachers in order to generate a global vision of its potential to enhance the quality of CPD programs at the international level, and to provide recommendations for its application in the Moroccan context. To...