605 resources

  • Jerald Guiwan, Leeron Lacson, Michaella ...
    |
    May 6th, 2024
    |
    preprint

    In the Philippines, elementary students face a significant educational hurdle, particularly in Grade 3, where foundational competencies prove challenging to grasp. This research aims to provide a possible solution for this issue by developing and investigating the functionality of an electronic GLM-powered learner-oriented tool (EGLOT), designed to act as an educational companion that leverages a generative language model to personalize the learning experience for Grade 3 students in...

  • Joy He-Yueya, Noah D. Goodman, Emma Brun...
    |
    May 6th, 2024
    |
    preprint

    Creating effective educational materials generally requires expensive and time-consuming studies of student learning outcomes. To overcome this barrier, one idea is to build computational models of student learning and use them to optimize instructional materials. However, it is difficult to model the cognitive processes of learning dynamics. We propose an alternative approach that uses Language Models (LMs) as educational experts to assess the impact of various instructions on learning...

  • Hojung Kim, Changkyung Song, Jiyoung Kim...
    |
    May 6th, 2024
    |
    journalArticle

    This study presents a modified version of the Korean Elicited Imitation (EI) test, designed to resemble natural spoken language, and validates its reliability as a measure of proficiency. The study assesses the correlation between average test scores and Test of Proficiency in Korean (TOPIK) levels, examining score distributions among beginner, intermediate, and advanced learner groups. Using item response theory (IRT), the study explores the influence of four key facets—learners, items,...

  • Sunder Ali Khowaja, Parus Khuwaja, Kapal...
    |
    May 5th, 2024
    |
    journalArticle

    ChatGPT is another large language model (LLM) vastly available for the consumers on their devices but due to its performance and ability to converse effectively, it has gained a huge popularity amongst research as well as industrial community. Recently, many studies have been published to show the effectiveness, efficiency, integration, and sentiments of chatGPT and other LLMs. In contrast, this study focuses on the important aspects that are mostly overlooked, i.e....

  • Hugh Zhang, Jeff Da, Dean Lee
    |
    May 3rd, 2024
    |
    preprint

    Large language models (LLMs) have achieved impressive success on many benchmarks for mathematical reasoning. However, there is growing concern that some of this performance actually reflects dataset contamination, where data closely resembling benchmark questions leaks into the training data, instead of true reasoning ability. To investigate this claim rigorously, we commission Grade School Math 1000 (GSM1k). GSM1k is designed to mirror the style and complexity of the established GSM8k...

  • Ryan Heath
    |
    May 1st, 2024
    |
    webpage

    The Pentagon is hitting the brakes on the new technology even as business is charging forward.

  • Gloria Ashiya Katuka, Alexander Gain, Ye...
    |
    May 1st, 2024
    |
    preprint

    Automatic grading and feedback have been long studied using traditional machine learning and deep learning techniques using language models. With the recent accessibility to high performing large language models (LLMs) like LLaMA-2, there is an opportunity to investigate the use of these LLMs for automatic grading and feedback generation. Despite the increase in performance, LLMs require significant computational resources for fine-tuning and additional specific adjustments to enhance their...

  • Jaewook Lee, Digory Smith, Simon Woodhea...
    |
    May 1st, 2024
    |
    preprint

    Multiple choice questions (MCQs) are a popular method for evaluating students’ knowledge due to their efficiency in administration and grading. Crafting high-quality math MCQs is a labor-intensive process that requires educators to formulate precise stems and plausible distractors. Recent advances in large language models (LLMs) have sparked interest in automating MCQ creation, but challenges persist in ensuring mathematical accuracy and addressing student errors. This paper introduces a...

  • Mark D. Shermis, Joshua Wilson
    |
    May 1st, 2024
    |
    book

    "The Routledge International Handbook of Automated Essay Evaluation (AEE) is a definitive guide at the intersection of automation, artificial intelligence, and education. This volume encapsulates the ongoing advancement of AEE, reflecting its application in both large-scale and classroom-based assessments to support teaching and learning endeavours"--

  • Ishaan Watts, Varun Gumma, Aditya Yadava...
    |
    May 1st, 2024
    |
    journalArticle

    Evaluation of multilingual Large Language Models (LLMs) is challenging due to a variety of factors – the lack of benchmarks with sufficient linguistic diversity, contamination of popular benchmarks into LLM pre-training data and the lack of local, cultural nuances in translated benchmarks. Hence, it is difficult to do extensive evaluation of LLMs in the multilingual […]

  • Siddharth Dixit, Indermit S Gill
    |
    May 1st, 2024
    |
    report

    The study explores the potential benefits of implementing artificial intelligence (AI) in seven development sectors that receive significant funding from the World Bank. The study provides an overview of the challenges faced by these sectors, including agriculture, healthcare, education, finance, energy, infrastructure, and data. The findings reveal that AI can expedite the achievement of development goals in most of these sectors. The study shows that many organizations already utilize AI...

  • Rachmi Rachmi
    |
    Apr 30th, 2024
    |
    journalArticle
  • Danielle R. Thomas, Erin Gatz, Shivang G...
    |
    Apr 29th, 2024
    |
    preprint

    Incorporating human tutoring with AI holds promise for supporting diverse math learners. In the U.S., approximately 15% of students receive special education services, with limited previous research within AIED on the impact of AI-assisted learning among students with disabilities. Previous work combining human tutors and AI suggests that students with lower prior knowledge, such as lacking basic skills, exhibit greater learning gains compared to their more knowledgeable peers. Building upon...

  • Iason Gabriel, Arianna Manzini, Geoff Ke...
    |
    Apr 28th, 2024
    |
    preprint

    This paper focuses on the opportunities and the ethical and societal risks posed by advanced AI assistants. We define advanced AI assistants as artificial agents with natural language interfaces, whose function is to plan and execute sequences of actions on behalf of a user, across one or more domains, in line with the user's expectations. The paper starts by considering the technology itself, providing an overview of AI assistants, their technical foundations and potential range of...

  • Apr 25th, 2024
    |
    webpage

    Code for paper "G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment" - nlpyang/geval

  • Sankalan Pal Chowdhury, Vilém Zouhar, Mr...
    |
    Apr 25th, 2024
    |
    preprint

    Large Language Models (LLMs) have found several use cases in education, ranging from automatic question generation to essay evaluation. In this paper, we explore the potential of using Large Language Models (LLMs) to author Intelligent Tutoring Systems. A common pitfall of LLMs is their straying from desired pedagogical strategies such as leaking the answer to the student, and in general, providing no guarantees. We posit that while LLMs with certain guardrails can take the place of subject...

  • Allen Nie, Yash Chandak, Miroslav Suzara...
    |
    Apr 25th, 2024
    |
    preprint

    Large language models (LLMs) are quickly being adopted in a wide range of learning experiences, especially via ubiquitous and broadly accessible chat interfaces like ChatGPT and Copilot. This type of interface is readily available to students and teachers around the world, yet relatively little research has been done to assess the impact of such generic tools on student learning. Coding education is an interesting test case, both because LLMs have strong performance on coding tasks, and...

  • Nicolò Cosimo Albanese
    |
    Apr 20th, 2024
    |
    conferencePaper

    Ensuring fidelity to source documents is crucial for the responsible use of Large Language Models (LLMs) in Retrieval Augmented Generation (RAG) systems. We propose a lightweight method for real-time hallucination detection, with potential to be deployed as a model-agnostic microservice to bolster reliability. Using in-context learning, our approach evaluates response factuality at the sentence level without annotated data, promoting transparency and user trust. Compared to other...

  • RAND
    |
    Apr 17th, 2024
    |
    report
Last update from database: 01/12/2025, 17:15 (UTC)