702 resources

  • Jinlan Fu, See-Kiong Ng, Zhengbao Jiang,... | Oct 28th, 2023 | journalArticle

    Generative Artificial Intelligence (AI) has enabled the development of sophisticated models that are capable of producing high-caliber text, images, and other outputs through the utilization of large pre-trained models. Nevertheless, assessing the quality of the generation is an even more arduous task than the generation itself, and this issue has not been given adequate consideration recently. This paper proposes a novel evaluation framework, GPTScore, which utilizes the emergent abilities...
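
    The underlying mechanism of this style of LLM-based evaluation can be pictured as scoring a candidate text by the average log-probability a causal language model assigns to its tokens, conditioned on an instruction-style evaluation prompt. The snippet below is a minimal sketch under that assumption; GPT-2 stands in for the larger models, and the prompt wording is invented rather than taken from the paper.

      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer

      # GPT-2 is only a stand-in model for illustration.
      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      model = AutoModelForCausalLM.from_pretrained("gpt2")
      model.eval()

      def conditional_score(prompt: str, candidate: str) -> float:
          """Average log-probability of the candidate tokens given the prompt."""
          prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
          cand_ids = tokenizer(candidate, return_tensors="pt").input_ids
          input_ids = torch.cat([prompt_ids, cand_ids], dim=1)
          with torch.no_grad():
              logits = model(input_ids).logits
          # Position i predicts token i + 1, so shift and keep candidate positions.
          log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
          cand_log_probs = log_probs[:, prompt_ids.shape[1] - 1 :, :]
          token_lp = cand_log_probs.gather(2, cand_ids.unsqueeze(-1)).squeeze(-1)
          return token_lp.mean().item()

      # Hypothetical fluency-style evaluation prompt; higher scores are better.
      prompt = "Rewrite the sentence fluently.\nSentence: cat sat mat\nRewrite:"
      print(conditional_score(prompt, " The cat sat on the mat."))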

  • Isabel O. Gallegos, Ryan A. Rossi, Joe B... | Oct 28th, 2023 | preprint

    Rapid advancements of large language models (LLMs) have enabled the processing, understanding, and generation of human-like text, with increasing integration into systems that touch our social sphere. Despite this success, these models can learn, perpetuate, and amplify harmful social biases. In this paper, we present a comprehensive survey of bias evaluation and mitigation techniques for LLMs. We first consolidate, formalize, and expand notions of social bias and fairness in natural...

  • Lifeng Han, Gleb Erofeev, Irina Sorokina... | Oct 28th, 2023 | preprint

    Massively multilingual pre-trained language models (MMPLMs) have been developed in recent years, demonstrating strong capabilities and the pre-knowledge they acquire for downstream tasks. This work investigates whether MMPLMs can be applied to clinical domain machine translation (MT) towards entirely unseen languages via transfer learning. We carry out an experimental investigation using Meta-AI's MMPLMs "wmt21-dense-24-wide-en-X and X-en (WMT21fb)", which were pre-trained on 7 language pairs and 14...

  • Ehsan Latif, Xiaoming Zhai | Oct 28th, 2023 | journalArticle

    This study highlights the potential of fine-tuned ChatGPT (GPT-3.5) for automatically scoring student-written constructed responses using example assessment tasks in science education. Recent studies on OpenAI's generative model GPT-3.5 proved its superiority in predicting natural language with high accuracy and generating human-like responses. GPT-3.5 has been trained on enormous online language materials such as journals and Wikipedia; therefore, more than direct usage of pre-trained GPT-3.5 is...
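
    As a rough sketch of what such fine-tuning can look like in practice (not the study's actual data or pipeline), the snippet below builds chat-format JSONL training examples that pair a hypothetical assessment task and student response with an expert score, then submits a fine-tuning job through OpenAI's Python client.

      import json
      from openai import OpenAI

      # Hypothetical expert-scored examples; real data would come from the
      # assessment tasks and human raters described in the study.
      examples = [
          {"task": "Explain why ice floats on water.",
           "response": "Ice is less dense than liquid water.", "score": 2},
          {"task": "Explain why ice floats on water.",
           "response": "Because it is cold.", "score": 0},
      ]

      with open("scoring_train.jsonl", "w") as f:
          for ex in examples:
              record = {"messages": [
                  {"role": "system", "content": "Score the response from 0 to 2."},
                  {"role": "user",
                   "content": f"Task: {ex['task']}\nResponse: {ex['response']}"},
                  {"role": "assistant", "content": str(ex["score"])},
              ]}
              f.write(json.dumps(record) + "\n")

      client = OpenAI()  # assumes OPENAI_API_KEY is set
      uploaded = client.files.create(file=open("scoring_train.jsonl", "rb"),
                                     purpose="fine-tune")
      job = client.fine_tuning.jobs.create(training_file=uploaded.id,
                                           model="gpt-3.5-turbo")
      print(job.id)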

  • Gyeong-Geon Lee, Ehsan Latif, Xuansheng ... | Oct 28th, 2023 | journalArticle

    This study investigates the application of large language models (LLMs), specifically GPT-3.5 and GPT-4, with Chain-of-Thought (CoT) in the automatic scoring of student-written responses to science assessments. We focused on overcoming the challenges of accessibility, technical complexity, and lack of explainability that have previously limited the use of artificial intelligence-based automatic scoring tools among researchers and educators. With a testing dataset comprising six assessment...
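
    The prompting side of such a pipeline can be pictured as a rubric plus an explicit request for step-by-step reasoning before the final score. The sketch below makes that assumption with an invented rubric and the OpenAI chat API; it is not the authors' implementation.

      from openai import OpenAI

      client = OpenAI()  # assumes OPENAI_API_KEY is set

      # Hypothetical three-level rubric for a constructed-response item.
      RUBRIC = (
          "2: states the claim and supports it with correct evidence\n"
          "1: states the claim but the evidence is missing or flawed\n"
          "0: off-topic or scientifically incorrect"
      )

      def score_with_cot(task: str, student_response: str) -> str:
          prompt = (
              f"Assessment task:\n{task}\n\nRubric:\n{RUBRIC}\n\n"
              f"Student response:\n{student_response}\n\n"
              "Reason step by step about which rubric level fits, "
              "then end with a line 'Score: <0-2>'."
          )
          reply = client.chat.completions.create(
              model="gpt-4",
              messages=[{"role": "user", "content": prompt}],
              temperature=0,
          )
          return reply.choices[0].message.content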

  • Zachary Levonian, Chenglu Li, Wangda Zhu... | Oct 28th, 2023 | journalArticle

    For middle-school math students, interactive question-answering (QA) with tutors is an effective way to learn. The flexibility and emergent capabilities of generative large language models (LLMs) have led to a surge of interest in automating portions of the tutoring process - including interactive QA to support conceptual discussion of mathematical concepts. However, LLM responses to math questions can be incorrect or mismatched to the educational context - such as being misaligned with a...

  • Euan D Lindsay, Aditya Johri, Johannes B... | Oct 28th, 2023 | journalArticle

    Providing rich feedback to students is essential for supporting student learning. Recent advances in generative AI, particularly within large language modelling (LLM), provide the opportunity to deliver repeatable, scalable and instant automatically generated feedback to students, making abundant a previously scarce and expensive learning resource. Such an approach is feasible from a technical perspective due to these recent advances in Artificial Intelligence (AI) and Natural Language...

  • Yang Liu, Dan Iter, Yichong Xu | Oct 28th, 2023 | preprint

    The quality of texts generated by natural language generation (NLG) systems is hard to measure automatically. Conventional reference-based metrics, such as BLEU and ROUGE, have been shown to have relatively low correlation with human judgments, especially for tasks that require creativity and diversity. Recent studies suggest using large language models (LLMs) as reference-free metrics for NLG evaluation, which have the benefit of being applicable to new tasks that lack human references....
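
    As a rough illustration of the reference-free idea (not this paper's exact procedure), a chat model can be asked to rate a generated text against its source on a fixed scale; sampling several judgments and averaging the parsed scores reduces variance. The prompt wording below is invented.

      import re
      from openai import OpenAI

      client = OpenAI()  # assumes OPENAI_API_KEY is set

      def rate_coherence(source: str, summary: str, n_samples: int = 5) -> float:
          prompt = (
              "Rate the coherence of the summary with respect to the source on a "
              "scale from 1 (incoherent) to 5 (fully coherent). "
              "Answer with a single number.\n\n"
              f"Source:\n{source}\n\nSummary:\n{summary}"
          )
          reply = client.chat.completions.create(
              model="gpt-4",
              messages=[{"role": "user", "content": prompt}],
              temperature=1.0,
              n=n_samples,
          )
          scores = [int(m.group()) for c in reply.choices
                    if (m := re.search(r"[1-5]", c.message.content))]
          return sum(scores) / len(scores) if scores else float("nan")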

  • Weicheng Ma, Henry Scheible, Brian Wang,... | Oct 28th, 2023 | conferencePaper

    Warning: This paper contains content that is stereotypical and may be upsetting. This paper addresses the issue of demographic stereotypes present in Transformer-based pre-trained language models (PLMs) and aims to deepen our understanding of how these biases are encoded in these models. To accomplish this, we introduce an easy-to-use framework for examining the stereotype-encoding behavior of PLMs through a combination of model probing and textual analyses. Our findings reveal that a small...
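
    A much simpler probe in the same spirit (not the framework introduced in this paper) compares a masked language model's fill-in predictions for templates that differ only in a demographic term; the template and model below are illustrative.

      from transformers import pipeline

      # bert-base-uncased is an illustrative PLM; any masked LM would do.
      fill = pipeline("fill-mask", model="bert-base-uncased")

      templates = [
          "The man worked as a [MASK].",
          "The woman worked as a [MASK].",
      ]

      # Diverging top completions across templates hint at encoded stereotypes.
      for template in templates:
          top = fill(template, top_k=5)
          print(template, [(r["token_str"], round(r["score"], 3)) for r in top])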

  • Nick McKenna, Tianyi Li, Liang Cheng | Oct 28th, 2023 | preprint

    Large Language Models (LLMs) are claimed to be capable of Natural Language Inference (NLI), necessary for applied tasks like question answering and summarization. We present a series of behavioral studies on several LLM families (LLaMA, GPT-3.5, and PaLM) which probe their behavior using controlled experiments. We establish two biases originating from pretraining which predict much of their behavior, and show that these are major sources of hallucination in generative LLMs. First,...

  • Hunter McNichols, Wanyong Feng, Jaewook ... | Oct 28th, 2023 | journalArticle

    Multiple-choice questions (MCQs) are ubiquitous in almost all levels of education since they are easy to administer and grade, and are a reliable format in both assessments and practices. An important aspect of MCQs is the distractors, i.e., incorrect options that are designed to target specific misconceptions or insufficient knowledge among students. To date, the task of crafting high-quality distractors has largely remained a labor-intensive process for teachers and learning content...

  • Fengchun Miao, Wayne Holmes | Oct 28th, 2023 | book
  • Ethan R. Mollick, Lilach Mollick | Oct 28th, 2023 | preprint

    This paper provides guidance for using AI to quickly and easily implement evidence-based teaching strategies that instructors can integrate into their teaching. We discuss five teaching strategies that have proven value but are hard to implement in practice due to time and effort constraints. We show how AI can help instructors create material that supports these strategies and improve student learning. The strategies include providing multiple examples and explanations; uncovering and...

  • Steven Moore, Huy A. Nguyen, Tianying Ch... | Oct 28th, 2023 | bookSection
  • Arun Balajiee Lekshmi Narayanan, Ligia E... | Oct 28th, 2023 | journalArticle

    When caregivers ask open-ended questions to motivate dialogue with children, it facilitates the child's reading comprehension skills. Although there is scope for using technological tools, referred to here as "intelligent tutoring systems", to scaffold this process, it is currently unclear whether existing intelligent systems that generate human-language-like questions are beneficial. Additionally, training data used in the development of these automated question generation systems is...

  • Arun Balajiee Lekshmi Narayanan, Rully A... | Oct 28th, 2023 | journalArticle

    Digital textbooks have become an integral part of everyday learning tasks. In this work, we consider the use of digital textbooks for programming classes. Generally, students struggle to make the most of programming textbooks, a possible reason being that the example programs provided to illustrate concepts in these textbooks don't offer sufficient interactivity, and are therefore not sufficiently motivating for students to explore or understand these programming examples...

  • Andy Nguyen, Ha Ngan Ngo, Yvonne Hong | Apr 28th, 2023 | journalArticle

    The advancement of artificial intelligence in education (AIED) has the potential to transform the educational landscape and influence the role of all involved stakeholders. In recent years, the applications of AIED have been gradually adopted to progress our understanding of students’ learning and enhance learning performance and experience. However, the adoption of AIED has led to increasing ethical risks and concerns regarding several aspects such as personal data and...

Last update from database: 28/10/2025, 20:15 (UTC)