702 resources

  • Leonora Kaldaras, Nicholas R. Yoshida, K...
    Nov 25th, 2022
    journalArticle

    The Framework for K-12 Science Education (the Framework) and the Next Generation Science Standards (NGSS) define three dimensions of science: disciplinary core ideas, scientific and engineering practices, and crosscutting concepts, and emphasize the integration of the three dimensions (3D) to reflect deep science understanding. The Framework also emphasizes the importance of using learning progressions (LPs) as roadmaps to guide assessment development. These assessments capable of measuring...

  • Marianne Engen Matre, David Lansing Came...
    Nov 25th, 2022
    journalArticle

    To identify and describe the aims, methodological approaches, and major findings of studies on the use of speech-to-text technology (STT) among secondary pupils (age 12–18) with learning difficulties published from January 2000 to April 2022. This scoping review includes empirical studies published in peer-reviewed journals and grey literature between January 2000 and April 2022. Searches were conducted in April 2022 in three databases: ERIC, PsycINFO and Scopus. In addition, related reviews were manually screened for...

  • Alexandra Sasha Luccioni, Sylvain Viguie...
    Nov 3rd, 2022
    preprint

    Progress in machine learning (ML) comes with a cost to the environment, given that training ML models requires significant computational resources, energy and materials. In the present article, we aim to quantify the carbon footprint of BLOOM, a 176-billion parameter language model, across its life cycle. We estimate that BLOOM's final training emitted approximately 24.7 tonnes of CO₂eq if we consider only the dynamic power consumption, and 50.5 tonnes if we account for all processes...

  • Anita Schick, Jasper Feine, Stefan Moran...
    Oct 31st, 2022
    journalArticle

    Mental disorders in adolescence and young adulthood are major public health concerns. Digital tools such as text-based conversational agents (ie, chatbots) are a promising technology for facilitating mental health assessment. However, the human-like interaction style of chatbots may induce potential biases, such as socially desirable responding (SDR), and may require further effort to complete assessments.

  • Daniel F. McCaffrey, Jodi M. Casabianca,...
    Oct 22nd, 2022
    journalArticle

    This document describes a set of best practices for developing, implementing, and maintaining the critical process of scoring constructed‐response tasks. These practices address both the use of human raters and automated scoring systems as part of the scoring process and cover the scoring of written, spoken, performance, or multimodal responses. Best Practices for Constructed‐Response Scoring is designed not to act as an independent guide, but rather to be used in conjunction with other ETS...

  • Yanyan Fu, Edison M. Choe, Hwanggyu Lim,...
    Oct 6th, 2022
    journalArticle

    This case study applied the weak theory of Automatic Item Generation (AIG) to generate isomorphic item instances (i.e., unique but psychometrically equivalent items) for a large‐scale assessment. Three representative instances were selected from each item template (i.e., model) and pilot‐tested. In addition, a new analytical framework, differential child item functioning (DCIF) analysis, based on the existing differential item functioning statistics, was applied to evaluate the psychometric...

  • Shayan Doroudi
    Oct 4th, 2022
    journalArticle

    In this paper, I argue that the fields of artificial intelligence (AI) and education have been deeply intertwined since the early days of AI. Specifically, I show that many of the early pioneers of AI were cognitive scientists who also made pioneering and impactful contributions to the field of education. These researchers saw AI as a tool for thinking about human learning and used their understanding of how people learn to further AI. Furthermore, I trace two distinct approaches to thinking...

  • Cyril Chhun, Pierre Colombo, Chloé Clave...
    Sep 15th, 2022
    preprint

    Research on Automatic Story Generation (ASG) relies heavily on human and automatic evaluation. However, there is no consensus on which human evaluation criteria to use, and no analysis of how well automatic criteria correlate with them. In this paper, we propose to re-evaluate ASG evaluation. We introduce a set of 6 orthogonal and comprehensive human criteria, carefully motivated by the social sciences literature. We also present HANNA, an annotated dataset of 1,056 stories produced by 10...

  • Nico Andersen, Fabian Zehner, Frank Gold...
    Sep 11th, 2022
    journalArticle

    In the context of large‐scale educational assessments, the effort required to code open‐ended text responses is considerably more expensive and time‐consuming than the evaluation of multiple‐choice responses because it requires trained personnel and long manual coding sessions. Aim: Our semi‐supervised coding method eco (exploring coding assistant) dynamically supports human raters by automatically coding a subset of the responses. Method: We map normalized response texts into a semantic space and...

  • John Heilmann, Denise Finneran, Maura Mo...
    Sep 7th, 2022
    journalArticle

    Narrative language sample analysis (LSA) is a recommended best practice for the assessment of monolingual and bilingual children. With business-as-usual narrative LSA, examiners are actively involved in all aspects of the elicitation. Software advancements have shown multiple benefits of computer-administered language assessments, some of which may be beneficial for narrative assessments, particularly for bilingual children. The goal of this pilot study was to test the feasibility of...

  • Jose Belda-Medina, José Ramón Calvo-Ferr...
    Aug 24th, 2022
    journalArticle

    Recent advances in Artificial Intelligence (AI) and machine learning have paved the way for the increasing adoption of chatbots in language learning. Research published to date has mostly focused on chatbot accuracy and chatbot–human communication from students’ or in-service teachers’ perspectives. This study aims to examine the knowledge, level of satisfaction and perceptions concerning the integration of conversational AI in language learning among future educators. In this mixed method...

  • Christian Hartmann, Nikol Rummel, Maria ...
    Aug 9th, 2022
    journalArticle

    This paper presents a fine-grained process analysis of 22 students in a classroom-based learning setting. The students engaged (and failed) in problem-solving attempts prior to instruction (i.e., the Productive-Failure approach). We used the HeuristicsMiner algorithm to analyze the data of a quasi-experimental study. The applied algorithm allowed us to investigate temporally structured think-aloud data, to outline productive and unproductive problem-solving strategies. Our analyses and...

  • Philip Buczak, He Huang, Boris Forthmann...
    Aug 8th, 2022
    journalArticle

    Traditionally, researchers employ human raters for scoring responses to creative thinking tasks. Apart from the associated costs, this approach entails two potential risks. First, human raters can be subjective in their scoring behavior (inter‐rater‐variance). Second, individual raters are prone to inconsistent scoring patterns (intra‐rater‐variance). In light of these issues, we present an approach for automated scoring of Divergent Thinking (DT) Tasks. We implemented a pipeline aiming to...

  • Mustafa Abdul Salam, Mohamed Abd El-Fata...
    Aug 2nd, 2022
    journalArticle

    Auto-grading of short answer questions is considered a challenging problem in the processing of natural language. It requires a system to comprehend the free text answers to automatically assign a grade for a student answer compared to one or more model answers. This paper suggests an optimized deep learning model for grading short-answer questions automatically by using various sizes of datasets collected in the Science subject for students in seventh grade in Egypt. The proposed system is...

  • Iddo Drori, Sarah Zhang, Reece Shuttlewo...
    Aug 2nd, 2022
    journalArticle

    We demonstrate that a neural network pretrained on text and fine-tuned on code solves mathematics course problems, explains solutions, and generates questions at a human level. We automatically synthesize programs using few-shot learning and OpenAI’s Codex transformer and execute them to solve course problems at 81% automatic accuracy. We curate a dataset of questions from Massachusetts Institute of Technology (MIT)’s largest mathematics courses (Single Variable and Multivariable Calculus,...

  • Yigal Attali, Andrew Runge, Geoffrey T. ...
    Jul 22nd, 2022
    journalArticle

    Automatic item generation (AIG) has the potential to greatly expand the number of items for educational assessments, while simultaneously allowing for a more construct-driven approach to item development. However, the traditional item modeling approach in AIG is limited in scope to content areas that are relatively easy to model (such as math problems), and depends on highly skilled content experts to create each model. In this paper we describe the interactive reading task, a...

  • Rishi Bommasani, Drew A. Hudson, Ehsan A...
    Jul 12th, 2022
    preprint

    AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical...

  • Pierre Jean A. Colombo, Chloé Clavel, Pa...
    Jun 28th, 2022
    journalArticle

    Assessing the quality of natural language generation (NLG) systems through human annotation is very expensive. Additionally, human annotation campaigns are time-consuming and include non-reusable human labour. In practice, researchers rely on automatic metrics as a proxy of quality. In the last decade, many string-based metrics (e.g., BLEU or ROUGE) have been introduced. However, such metrics usually rely on exact matches and thus, do not robustly handle synonyms. In this paper, we introduce...

Last update from database: 28/10/2025, 10:15 (UTC)