Results – Evidence Library – Artificial Intelligence in Measurement and Education

Enhancing prediction of student success: Automated machine learning approach

Hassan Zeineddine, Udo Braendle, Assaad ... | Jan 24th, 2021 | journalArticle

Improving Automated Scoring of Student Open Responses in Mathematics

Sami Baral, Anthony F. Botelho, John A. ... | Apr 24th, 2021 | conferencePaper

How Do Educationally At-Risk Men and Women Differ in Their Essay-Writing Processes?

Randy Elliot Bennett, Mo Zhang, Sandip S... | Apr 24th, 2021 | journalArticle

This study examined differences in the composition processes used by educationally at-risk males and females who wrote essays as part of a high-school equivalency examination. Over 30,000 individuals were assessed, each taking one of 12 forms of the examination’s language arts writing subtest in 23 US states. Writing processes were inferred using features extracted from keystroke logs and aggregated into seven composite indicators. Results showed that females earned higher essay and total...

A Machine Learning Prediction of Automatic Text Based Assessment for Open and Distance Learning: A Review

Guembe Blessing, Ambrose Azeta, Sanjay M... | Apr 24th, 2021 | bookSection

On the Opportunities and Risks of Foundation Models

Rishi Bommasani, Drew A. Hudson, Ehsan A... | Apr 24th, 2021 | journalArticle

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical...

All That's 'Human' Is Not Gold: Evaluating Human Evaluation of Generated Text

Elizabeth Clark, Tal August, Sofia Serra... | Apr 24th, 2021 | preprint

Human evaluations are typically considered the gold standard in natural language generation, but as models' fluency improves, how well can evaluators detect and judge machine-generated text? We run a study assessing non-experts' ability to distinguish between human- and machine-authored text (GPT2 and GPT3) in three domains (stories, news articles, and recipes). We find that, without training, evaluators distinguished between GPT3- and human-authored text at random chance level. We explore...

Generate: A NLG system for educational content creation

Saad Khan, Jesse Hamer, Tiago Almeida | Apr 24th, 2021 | conferencePaper

We present Generate, an AI-human hybrid system to help education content creators interactively generate assessment content in an efficient and scalable manner. Our system integrates advanced natural language generation (NLG) approaches with subject matter expertise of assessment developers to efficiently generate a large number of highly customized and valid assessment items. We utilize the powerful Transformer architecture which is capable of leveraging substantive pretraining on several...

Automated Essay Scoring and the Deep Learning Black Box: How Are Rubric Scores Determined?

Vivekanandan S. Kumar, David Boulanger | Sep 24th, 2021 | journalArticle

Evaluating Dialogue Systems

Michael McTear | Apr 24th, 2021 | bookSection

CIDEr-R: Robust Consensus-based Image Description Evaluation

Gabriel Oliveira dos Santos, Esther Luna... | Apr 24th, 2021 | preprint

This paper shows that CIDEr-D, a traditional evaluation metric for image description, does not work properly on datasets where the number of words in the sentence is significantly greater than those in the MS COCO Captions dataset. We also show that CIDEr-D has performance hampered by the lack of multiple reference sentences and high variance of sentence length. To bypass this problem, we introduce CIDEr-R, which improves CIDEr-D, making it more flexible in dealing with datasets with high...

BLEU, METEOR, BERTScore: Evaluation of Metrics Performance in Assessing Critical Translation Errors in Sentiment-oriented Text

University of Wolverhampton, UK, Hadeel ... | Apr 24th, 2021 | conferencePaper

A review of deep-neural automated essay scoring models

Masaki Uto | Jul 24th, 2021 | journalArticle

Automated essay scoring (AES) is the task of automatically assigning scores to essays as an alternative to grading by humans. Although traditional AES models typically rely on manually designed features, deep neural network (DNN)-based AES models that obviate the need for feature engineering have recently attracted increased attention. Various DNN-AES models with different characteristics have been proposed over the past few years. To our knowledge, however, no study has...

Automated Scoring of Chinese Grades 7–9 Students’ Competence in Interpreting and Arguing from Evidence

Cong Wang, Xiufeng Liu, Lei Wang | Apr 24th, 2021 | journalArticle

Data Augmentation by Rubrics for Short Answer Grading

Tianqi Wang, Hiroaki Funayama, Hiroki Ou... | Apr 24th, 2021 | journalArticle

Math Word Problem Generation with Mathematical Consistency and Problem Context Constraints

Zichao Wang, Andrew Lan, Richard Baraniu... | Apr 24th, 2021 | conferencePaper

Effects of Algorithmic Transparency in Bayesian Knowledge Tracing on Trust and Perceived Accuracy

Kim Christopher Williamson, René F. Kizi... | Apr 24th, 2021 | conferencePaper

ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback

Mike Wu, Noah Goodman, Chris Piech | Apr 24th, 2021 | journalArticle

High-quality computer science education is limited by the difficulty of providing instructor feedback to students at scale. While this feedback could in principle be automated, supervised approaches to predicting the correct feedback are bottlenecked by the intractability of annotating large quantities of student code. In this paper, we instead frame the problem of providing feedback as few-shot classification, where a meta-learner adapts to give feedback to student code on a new programming...

A Systematic Review of Fairness in Artificial Intelligence Algorithms

Khensani Xivuri, Hossana Twinomurinzi, D... | Apr 24th, 2021 | bookSection

Bot-Adversarial Dialogue for Safe Conversational Agents

Jing Xu, Da Ju, Margaret Li | Apr 24th, 2021 | conferencePaper
