Results – Evidence Library – Artificial Intelligence in Measurement and Education

Analyzing the Behavior of Visual Question Answering Models

Aishwarya Agrawal, Dhruv Batra, Devi Par...

|

May 16th, 2016

|

journalArticle

Aishwarya Agrawal, Dhruv Batra, Devi Par...

May 16th, 2016

Recently, a number of deep-learning based models have been proposed for the task of Visual Question Answering (VQA). The performance of most models is clustered around 60-70%. In this paper we propose systematic methods to analyze the behavior of these models as a first step towards recognizing their strengths and weaknesses, and identifying the most fruitful directions for progress. We analyze two models, one each from two major classes of VQA models -- with-attention and without-attention...

CIDEr: Consensus-based Image Description Evaluation

Ramakrishna Vedantam, C. Lawrence Zitnic...

|

Jun 16th, 2015

|

preprint

Ramakrishna Vedantam, C. Lawrence Zitnic...

Jun 16th, 2015

Automatically describing an image with a sentence is a long-standing challenge in computer vision and natural language processing. Due to recent progress in object detection, attribute classification, action recognition, etc., there is renewed interest in this area. However, evaluating the quality of descriptions has proven to be challenging. We propose a novel paradigm for evaluating image descriptions that uses human consensus. This paradigm consists of three main parts: a new...

The Llama 3 Herd of Models

Abhimanyu Dubey, Abhinav Jauhri, Abhinav...

|

Aug 15th, 2024

|

preprint

Abhimanyu Dubey, Abhinav Jauhri, Abhinav...

Aug 15th, 2024

Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language...

Search

Publication year