3 resources

  • Aishwarya Agrawal, Dhruv Batra, Devi Par... | Oct 29th, 2016 | journalArticle

    Recently, a number of deep-learning based models have been proposed for the task of Visual Question Answering (VQA). The performance of most models is clustered around 60-70%. In this paper we propose systematic methods to analyze the behavior of these models as a first step towards recognizing their strengths and weaknesses, and identifying the most fruitful directions for progress. We analyze two models, one each from two major classes of VQA models -- with-attention and without-attention...

  • Ramakrishna Vedantam, C. Lawrence Zitnic... | Jun 29th, 2015 | preprint

    Automatically describing an image with a sentence is a long-standing challenge in computer vision and natural language processing. Due to recent progress in object detection, attribute classification, action recognition, etc., there is renewed interest in this area. However, evaluating the quality of descriptions has proven to be challenging. We propose a novel paradigm for evaluating image descriptions that uses human consensus. This paradigm consists of three main parts: a new...

  • Abhimanyu Dubey, Abhinav Jauhri, Abhinav... | Aug 15th, 2024 | preprint

    Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language...

Last update from database: 29/10/2025, 19:15 (UTC)