CIDEr: Consensus-based Image Description Evaluation

Article Status
Published
Authors/contributors
Ramakrishna Vedantam, C. Lawrence Zitnick, Devi Parikh
Title
CIDEr: Consensus-based Image Description Evaluation
Abstract
Automatically describing an image with a sentence is a long-standing challenge in computer vision and natural language processing. Due to recent progress in object detection, attribute classification, action recognition, etc., there is renewed interest in this area. However, evaluating the quality of descriptions has proven to be challenging. We propose a novel paradigm for evaluating image descriptions that uses human consensus. This paradigm consists of three main parts: a new triplet-based method of collecting human annotations to measure consensus, a new automated metric (CIDEr) that captures consensus, and two new datasets, PASCAL-50S and ABSTRACT-50S, each containing 50 sentences describing every image. Our simple metric captures human judgment of consensus better than existing metrics across sentences generated by various sources. We also evaluate five state-of-the-art image description approaches using this new protocol and provide a benchmark for future comparisons. A version of CIDEr named CIDEr-D is available as part of the MS COCO evaluation server to enable systematic evaluation and benchmarking.
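For readers skimming this record, the core idea behind the metric is compact: CIDEr represents each sentence as TF-IDF-weighted n-gram vectors (n = 1..4) and scores a candidate caption by its average cosine similarity to the reference captions, so n-grams common across the whole corpus contribute little and image-specific phrasing contributes a lot. The sketch below is a minimal, illustrative Python rendering of that idea, not the authors' released implementation (the official code, including the CIDEr-D variant with stemming and a length penalty, ships with the MS COCO caption evaluation toolkit); all function names here are our own, and the IDF handling is simplified.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def tfidf_vector(tokens, n, doc_freq, num_images):
    """TF-IDF weights for one sentence's n-grams: term frequency times
    log(#images / #images whose references contain the n-gram)."""
    counts = ngram_counts(tokens, n)
    total = max(sum(counts.values()), 1)
    return {g: (c / total) * math.log(num_images / max(doc_freq.get(g, 0), 1))
            for g, c in counts.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu > 0.0 and nv > 0.0 else 0.0

def cider(candidate, references, all_image_refs, max_n=4):
    """Consensus score: for each n-gram size, cosine similarity between the
    candidate's TF-IDF vector and each reference's, averaged over the
    references and then over n = 1..max_n."""
    num_images = len(all_image_refs)
    score = 0.0
    for n in range(1, max_n + 1):
        # Document frequency over images: an n-gram counts once per image
        # whose reference sentences contain it.
        doc_freq = Counter()
        for refs in all_image_refs:
            seen = set()
            for ref in refs:
                seen.update(ngram_counts(ref, n))
            doc_freq.update(seen)
        cand_vec = tfidf_vector(candidate, n, doc_freq, num_images)
        sims = [cosine(cand_vec, tfidf_vector(ref, n, doc_freq, num_images))
                for ref in references]
        score += sum(sims) / len(references)
    return score / max_n

# Toy usage with two hypothetical images and their reference captions:
refs_img1 = [["a", "dog", "runs", "on", "the", "grass"],
             ["a", "brown", "dog", "running", "on", "grass"]]
refs_img2 = [["two", "people", "ride", "bicycles"],
             ["cyclists", "riding", "down", "a", "road"]]
print(cider(["a", "dog", "running", "on", "the", "grass"],
            refs_img1, [refs_img1, refs_img2]))
```

With only two images the IDF estimates are degenerate; in practice the document frequencies are computed over the full reference corpus (e.g. all 50 sentences per image in PASCAL-50S), which is what makes the consensus weighting meaningful.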
Repository
arXiv
Archive ID
arXiv:1411.5726
Date
2015-06-02
Accessed
29/04/2024, 20:29
Short Title
CIDEr
Library Catalogue
Extra
arXiv:1411.5726 [cs]
Citation
Vedantam, R., Zitnick, C. L., & Parikh, D. (2015). CIDEr: Consensus-based Image Description Evaluation (arXiv:1411.5726). arXiv. https://doi.org/10.1109/cvpr.2015.7299087