
TIGEr: Text-to-Image Grounding for Image Caption Evaluation (1909.02050v1)

Published 4 Sep 2019 in cs.CL and cs.CV

Abstract: This paper presents a new metric called TIGEr for the automatic evaluation of image captioning systems. Popular metrics, such as BLEU and CIDEr, are based solely on text matching between reference captions and machine-generated captions, potentially leading to biased evaluations because references may not fully cover the image content and natural language is inherently ambiguous. Building upon a machine-learned text-image grounding model, TIGEr evaluates caption quality not only based on how well a caption represents image content, but also on how well machine-generated captions match human-generated captions. Our empirical tests show that TIGEr has a higher consistency with human judgments than alternative existing metrics. We also comprehensively assess the metric's effectiveness in caption evaluation by measuring the correlation between human judgments and metric scores.
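The core idea, per the abstract, is to score a candidate caption by comparing its image-grounding behavior against that of the human references, rather than comparing surface text. The following is a minimal, hypothetical sketch of that comparison step: it assumes each caption has already been converted by a grounding model into a vector of relevance scores over image regions (the paper's actual grounding model and similarity functions are not reproduced here).

```python
import math

def cosine(u, v):
    """Cosine similarity between two score vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def grounding_based_score(cand_grounding, ref_groundings):
    """Hypothetical TIGEr-like score: compare the candidate caption's
    region-grounding vector against each human reference's vector and
    average the similarities. Inputs are assumed precomputed by a
    text-image grounding model (not implemented here)."""
    sims = [cosine(cand_grounding, r) for r in ref_groundings]
    return sum(sims) / len(sims)

# Toy example: the candidate grounds entirely on region 0; one reference
# agrees, the other grounds on region 1, so the averaged score is 0.5.
score = grounding_based_score([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Because both captions are projected into the same image-region space before comparison, two captions with no word overlap can still score highly if they describe the same visual content, which is the bias in text-only metrics that the paper targets.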

Authors (8)
  1. Ming Jiang (59 papers)
  2. Qiuyuan Huang (23 papers)
  3. Lei Zhang (1689 papers)
  4. Xin Wang (1307 papers)
  5. Pengchuan Zhang (58 papers)
  6. Zhe Gan (135 papers)
  7. Jana Diesner (21 papers)
  8. Jianfeng Gao (344 papers)
Citations (60)
