LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation (2305.11116v1)

Published 18 May 2023 in cs.CV and cs.CL

Abstract: Existing automatic evaluation of text-to-image synthesis provides only an image-text matching score, without considering object-level compositionality, which results in poor correlation with human judgments. In this work, we propose LLMScore, a new framework that offers evaluation scores with multi-granularity compositionality. LLMScore leverages LLMs to evaluate text-to-image models. It first transforms the image into image-level and object-level visual descriptions. Then an evaluation instruction is fed into the LLM to measure the alignment between the synthesized image and the text, ultimately generating a score accompanied by a rationale. Extensive analysis shows that LLMScore achieves the highest correlation with human judgments on a wide range of datasets (Attribute Binding Contrast, Concept Conjunction, MSCOCO, DrawBench, PaintSkills). Notably, LLMScore achieves Kendall's tau correlation with human evaluations that is 58.8% and 31.2% higher than the commonly used text-image matching metrics CLIP and BLIP, respectively.
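
The abstract describes a two-stage pipeline: convert the image into image-level and object-level descriptions, then prompt an LLM with an evaluation instruction to produce a score and a rationale. The sketch below illustrates that flow under stated assumptions only; `caption_image`, `detect_objects`, and `query_llm` are hypothetical placeholders for a captioning model, an object detector, and an LLM API, not functions from the paper's released code.

```python
# Minimal sketch of an LLMScore-style evaluation pipeline (not the authors' implementation).
# caption_image, detect_objects, and query_llm are hypothetical placeholders.
import re
from typing import List, Tuple


def caption_image(image_path: str) -> str:
    """Hypothetical: return an image-level caption from a vision-language model."""
    raise NotImplementedError


def detect_objects(image_path: str) -> List[str]:
    """Hypothetical: return object-level descriptions, e.g. 'a red cube on a wooden table'."""
    raise NotImplementedError


def query_llm(prompt: str) -> str:
    """Hypothetical: send the prompt to an LLM and return its text response."""
    raise NotImplementedError


def llm_score(text_prompt: str, image_path: str) -> Tuple[float, str]:
    """Score how well a synthesized image matches its text prompt, with a rationale."""
    # 1. Transform the image into multi-granularity visual descriptions.
    image_level = caption_image(image_path)
    object_level = detect_objects(image_path)

    # 2. Build an evaluation instruction asking the LLM to judge text-image alignment.
    eval_prompt = (
        f"Text prompt: {text_prompt}\n"
        f"Image description: {image_level}\n"
        f"Objects in the image: {'; '.join(object_level)}\n"
        "Rate from 0 to 100 how well the image matches the text prompt, "
        "considering object-level compositionality (objects, attributes, relations). "
        "Reply as 'Score: <number>. Rationale: <one sentence>.'"
    )

    # 3. The LLM returns a score accompanied by a rationale; parse both.
    response = query_llm(eval_prompt)
    match = re.search(r"Score:\s*(\d+(?:\.\d+)?)\.?\s*Rationale:\s*(.*)", response, re.S)
    if not match:
        raise ValueError(f"Unparseable LLM response: {response!r}")
    return float(match.group(1)), match.group(2).strip()
```

For the kind of meta-evaluation reported in the abstract, scores produced this way can be correlated with human ratings using Kendall's tau, e.g. via `scipy.stats.kendalltau`.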

Authors (5)
  1. Yujie Lu (42 papers)
  2. Xianjun Yang (37 papers)
  3. Xiujun Li (37 papers)
  4. Xin Eric Wang (74 papers)
  5. William Yang Wang (254 papers)
Citations (54)
