
PKU-I2IQA: An Image-to-Image Quality Assessment Database for AI Generated Images (2311.15556v2)

Published 27 Nov 2023 in cs.CV and eess.IV

Abstract: As image generation technology advances, AI-based image generation has been applied in various fields, and Artificial Intelligence Generated Content (AIGC) has garnered widespread attention. However, the development of AI-based image generative models also brings new problems and challenges. A significant challenge is that AI-generated images (AIGIs) may exhibit unique distortions compared to natural images, and not all generated images meet the requirements of the real world. It is therefore of great significance to evaluate AIGIs more comprehensively. Although previous work has established several human perception-based AIGC image quality assessment (AIGCIQA) databases for text-generated images, AI image generation covers scenarios such as text-to-image and image-to-image, so assessing only images generated by text-to-image models is insufficient. To address this issue, we establish a human perception-based image-to-image AIGCIQA database, named PKU-I2IQA. We conduct a well-organized subjective experiment to collect quality labels for AIGIs and then perform a comprehensive analysis of the PKU-I2IQA database. Furthermore, we propose two benchmark models: NR-AIGCIQA, based on the no-reference image quality assessment method, and FR-AIGCIQA, based on the full-reference image quality assessment method. Finally, leveraging this database, we conduct benchmark experiments and compare the performance of the proposed benchmark models. The PKU-I2IQA database and benchmarks will be released at https://github.com/jiquan123/I2IQA to facilitate future research.
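To make the distinction between the two benchmark styles concrete, below is a minimal sketch, not the authors' released implementation (which is available at the linked GitHub repository). It assumes a PyTorch/torchvision setup with a ResNet-50 feature extractor and a linear regression head; the class names NRAIGCIQA and FRAIGCIQA and all other identifiers here are illustrative. The only structural point it captures is that a no-reference model scores the generated image alone, while a full-reference model also consumes the reference image and fuses both feature vectors.

```python
# Minimal, hypothetical sketch of NR vs. FR AIGCIQA-style benchmark models.
# Assumes PyTorch and torchvision >= 0.13; not the paper's official code.
import torch
import torch.nn as nn
import torchvision.models as models


def make_backbone() -> nn.Module:
    """ResNet-50 trunk with the classifier removed, yielding a 2048-d feature."""
    backbone = models.resnet50(weights=None)
    backbone.fc = nn.Identity()
    return backbone


class NRAIGCIQA(nn.Module):
    """No-reference style: predict a quality score from the generated image alone."""
    def __init__(self):
        super().__init__()
        self.backbone = make_backbone()
        self.head = nn.Linear(2048, 1)  # regress a single quality score

    def forward(self, generated: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(generated))


class FRAIGCIQA(nn.Module):
    """Full-reference style: also use the reference image and fuse both features."""
    def __init__(self):
        super().__init__()
        self.backbone = make_backbone()
        self.head = nn.Linear(2 * 2048, 1)  # concatenated generated + reference features

    def forward(self, generated: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
        f_gen = self.backbone(generated)
        f_ref = self.backbone(reference)
        return self.head(torch.cat([f_gen, f_ref], dim=1))


if __name__ == "__main__":
    gen = torch.randn(2, 3, 224, 224)
    ref = torch.randn(2, 3, 224, 224)
    print(NRAIGCIQA()(gen).shape)       # torch.Size([2, 1])
    print(FRAIGCIQA()(gen, ref).shape)  # torch.Size([2, 1])
```

Either model would be trained to regress the subjective quality labels collected in the PKU-I2IQA experiment; the backbone choice and fusion strategy above are placeholders, not the paper's reported configuration.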

Authors (6)
  1. Jiquan Yuan
  2. Xinyan Cao
  3. Changjin Li
  4. Fanyi Yang
  5. Jinlong Lin
  6. Xixin Cao
Citations (14)