
Exploring the Naturalness of AI-Generated Images (2312.05476v3)

Published 9 Dec 2023 in cs.CV

Abstract: The proliferation of Artificial Intelligence-Generated Images (AGIs) has greatly expanded the Image Naturalness Assessment (INA) problem. Unlike early definitions, which mainly focus on tone-mapped images with limited distortions (e.g., exposure, contrast, and color reproduction), INA on AI-generated images is especially challenging because such images have more diverse content and can be affected by factors from multiple perspectives, including low-level technical distortions and high-level rationality distortions. In this paper, we take the first step to benchmark and assess the visual naturalness of AI-generated images. First, we construct the AI-Generated Image Naturalness (AGIN) database by conducting a large-scale subjective study to collect human opinions on overall naturalness as well as perceptions from the technical and rationality perspectives. AGIN verifies that naturalness is universally and disparately affected by technical and rationality distortions. Second, we propose the Joint Objective Image Naturalness evaluaTor (JOINT) to automatically predict the naturalness of AGIs in a way that aligns with human ratings. Specifically, JOINT imitates human reasoning in naturalness evaluation by jointly learning both technical and rationality features. We demonstrate that JOINT significantly outperforms baselines by providing more subjectively consistent results on naturalness assessment.
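As a rough illustration of what "subjectively consistent" can mean in practice, the sketch below scores a naturalness predictor by its Spearman rank correlation with human ratings. This is an assumption about the evaluation, since the abstract does not name a metric. The `fuse` helper is a hypothetical stand-in that simply averages a technical score and a rationality score; it is not JOINT, which learns the two kinds of features jointly. All function names and sample numbers here are illustrative only.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def fuse(technical: Sequence[float], rationality: Sequence[float],
         weight: float = 0.5) -> list[float]:
    """Hypothetical fusion: weighted mean of the two perspective scores."""
    return [weight * t + (1 - weight) * r
            for t, r in zip(technical, rationality)]


if __name__ == "__main__":
    # Made-up scores for four images, for illustration only.
    technical = [4.0, 3.0, 2.0, 1.0]
    rationality = [3.5, 4.0, 1.0, 2.0]
    human_naturalness = [4.2, 3.9, 1.5, 1.8]
    predicted = fuse(technical, rationality)
    print(spearman(predicted, human_naturalness))
```

A higher correlation means the predictor orders images more like human raters do; a learned model would replace `fuse` while the evaluation step stays the same.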
