
Advancing Zero-Shot Digital Human Quality Assessment through Text-Prompted Evaluation

Published 6 Jul 2023 in eess.IV, cs.CV, and cs.DB | arXiv:2307.02808v1

Abstract: Digital humans are seeing extensive use across a wide range of domains, which necessitates corresponding quality assessment studies. However, comprehensive digital human quality assessment (DHQA) databases are lacking. To address this gap, we propose SJTU-H3D, a subjective quality assessment database designed specifically for full-body digital humans. It comprises 40 high-quality reference digital humans and 1,120 labeled distorted counterparts generated with seven types of distortions. The SJTU-H3D database can serve as a benchmark for DHQA research, enabling the evaluation and refinement of processing algorithms. Furthermore, we propose a zero-shot DHQA approach that targets no-reference (NR) scenarios, ensuring generalization capability while mitigating database bias. Our method leverages semantic and distortion features extracted from 2D projections, together with geometry features derived from the mesh structure of the digital humans. Specifically, we employ the Contrastive Language-Image Pre-training (CLIP) model to measure semantic affinity and the Naturalness Image Quality Evaluator (NIQE) to capture low-level distortion information, and we use dihedral angles as geometry descriptors to extract mesh features. By aggregating these measures, we obtain the Digital Human Quality Index (DHQI), which demonstrates significant improvements in zero-shot performance and can serve as a robust baseline for DHQA tasks, facilitating advancements in the field. The database and code are available at https://github.com/zzc-1998/SJTU-H3D.
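The geometry branch described above uses dihedral angles between adjacent mesh faces as descriptors. The following is a minimal illustrative sketch, not the authors' released code: it computes the angle between the unit normals of each pair of faces sharing an edge (0 for coplanar faces) and pools the per-edge angles with mean/standard-deviation statistics. The pooling choice and function names here are assumptions for illustration only.

```python
import numpy as np

def face_normal(v0, v1, v2):
    # Unit normal of a triangle given its three vertex positions.
    n = np.cross(v1 - v0, v2 - v0)
    return n / np.linalg.norm(n)

def dihedral_angle(verts, face_a, face_b):
    # Angle (radians) between the unit normals of two faces;
    # 0 for coplanar faces, pi/2 for a right-angle fold.
    na = face_normal(*(verts[i] for i in face_a))
    nb = face_normal(*(verts[i] for i in face_b))
    cos = np.clip(np.dot(na, nb), -1.0, 1.0)
    return np.arccos(cos)

def dihedral_features(verts, faces):
    # Map each undirected edge to the faces that share it, collect the
    # dihedral angle at every interior edge, then summarize the angles
    # with simple statistics to obtain a fixed-length descriptor.
    edge_to_faces = {}
    for fi, f in enumerate(faces):
        for e in ((f[0], f[1]), (f[1], f[2]), (f[2], f[0])):
            edge_to_faces.setdefault(tuple(sorted(e)), []).append(fi)
    angles = np.asarray([
        dihedral_angle(verts, faces[fs[0]], faces[fs[1]])
        for fs in edge_to_faces.values() if len(fs) == 2
    ])
    return angles.mean(), angles.std()
```

For example, two coplanar triangles yield a mean angle of 0, while folding one triangle up by 90 degrees along the shared edge yields pi/2, so the statistics respond to surface-smoothness distortions of the kind the database simulates.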

