Human Shape and Clothing Estimation (2402.18032v1)
Abstract: Human shape and clothing estimation has gained significant prominence in various domains, including online shopping, fashion retail, augmented reality (AR), virtual reality (VR), and gaming. The visual representation of human shape and clothing has become a focal point for computer vision researchers in recent years. This paper presents a comprehensive survey of the major works in the field, focusing on four key aspects: human shape estimation, fashion generation, landmark detection, and attribute recognition. For each of these tasks, the survey examines recent advancements, discusses their strengths and limitations, and highlights qualitative differences in approaches and outcomes. By exploring the latest developments in human shape and clothing estimation, this survey aims to provide a comprehensive understanding of the field and inspire future research in this rapidly evolving domain.