Papers
Topics
Authors
Recent
2000 character limit reached

CrowdRec: 3D Crowd Reconstruction from Single Color Images (2310.06332v1)

Published 10 Oct 2023 in cs.CV

Abstract: This is a technical report for the GigaCrowd challenge. Reconstructing 3D crowds from monocular images is a challenging problem due to mutual occlusions, server depth ambiguity, and complex spatial distribution. Since no large-scale 3D crowd dataset can be used to train a robust model, the current multi-person mesh recovery methods can hardly achieve satisfactory performance in crowded scenes. In this paper, we exploit the crowd features and propose a crowd-constrained optimization to improve the common single-person method on crowd images. To avoid scale variations, we first detect human bounding-boxes and 2D poses from the original images with off-the-shelf detectors. Then, we train a single-person mesh recovery network using existing in-the-wild image datasets. To promote a more reasonable spatial distribution, we further propose a crowd constraint to refine the single-person network parameters. With the optimization, we can obtain accurate body poses and shapes with reasonable absolute positions from a large-scale crowd image using a single-person backbone. The code will be publicly available at~\url{https://github.com/boycehbz/CrowdRec}.

Definition Search Book Streamline Icon: https://streamlinehq.com
References (20)
  1. 2d human pose estimation: New benchmark and state of the art analysis. In Proceedings of the IEEE Conference on computer Vision and Pattern Recognition, pages 3686–3693, 2014.
  2. Realtime multi-person 2d pose estimation using part affinity fields. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 7291–7299, 2017.
  3. A mobile vision system for robust multi-person tracking. In 2008 IEEE Conference on Computer Vision and Pattern Recognition, pages 1–8. IEEE, 2008.
  4. Alphapose: Whole-body regional multi-person pose estimation and tracking in real-time. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022.
  5. Yolox: Exceeding yolo series in 2021. arXiv preprint arXiv:2107.08430, 2021.
  6. Reconstructing groups of people with hypergraph relational reasoning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 14873–14883, 2023.
  7. Human3. 6m: Large scale datasets and predictive methods for 3d human sensing in natural environments. IEEE transactions on pattern analysis and machine intelligence, 36(7):1325–1339, 2013.
  8. Deep social grouping network for large scenes with multiple subjects. 51(8):1287–1301, 2021.
  9. Cliff: Carrying location information in full frames into human pose and shape estimation. In European Conference on Computer Vision, pages 590–606. Springer, 2022.
  10. Microsoft coco: Common objects in context. In European conference on computer vision, pages 740–755. Springer, 2014.
  11. Smpl: A skinned multi-person linear model. ACM transactions on graphics (TOG), 34(6):1–16, 2015.
  12. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.
  13. Mot16: A benchmark for multi-object tracking. arXiv preprint arXiv:1603.00831, 2016.
  14. Crowdhuman: A benchmark for detecting human in a crowd. arXiv preprint arXiv:1805.00123, 2018.
  15. Deep high-resolution representation learning for human pose estimation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5693–5703, 2019.
  16. Monocular, one-stage, regression of multiple 3d people. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 11179–11188, 2021.
  17. Putting people in their place: Monocular regression of 3d people in depth. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13243–13252, 2022.
  18. Panda:a gigapixel-level human-centric video dataset. In CVPR, pages 3265–3275, 2020.
  19. Citypersons: A diverse dataset for pedestrian detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3213–3221, 2017.
  20. Bytetrack: Multi-object tracking by associating every detection box. In European Conference on Computer Vision, pages 1–21. Springer, 2022.
Citations (1)

Summary

We haven't generated a summary for this paper yet.

Whiteboard

Paper to Video (Beta)

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.