
Towards Zero-Shot Interpretable Human Recognition: A 2D-3D Registration Framework (2403.06658v2)

Published 11 Mar 2024 in cs.CV

Abstract: Large vision models based on deep learning architectures have consistently advanced the state of the art in biometric recognition. However, three weaknesses are commonly reported for such approaches: 1) their extreme demands for training data; 2) the difficulty of generalising across domains; and 3) the lack of interpretability/explainability, which is of particular concern in biometrics, where it is important to provide evidence that can be used for forensic/legal purposes (e.g., in courts). To the best of our knowledge, this paper describes the first recognition framework/strategy that aims to address the three weaknesses simultaneously. First, it relies exclusively on synthetic samples for learning. Instead of requiring a large number and variety of samples for each subject, the idea is to enroll only a 3D point cloud per identity. Then, using generative strategies, we synthesize a very large (potentially infinite) number of samples containing all the desired covariates (poses, clothing, distances, perspectives, lighting, occlusions, ...). Depending on the synthesis method used, it is possible to adapt precisely to different kinds of domains, which supports generalization. Such data are then used to learn a model that performs local registration between image pairs, establishing positive correspondences between body parts that are key not only to recognition (according to their cardinality and distribution), but also to providing an interpretable description of the response (e.g., "both samples are from the same person, as they have similar facial shape, hair color and leg thickness").
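
The abstract says recognition depends on the cardinality and distribution of positive correspondences between body parts, and that the same correspondences support an interpretable description. The following is a minimal sketch of that idea only, not the authors' method: the body-part names, the thresholds `min_matches` and `min_spread`, the entropy-based spread measure, and the function `match_decision` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

# Hypothetical body-part vocabulary; the paper's actual partition is not given here.
BODY_PARTS = ("face", "hair", "torso", "arms", "legs")


def match_decision(correspondences, min_matches=10, min_spread=0.5):
    """Decide 'same person' from a list of body-part labels.

    Each element of `correspondences` is the body part on which one
    positive correspondence was found between the two images.
    Returns (same, spread, explanation).
    """
    counts = Counter(correspondences)
    total = sum(counts.values())  # cardinality of the correspondence set

    # Distribution: normalised Shannon entropy over body parts, in [0, 1].
    # 0 means all matches fall on one part; 1 means an even spread over all parts.
    if total == 0:
        spread = 0.0
    else:
        probs = [c / total for c in counts.values()]
        entropy = -sum(p * math.log(p) for p in probs)
        spread = entropy / math.log(len(BODY_PARTS))

    # Assumed decision rule: enough matches, spread over enough parts.
    same = total >= min_matches and spread >= min_spread

    # Interpretable description built from the parts that contributed matches.
    parts = ", ".join(f"{part} ({n})" for part, n in counts.most_common())
    if same:
        explanation = f"same person: correspondences on {parts}"
    else:
        explanation = f"not confirmed: {total} correspondences on [{parts}]"
    return same, spread, explanation


if __name__ == "__main__":
    sample = ["face"] * 6 + ["hair"] * 4 + ["legs"] * 5
    print(match_decision(sample))
```

Under this rule, many matches concentrated on a single body part are rejected, while a similar number spread across several parts is accepted. A learned model could replace the fixed thresholds; they are used here only to keep the sketch self-contained.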

