epsilon-Mesh Attack: A Surface-based Adversarial Point Cloud Attack for Facial Expression Recognition (2403.06661v1)
Abstract: Point clouds and meshes are widely used 3D data structures in computer vision. While a mesh represents the surface of an object, a point cloud represents points sampled from that surface, and it is also the output of modern sensors such as LiDAR and RGB-D cameras. Owing to the wide application area of point clouds and recent advances in deep neural networks, studies on robust classification of 3D point cloud data have emerged. A common way to evaluate the robustness of deep classifier networks is to use adversarial attacks, in which the input is changed slightly by following the gradient direction. Previous adversarial attacks have generally been evaluated on point clouds of everyday objects. For 3D faces, however, these attacks tend to alter the person's facial structure more than desired and cause malformation. For facial expressions in particular, even a small adversarial perturbation can significantly affect the face structure. In this paper, we propose an adversarial attack called $\epsilon$-Mesh Attack, which operates on point cloud data by constraining perturbations to lie on the mesh surface. We also parameterize the attack by $\epsilon$, which scales the perturbation mesh. Our surface-based attack has tighter perturbation bounds than $L_2$ and $L_\infty$ norm-bounded attacks that operate on a unit ball. Despite these additional constraints, our experiments on the CoMA, Bosphorus, and FaceWarehouse datasets show that $\epsilon$-Mesh Attack (Perpendicular) successfully confuses trained DGCNN and PointNet models $99.72\%$ and $97.06\%$ of the time, respectively, with indistinguishable facial deformations. The code is available at https://github.com/batuceng/e-mesh-attack.
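The core idea in the abstract (follow the gradient, but keep each point on its mesh triangle, with $\epsilon$ scaling the allowed region) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; their code is in the linked repository. The function names `closest_point_on_triangle` and `epsilon_mesh_step`, the choice to scale the triangle about the original point, and the use of a closest-point projection as the constraint step are all assumptions made for illustration.

```python
import numpy as np


def closest_point_on_triangle(p, a, b, c):
    """Closest point to p on triangle (a, b, c), via Voronoi-region tests."""
    ab, ac, ap = b - a, c - a, p - a
    d1, d2 = ab @ ap, ac @ ap
    if d1 <= 0 and d2 <= 0:
        return a  # closest to vertex a
    bp = p - b
    d3, d4 = ab @ bp, ac @ bp
    if d3 >= 0 and d4 <= d3:
        return b  # closest to vertex b
    vc = d1 * d4 - d3 * d2
    if vc <= 0 and d1 >= 0 and d3 <= 0:
        return a + (d1 / (d1 - d3)) * ab  # closest to edge ab
    cp = p - c
    d5, d6 = ab @ cp, ac @ cp
    if d6 >= 0 and d5 <= d6:
        return c  # closest to vertex c
    vb = d5 * d2 - d1 * d6
    if vb <= 0 and d2 >= 0 and d6 <= 0:
        return a + (d2 / (d2 - d6)) * ac  # closest to edge ac
    va = d3 * d6 - d5 * d4
    if va <= 0 and (d4 - d3) >= 0 and (d5 - d6) >= 0:
        w = (d4 - d3) / ((d4 - d3) + (d5 - d6))
        return b + w * (c - b)  # closest to edge bc
    # Interior case: perpendicular projection onto the triangle's plane.
    denom = 1.0 / (va + vb + vc)
    return a + (vb * denom) * ab + (vc * denom) * ac


def epsilon_mesh_step(p0, p, grad, tri, eps, step):
    """One constrained update (hypothetical helper, not from the paper).

    p0   : original point, assumed to lie on triangle `tri`
    p    : current adversarial point
    grad : loss gradient with respect to p
    eps  : scale in (0, 1]; ASSUMPTION: the triangle is shrunk toward p0
    step : gradient step size
    """
    a, b, c = (p0 + eps * (v - p0) for v in tri)
    moved = p + step * grad
    return closest_point_on_triangle(moved, a, b, c)


tri = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
p0 = np.array([0.25, 0.25, 0.0])

# A gradient pointing off the surface is projected back onto the triangle.
off_surface = epsilon_mesh_step(p0, p0, np.array([0.0, 0.0, 1.0]), tri, 1.0, 0.1)

# A large in-plane step is clamped to the eps-scaled triangle.
clamped = epsilon_mesh_step(p0, p0, np.array([1.0, 0.0, 0.0]), tri, 0.5, 10.0)
```

In this sketch a smaller `eps` shrinks the region each point may move within, which is one way to read "tighter perturbation bounds" than a norm ball: the point can never leave the surface, whatever the gradient direction or step size.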
- Batuhan Cengiz
- Mert Gulsen
- Yusuf H. Sahin
- Gozde Unal