
A Study of Face Obfuscation in ImageNet (2103.06191v3)

Published 10 Mar 2021 in cs.CV

Abstract: Face obfuscation (blurring, mosaicing, etc.) has been shown to be effective for privacy protection; nevertheless, object recognition research typically assumes access to complete, unobfuscated images. In this paper, we explore the effects of face obfuscation on the popular ImageNet challenge visual recognition benchmark. Most categories in the ImageNet challenge are not people categories; however, many incidental people appear in the images, and their privacy is a concern. We first annotate faces in the dataset. Then we demonstrate that face obfuscation has minimal impact on the accuracy of recognition models. Concretely, we benchmark multiple deep neural networks on obfuscated images and observe that the overall recognition accuracy drops only slightly (<= 1.0%). Further, we experiment with transfer learning to 4 downstream tasks (object recognition, scene recognition, face attribute classification, and object detection) and show that features learned on obfuscated images are equally transferable. Our work demonstrates the feasibility of privacy-aware visual recognition, improves the highly-used ImageNet challenge benchmark, and suggests an important path for future visual datasets. Data and code are available at https://github.com/princetonvisualai/imagenet-face-obfuscation.

Citations (136)

Summary

  • The paper demonstrates that applying face obfuscation techniques leads to at most a 1.0% drop in recognition accuracy across various neural network models.
  • It employs a combination of Amazon Rekognition and crowdsourcing to accurately annotate faces, revealing that 17% of images contain human faces.
  • The study confirms that the transferability of features in downstream tasks remains robust despite the use of privacy-preserving obfuscation methods.

Face Obfuscation in ImageNet and Its Impact on Visual Recognition

The paper "A Study of Face Obfuscation in ImageNet" addresses the impact of face obfuscation techniques on the accuracy and usability of large-scale visual recognition datasets, notably the ImageNet challenge. This investigation serves as an important contribution to the field of privacy-preserving machine learning, exploring the balance between data utility and privacy concerns.

Privacy in Visual Recognition Datasets

In datasets like ImageNet, while the categories themselves do not focus on people, incidental human presence is common, raising privacy concerns. The authors recognize the potential privacy infringement posed by freely available datasets and aim to mitigate it by focusing on face obfuscation, a widely recognized method for protecting individual identities in visual data.

Methodology and Implementation

The authors started by annotating faces within the ImageNet dataset using a combination of Amazon Rekognition and crowdsourcing, ensuring a high degree of accuracy in face detection. The extensive annotation revealed that 17% of images in the dataset contain recognizable human faces, underscoring the privacy issues at hand.
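In spirit, a hybrid pipeline like this auto-accepts confident automatic detections and routes uncertain ones to human annotators for verification. The sketch below illustrates that triage idea only; the confidence threshold, bounding-box format, and example detections are illustrative assumptions, not the paper's actual settings.

```python
def triage(detections, auto_threshold=0.9):
    """Split automatic face detections into auto-accepted boxes and
    boxes that need crowdsourced verification (illustrative only).

    `detections` is a list of (bbox, confidence) pairs, where bbox is
    an arbitrary (left, top, right, bottom) tuple in pixels.
    """
    accepted, needs_review = [], []
    for box, confidence in detections:
        if confidence >= auto_threshold:
            accepted.append(box)       # trust the detector
        else:
            needs_review.append(box)   # send to human annotators
    return accepted, needs_review

# Two hypothetical detections: one confident, one borderline.
dets = [((10, 10, 50, 50), 0.97), ((60, 5, 90, 40), 0.55)]
auto, review = triage(dets)
```

The point of such a split is economic: human verification is expensive at ImageNet scale, so it is reserved for the detector's uncertain cases.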

Two straightforward methods for face obfuscation were employed: blurring and overlaying. These were chosen for their simplicity and past effectiveness in privacy-preserving environments. The authors conducted extensive experiments, benchmarking their impact across various neural network architectures, such as ResNet, DenseNet, and VGG, for image classification tasks.
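The two operations can be sketched on a toy grayscale image represented as a nested list of pixel values. This is a minimal illustration, assuming a single face bounding box; the box coordinates, fill value, and the use of a box filter in place of the Gaussian blurring used in practice are all simplifications for clarity.

```python
def overlay(image, bbox, fill=0):
    """Replace every pixel inside the face box with a constant value."""
    top, left, bottom, right = bbox  # bottom/right exclusive
    out = [row[:] for row in image]
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = fill
    return out

def box_blur(image, bbox, radius=1):
    """Mean-filter the pixels inside the face box (a simple stand-in
    for Gaussian blurring)."""
    top, left, bottom, right = bbox
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(top, bottom):
        for x in range(left, right):
            vals = [image[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(vals) // len(vals)
    return out

# Toy 6x6 image: all zeros with one bright pixel inside the face box.
img = [[0] * 6 for _ in range(6)]
img[2][2] = 90
masked = overlay(img, (1, 1, 4, 4))    # bright pixel destroyed
blurred = box_blur(img, (1, 1, 4, 4))  # bright pixel smeared out
```

Both operations destroy the identifying detail inside the box while leaving the rest of the image, and hence most of the recognition-relevant signal, untouched.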

Numerical Findings and Transferability Insights

Remarkably, the paper reports that face obfuscation causes only a minimal drop in recognition accuracy: at most 1.0% across models. Blurring reduced validation accuracy by 0.1-0.7%, and overlaying by 0.3-1.0%. These marginal changes suggest that visual recognition can be performed on obfuscated datasets without substantial loss of performance.
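The reported drops are simply differences in top-1 validation accuracy between the clean and obfuscated settings. A minimal sketch of that comparison, with made-up labels and predictions rather than real model outputs, looks like:

```python
def top1_accuracy(preds, labels):
    """Fraction of examples whose top-1 prediction matches the label."""
    correct = sum(p == t for p, t in zip(preds, labels))
    return correct / len(labels)

labels           = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
preds_clean      = [0, 1, 2, 3, 4, 5, 6, 7, 8, 0]  # toy: 90% top-1
preds_obfuscated = [0, 1, 2, 3, 4, 5, 6, 7, 0, 0]  # toy: 80% top-1

# The quantity the paper tabulates per model and obfuscation method:
drop = top1_accuracy(preds_clean, labels) - top1_accuracy(preds_obfuscated, labels)
```

In the paper's experiments this drop stays within 1.0 percentage points, far smaller than the toy 10-point gap above.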

Beyond classification accuracy, the paper thoughtfully explores the transferability of features learned from obfuscated datasets to different downstream tasks. The analysis across CIFAR-10, PASCAL VOC, and CelebA datasets indicates that pre-trained models on face-obfuscated ImageNet images do not exhibit compromised feature transferability. This broadens the potential application of privacy-aware visual datasets without forgoing model effectiveness.
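One common way to probe such transferability is to freeze the pretrained backbone and fit a lightweight classifier on its features for the downstream task. The nearest-centroid probe below is a hedged stand-in for that idea, not the paper's actual protocol (which fine-tunes on the downstream datasets); the 2-D "features" and class labels are toy values.

```python
def fit_centroids(features, labels):
    """Average the frozen backbone features per downstream class."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def predict(centroids, feature):
    """Assign the class whose centroid is nearest in feature space."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda y: dist2(centroids[y], feature))

# Toy "features" from a frozen backbone, two downstream classes.
train_feats  = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.8, 1.1]]
train_labels = [0, 0, 1, 1]
probe = fit_centroids(train_feats, train_labels)
```

If features learned on obfuscated images were degraded, a probe like this would separate downstream classes worse than one built on clean-image features; the paper finds no such gap.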

Implications and Future Developments

By providing extensive empirical evidence that face obfuscation techniques maintain dataset utility while enhancing privacy, the paper advocates for integrating such techniques into the standard procedure for future dataset creation. The face annotations and code are made publicly available, providing a valuable resource for further research in privacy-preserving visual recognition.

While the research does not provide formal privacy guarantees beyond empirical testing, it highlights a feasible path for reconciling data utility with privacy in publicly available datasets. As discussions around data ethics and privacy continue to evolve, this work provides essential insights and methodologies for the community to build upon.

Future work could explore broader implications of obfuscation techniques or develop more sophisticated methods that further enhance privacy without sacrificing model accuracy. Integrating face obfuscation systematically into dataset preparation represents a pragmatic step towards ethically responsible AI development.