
Wild Face Anti-Spoofing (WFAS) Dataset

Updated 28 November 2025
  • Wild Face Anti-Spoofing (WFAS) is a large-scale, in-the-wild benchmark featuring 1,383,300 images from 469,920 identities with 17 varied presentation attack types.
  • It employs rigorous subject-split protocols (Known-Type and Unknown-Type) with ISO/IEC 30107-3 metrics to simulate realistic operational scenarios for PAD systems.
  • Benchmark analyses reveal significant challenges in zero-shot generalization, highlighting the need for self-supervised, Transformer-based, and domain-adaptive methods.

The Wild Face Anti-Spoofing (WFAS) dataset is a large-scale, unconstrained face anti-spoofing (FAS) benchmark designed to overcome the quantity and diversity limitations of previous FAS datasets. It is explicitly constructed for developing and evaluating face presentation attack detection (PAD) methods under challenging, real-world scenarios that reflect the operational variability encountered in automated face recognition systems. The WFAS dataset uniquely features a comprehensive array of presentation attack (PA) types, extensive sensor heterogeneity, and rigorous evaluation protocols, establishing itself as a critical resource for advancing generalizable FAS algorithms (Wang et al., 2023).

1. Scope and Composition

WFAS is the first "in-the-wild" FAS benchmark, with scale and diversity unmatched by prior datasets. It comprises 1,383,300 images representing 469,920 unique identities, with substantial representation across both live and spoof classes:

Class      # Images     # Subjects
Live          529,571      148,169
Spoof         853,729      321,751
Total       1,383,300      469,920

It encompasses 17 distinct PA types, systematically grouped as follows:

  • 2D-Print Attacks (8 carriers): newspaper, poster, photo, album, picture book, scanned photo, packaging, cloth.
  • 2D-Display Attacks (4 devices): phone, tablet, TV, computer (spanning LCD/IPS/OLED/VA technologies).
  • 3D Attacks (5 subcategories): rigid masks, garage kits, dolls, adult dolls, waxworks.
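
For concreteness, this taxonomy maps naturally onto a simple label schema. The sketch below is a hypothetical Python mapping (the constant name and key spellings are our own; the groups and carriers come directly from the list above):

```python
# Hypothetical label schema for the 17 WFAS presentation-attack (PA) types,
# grouped exactly as in the list above. Names are illustrative, not official.
WFAS_PA_TAXONOMY = {
    "2d_print": [
        "newspaper", "poster", "photo", "album",
        "picture_book", "scanned_photo", "packaging", "cloth",
    ],
    "2d_display": ["phone", "tablet", "tv", "computer"],
    "3d": ["rigid_mask", "garage_kit", "doll", "adult_doll", "waxwork"],
}

# Sanity check: 8 + 4 + 5 = 17 PA types in total.
assert sum(len(v) for v in WFAS_PA_TAXONOMY.values()) == 17
```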

Spoof images are sourced from diverse internet scenarios (including museum waxworks and TV screens), and data was captured using a heterogeneous array of commercial RGB sensors (e.g., smartphones, DSLRs, scanners), generating variable image resolutions and authentic imaging artifacts. Live samples derive from Creative Commons–licensed sources and are clustered to minimize identity overlap via RetinaFace and ArcFace pipelines.
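
The detection and identity-clustering step can be sketched with the open-source insightface package, whose "buffalo_l" model pack bundles a RetinaFace detector and an ArcFace embedder. This is a minimal illustration in the spirit of the pipeline described above, not the authors' exact procedure: the greedy single-pass clustering and the 0.45 cosine-similarity threshold are assumptions.

```python
# Minimal sketch of a RetinaFace + ArcFace identity-deduplication pass.
# Assumptions: insightface's "buffalo_l" model pack and a greedy cosine
# threshold of 0.45; the WFAS authors' exact settings are not reproduced here.
from typing import List, Optional

import cv2
import numpy as np
from insightface.app import FaceAnalysis

app = FaceAnalysis(name="buffalo_l")       # RetinaFace detection + ArcFace embedding
app.prepare(ctx_id=0, det_size=(640, 640))

def embed(path: str) -> Optional[np.ndarray]:
    """Return the L2-normalized ArcFace embedding of the largest detected face."""
    faces = app.get(cv2.imread(path))
    if not faces:
        return None
    largest = max(faces, key=lambda f: (f.bbox[2] - f.bbox[0]) * (f.bbox[3] - f.bbox[1]))
    return largest.normed_embedding

def greedy_identity_clusters(paths: List[str], sim_threshold: float = 0.45):
    """Group images by identity: join the first cluster whose representative
    embedding is similar enough, otherwise start a new cluster."""
    reps, clusters = [], []
    for p in paths:
        e = embed(p)
        if e is None:
            continue
        sims = [float(e @ r) for r in reps]   # cosine similarity (unit vectors)
        if sims and max(sims) >= sim_threshold:
            clusters[int(np.argmax(sims))].append(p)
        else:
            reps.append(e)
            clusters.append([p])
    return clusters
```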

2. Dataset Partitioning and Protocols

WFAS adopts a subject-split ratio of approximately 4:1:5 (train:dev:test) with strict partitioning: no PA carrier or device type is shared across multiple subsets, ensuring genuine scenario decoupling for robust generalization analysis. Two protocol constructs are defined:

  • Protocol 1 (Known-Type): All 17 attack types appear in train, development, and test sets, emulating global deployment conditions with slight domain shifts.
  • Protocol 2 (Unknown-Type): Only 2D PAs are included in train and development; all 3D PAs are reserved for testing, modeling zero-shot generalization against high-fidelity and previously unseen attacks.

Train and test splits are further engineered so that, for instance, certain print attack carriers like "newspaper" and "album" appear only in training, while others such as "picture book" and "scanned photo" are confined to dev/test.
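
To make the two protocols concrete, here is a hedged sketch of how they might be materialized from per-sample metadata. The record fields and example logic are illustrative; only the rules themselves (Protocol 1 keeps all 17 PA types in every split; Protocol 2 withholds all 3D PAs from train/dev) come from the dataset description, and exact per-carrier assignments would follow the official split lists.

```python
# Illustrative materialization of the two WFAS protocols from sample metadata.
# Field names and records are hypothetical; the filtering rules follow the
# protocol definitions above.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Sample:
    subject_id: int
    label: str        # "live" or "spoof"
    pa_category: str  # "2d_print", "2d_display", "3d", or "none" for live
    split: str        # "train", "dev", or "test" (subject-disjoint, ~4:1:5)

def protocol1(samples: List[Sample]) -> Dict[str, List[Sample]]:
    """Known-Type: all 17 PA types appear in train, dev, and test."""
    return {s: [x for x in samples if x.split == s] for s in ("train", "dev", "test")}

def protocol2(samples: List[Sample]) -> Dict[str, List[Sample]]:
    """Unknown-Type: 3D PAs never appear in train or dev; they are reserved
    for the test set to probe zero-shot generalization."""
    out: Dict[str, List[Sample]] = {"train": [], "dev": [], "test": []}
    for x in samples:
        if x.split in ("train", "dev") and x.pa_category == "3d":
            continue  # withhold all 3D attacks from training and development
        out[x.split].append(x)
    return out
```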

3. Evaluation Metrics and Procedure

WFAS adheres to ISO/IEC 30107-3 standards for PAD system assessment, reporting the following metrics:

  • APCER (Attack Presentation Classification Error Rate):

\mathrm{APCER} = \frac{\#\,\text{spoof samples classified as live}}{\#\,\text{all spoof samples}}

  • BPCER (Bona Fide Presentation Classification Error Rate):

\mathrm{BPCER} = \frac{\#\,\text{live samples classified as spoof}}{\#\,\text{all live samples}}

  • ACER (Average Classification Error Rate):

\mathrm{ACER} = \frac{\mathrm{APCER} + \mathrm{BPCER}}{2}

The operational threshold is determined by the Equal Error Rate (EER) on the development set and subsequently fixed for evaluating the test set. This process reflects realistic system configuration in deployed biometric verification.
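
A minimal sketch of this procedure, assuming higher scores mean "more live-like" and labels use 1 for bona fide and 0 for spoof; the grid-search EER finder is a simple stand-in for whatever root-finding the official evaluation uses:

```python
import numpy as np

def apcer_bpcer_acer(scores, labels, thr):
    """ISO/IEC 30107-3 error rates at a fixed threshold.
    scores: higher = more live-like; labels: 1 = live (bona fide), 0 = spoof."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    live, spoof = scores[labels == 1], scores[labels == 0]
    apcer = float(np.mean(spoof >= thr))   # spoof accepted as live
    bpcer = float(np.mean(live < thr))     # live rejected as spoof
    return apcer, bpcer, (apcer + bpcer) / 2

def eer_threshold(scores, labels, n_grid=1001):
    """Threshold where APCER and BPCER are closest (EER) on the dev set."""
    grid = np.linspace(np.min(scores), np.max(scores), n_grid)
    gaps = [abs(apcer_bpcer_acer(scores, labels, t)[0]
                - apcer_bpcer_acer(scores, labels, t)[1]) for t in grid]
    return float(grid[int(np.argmin(gaps))])

# Usage: fix the threshold on the development set, then freeze it for testing.
# thr = eer_threshold(dev_scores, dev_labels)
# apcer, bpcer, acer = apcer_bpcer_acer(test_scores, test_labels, thr)
```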

4. Benchmark Baselines and Comparative Performance

Extensive benchmarking on the WFAS test set quantifies the generalizability of prevalent FAS architectures. The table below summarizes performance under both protocols, with ACER as the principal metric:

Model             Protocol 1 (Known-Type) ACER (%)    Protocol 2 (Unknown-Type) ACER (%)
ResNet-50         7.71                                27.43 (APCER: 47.10, BPCER: 7.76)
PatchNet          8.53                                32.40
MaxViT (ViT)      6.58                                29.12
CDCN++ variants   8.74–11.50                          30.11–30.72
DC-CDN/DCN        10.61–19.40                         33.28–42.17
LGSC              8.50                                26.47

Classification-only methods (ResNet-50, MaxViT) exhibit the strongest performance under Protocol 1, surpassing pixel-wise auxiliary and generative approaches. All methods suffer substantial ACER degradation when facing unseen 3D PAs (Protocol 2), though LGSC shows comparative resilience. This underscores that zero-shot out-of-distribution (OOD) generalization remains the prevailing challenge in FAS.

5. Challenge Results and State-of-the-Art Innovations

WFAS was the foundation for the Wild Face Anti-Spoofing Challenge hosted at the CVPR 2023 Workshop, attracting 219 competitive teams. The top-3 solutions in Protocol 1 achieved the following results:

Team           ACER (%)   APCER (%)   BPCER (%)
ChinaTelecom   1.601      1.296       1.906
Meituan        2.221      –           –
NetEase        2.554      –           –

Principal innovations among the top finalists centered on robust representation learning: findings indicate that self-supervised pretraining and Transformer-based encoders significantly enhance anti-spoofing accuracy under unconstrained conditions (Wang et al., 2023).

6. Research Implications and Directions

Insights derived from the WFAS initiative and challenge motivate several research directions:

  • Reformulation of Generative Pixel-wise Supervision: Most extant methods focus on suppressing diverse spoof cues but overlook the compactness of live-face cues in attacks; future approaches may benefit from explicitly modeling and minimizing the dispersion of live cues in spoof distributions.
  • Self-Supervised and Transformer-Based Pretraining: Success on WFAS demonstrates the value of large, unlabeled or pseudo-labeled FAS-specific corpora for improving model robustness.
  • Domain-Adaptive and Zero-Shot Defenses: Substantial generalization gaps under Protocol 2 point to future work in meta-learning, domain randomization, adversarial adaptation, and improved style–content disentanglement for OOD PAD.
  • Interpretability and Explainable FAS: Integration of deep generative modeling with human-interpretable metrics can facilitate more transparent and actionable defense strategies.
  • Continuous Dataset Expansion: WFAS establishes a precedent for ongoing wild data collection; sustaining security requires contributions of new spoof modalities (e.g., deepfakes, animated masks) and frequent scenario diversification.

A plausible implication is that continued dataset enrichment and methodological innovation are necessary to counteract evolving PA vectors and sensor proliferation in the wild (Wang et al., 2023).

References

Wang, D., et al. (2023). Wild Face Anti-Spoofing Challenge 2023: Benchmark and Results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops.