
VeilGen: Biometric & Glare Synthesis Frameworks

Updated 28 November 2025
  • VeilGen is a dual-framework approach encompassing peri-ocular biometric recognition with deep-feature analysis and physics-informed veiling glare synthesis and removal.
  • The first framework employs a VGG-19-based feature extractor with PCA reduction and diverse classifiers, achieving near-perfect identification, gender, and age recognition under occlusion, along with eye-smile expression recognition.
  • The second framework integrates a Stable Diffusion backbone with novel modules to simulate and invert optical degradations, outperforming existing deblurring and dehazing techniques.

VeilGen denotes two distinct, technically unrelated frameworks in the computer vision literature: (1) a deep-feature-based identification system for recognizing veiled individuals on images restricted to the peri-ocular region (Hassanat et al., 2021), and (2) a physics-informed generative model for veiling glare synthesis and removal in lens-degraded imagery (Qian et al., 21 Nov 2025). Each system is foundational in its respective subfield, and both adopt the name "VeilGen" to emphasize the inference or simulation of salient information in the presence of partial occlusion or optical degradation.

1. Peri-Ocular Recognition for Veiled Persons

1.1 Dataset and Problem Definition

VeilGen (Hassanat et al., 2021) addresses biometric recognition in scenarios where only the peri-ocular region is visible due to full-face veiling (e.g., Niqab). The task suite comprises identity (150-way), gender (2-way), age (4-way: Children <18, Youth 19–30, Adults 31–50, Elderly ≥51), and “eye-smile” expression (2-way) recognition. Experiments utilize the VPI-New dataset:

| Subjects | Images per Subject | Total Images | Sessions | Gender Distribution | Age Range |
|---|---|---|---|---|---|
| 150 | 14 | 2100 | 2 × 7 | 41M / 109F | 8–78 |

Acquisition was performed with a 13 MP smartphone camera under uncontrolled (office) indoor conditions at distances of 30–50 cm, with minor pose variation and veils in black or white. All annotations are encoded in the filename, including session, ID, gender, age, image index, and expression.

1.2 Deep Feature Extraction Pipeline

VeilGen applies the unmodified VGG-19 network (pretrained on ImageNet) as a fixed feature extractor. Each image undergoes:

  • Conversion to RGB (if grayscale), and resizing to 224 × 224.
  • Extraction of the 4096-dimensional activations from fully connected layers FC6 and FC7.
  • Optional coordinate-wise merging of FC6 and FC7 (min, max, mean).
  • Application of PCA for dimensionality reduction, retaining $\alpha \in \{0.99, 0.97, 0.95\}$ of the variance and yielding typical feature dimensionalities of $m \approx 446$ (99%), $208$ (97%), and $137$ (95%).
  • Vector normalization centered on the training fold mean.
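
A minimal sketch of this feature-extraction pipeline, assuming PyTorch/torchvision and scikit-learn; the preprocessing constants, the choice of classifier submodules for FC6/FC7, and the helper name `fc6_fc7` are illustrative assumptions, not details from the original implementation:

```python
# Illustrative sketch only: fixed VGG-19 FC6/FC7 extraction with PCA reduction.
import torch
from torchvision import models, transforms
from sklearn.decomposition import PCA
from PIL import Image

vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),            # resize to the VGG input size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def fc6_fc7(image_path):
    """Return the 4096-d FC6 and FC7 activations for one image."""
    img = Image.open(image_path).convert("RGB")       # grayscale -> RGB
    x = preprocess(img).unsqueeze(0)                  # 1 x 3 x 224 x 224
    with torch.no_grad():
        flat = torch.flatten(vgg.avgpool(vgg.features(x)), 1)
        fc6 = vgg.classifier[0](flat)                 # first FC layer (FC6)
        fc7 = vgg.classifier[3](torch.relu(fc6))      # second FC layer (FC7)
    return fc6.squeeze(0).numpy(), fc7.squeeze(0).numpy()

# PCA retaining 95% of the variance (alpha = 0.95), fitted on training data only;
# X_train is an (n_samples, 4096) array of FC6 (or merged FC6/FC7) vectors.
# pca = PCA(n_components=0.95).fit(X_train)
# X_reduced = pca.transform(X_train)
```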

1.3 Classification Methodology

Classification leverages WEKA with 10-fold cross-validation stratified by class and no dedicated test partition. The evaluated models include k-Nearest Neighbors (k = 1, 3, 5), Random Forest (100 trees), Naïve Bayes (Gaussian), BayesNet (heuristic Bayesian structure), and a single-layer feedforward Neural Network (cross-entropy softmax).

Classification is performed on the PCA-reduced feature vectors for each task. No regularization or augmentation is introduced post-feature extraction.
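
The original experiments use WEKA; the following scikit-learn analogue only illustrates the evaluation protocol. Hyperparameters beyond those stated above (e.g., the ANN hidden-layer size) are assumptions, and WEKA's BayesNet has no direct scikit-learn equivalent and is omitted:

```python
# Rough scikit-learn analogue of the stratified 10-fold evaluation protocol.
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier

classifiers = {
    "1NN": KNeighborsClassifier(n_neighbors=1),
    "3NN": KNeighborsClassifier(n_neighbors=3),
    "5NN": KNeighborsClassifier(n_neighbors=5),
    "RandomForest": RandomForestClassifier(n_estimators=100),
    "NaiveBayes": GaussianNB(),
    "ANN": MLPClassifier(hidden_layer_sizes=(128,), max_iter=500),  # size assumed
}

def evaluate(X, y):
    """X: PCA-reduced feature vectors; y: labels for one task (ID, gender, age, ...)."""
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    for name, clf in classifiers.items():
        scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
        print(f"{name}: {scores.mean():.4f} +/- {scores.std():.4f}")
```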

1.4 Performance Results

VeilGen establishes state-of-the-art results for fully veiled identity, gender, age, and expression recognition:

| Task | Best Configuration | Accuracy / F1 / AUC | Notable Results |
|---|---|---|---|
| Identification | ANN on FC6 PCA95% | 99.95% | ±0.05% std |
| Gender | 3NN on FC6 PCA97% | 99.91% | Sens. 99.94%, Spec. 99.82% |
| Age | 1NN on FC6 PCA95% | 100.00% | Perfect confusion matrix |
| Eye-smile | ANN on FC6 PCA97% | 80.0% (F1 ≈ 0.76) | ROC area ≈ 0.77 |

VeilGen outperforms prior hand-crafted feature approaches on the VPI dataset for identity (99.95% vs. 97.22%) and gender (99.9% vs. 99.41%).

1.5 Limitations and Extensions

Current limitations of VeilGen include constrained acquisition conditions (indoor, controlled background, near-frontal), lack of network fine-tuning (feature extractor is not updated), and relatively modest expression recognition performance. The framework does not incorporate “in-the-wild” data, and future directions include large-scale veiled-face corpus collection, more robust architectures (e.g., ResNet or ArcFace), and multi-task learning to optimize all recognition axes jointly.

2. Generative Modeling of Veiling Glare in Compact Optics

2.1 Physical Model and Task Definition

VeilGen (Qian et al., 21 Nov 2025) targets image degradation in compact lenses—particularly veiling glare arising from stray-light scattering in non-ideal optics—which leads to spatially varying, depth-independent image degradation. The challenge is twofold: classical dehazing models are ill-suited due to non-depth-dependent scattering, and high-quality paired data for supervised restoration are typically unavailable.

The underlying image formation model is:

$$I_{de}^p = (I_c^p \otimes K^p)\,T^p + G^p$$

for each patch $p$ and color channel, where $K^p$ is the local PSF, $T^p \in [0,1]$ the local transmission (contrast loss), $G^p$ the additive glare, and $I_c^p$ the clean reference patch.
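
A minimal NumPy/SciPy sketch of this per-patch formation model; the patch size, PSF, and scalar transmission/glare values below are illustrative assumptions:

```python
# Apply I_de^p = (I_c^p * K^p) T^p + G^p to a single patch.
import numpy as np
from scipy.signal import fftconvolve

def degrade_patch(clean_patch, psf, transmission, glare):
    """clean_patch: (H, W) channel patch; psf: (k, k) kernel summing to 1;
    transmission: scalar or (H, W) map in [0, 1]; glare: scalar or (H, W) map."""
    blurred = fftconvolve(clean_patch, psf, mode="same")  # I_c^p convolved with K^p
    return blurred * transmission + glare                 # contrast loss + additive glare

patch = np.random.rand(64, 64)          # stand-in clean patch
psf = np.ones((7, 7)) / 49.0            # uniform blur kernel (assumption)
degraded = degrade_patch(patch, psf, transmission=0.7, glare=0.2)
```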

2.2 VeilGen Architecture

VeilGen leverages a Stable Diffusion (SD-v2.1) backbone with several novel modules:

  • IRControlNet: Guides the main U-Net-based denoiser with aberration conditioning.
  • Latent Optical Transmission and Glare Map Predictor (LOTGMP): Infers latent per-pixel transmission ($z_{trans}$) and glare ($z_{glare}$) maps by processing both noisy latents and target-domain encodings, using a shallow, time-embedded convolutional network.
  • Veiling Glare Imposition Module (VGIM): Physically imposes $z_{trans}$ (scaling) and $z_{glare}$ (addition) at multiple skip levels of the U-Net, forming a differentiable approximation of physical scattering.

Sampling is performed via a DDPM-style process: at each diffusion step, the denoiser blends predictions conditioned on clean (aberration-only) and compound (aberration + glare) maps, controlled by a fixed mixture coefficient $w = 0.85$.
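
A hedged sketch of this mixed-condition sampling step; `denoiser` and the condition arguments are placeholders, and the assignment of $w$ to the compound-conditioned branch is an assumption rather than a detail taken from the paper:

```python
# Blend two conditional noise predictions at each diffusion step.
W_MIX = 0.85  # fixed mixture coefficient from the description above

def blended_denoise_step(denoiser, z_t, t, cond_aberration, cond_compound):
    """Return one blended noise prediction for the current latent z_t."""
    eps_clean = denoiser(z_t, t, cond=cond_aberration)  # aberration-only guidance
    eps_glare = denoiser(z_t, t, cond=cond_compound)    # aberration + glare guidance
    return W_MIX * eps_glare + (1.0 - W_MIX) * eps_clean
```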

2.3 Unsupervised Physics-Informed Training

VeilGen distinguishes two training domains:

  • Source ($\mathcal{S}$): Paired data with only aberration, conditioned on neutral transmission/glare maps.
  • Target ($\mathcal{T}$): Unpaired data with compound degradation; LOTGMP infers the physical maps.

The total generation loss is

$$\mathcal{L}_{gen} = p\,\mathcal{L}_\mathcal{S} + (1-p)\,\mathcal{L}_\mathcal{T}$$

with $p = 0.3$. Each component is an $\ell_2$ loss in the denoiser's latent feature space, with Stable Diffusion's intrinsic priors regularizing the result. This hybrid approach enables unsupervised learning of physically realistic degradation statistics from real-world compact-lens data.
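
A minimal sketch of how this mixed objective could be assembled, assuming each per-domain term is a standard $\ell_2$ (MSE) loss on denoiser predictions in latent space; tensor names are placeholders:

```python
# L_gen = p * L_S + (1 - p) * L_T with p = 0.3.
import torch.nn.functional as F

P_SOURCE = 0.3

def generation_loss(pred_src, target_src, pred_tgt, target_tgt):
    loss_source = F.mse_loss(pred_src, target_src)  # paired, aberration-only domain S
    loss_target = F.mse_loss(pred_tgt, target_tgt)  # unpaired, compound-degradation domain T
    return P_SOURCE * loss_source + (1.0 - P_SOURCE) * loss_target
```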

2.4 Restoration via DeVeiler Network

The companion DeVeiler restoration network includes:

  • Encoder–Decoder U-Net structure, with a bottleneck of SwinIR RSTB layers for long-range spatial modeling.
  • Veiling Glare Compensation Module (VGCM): Performs the inverse transform of VGIM during feature decoding, leveraging the predicted ($\hat{z}_{trans}$, $\hat{z}_{glare}$) maps to demodulate and restore features.
  • Distilled Degradation Network (DDN): A shallow CNN approximating the forward VeilGen scattering model, included for the reversibility (physics) constraint loss.

The restoration loss combines an $\ell_1$ image term, a perceptual LPIPS term, and a reversibility term penalizing mismatch between the DDN-applied degradation of the restored image and the original degraded observation.
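
A hedged sketch of this composite objective, assuming the `lpips` package and an $\ell_1$ form for the reversibility term; the LPIPS weight is an assumption, while $\lambda_{rev} = 1.0$ follows the implementation details in Section 2.6:

```python
# Restoration loss: l1 image term + perceptual LPIPS term + reversibility term.
import torch.nn.functional as F
import lpips  # pip install lpips

lpips_fn = lpips.LPIPS(net="vgg")  # expects inputs scaled to [-1, 1]
LAMBDA_REV = 1.0                   # reversibility weight (Section 2.6)

def restoration_loss(restored, clean, degraded, ddn, lambda_lpips=0.5):
    l1_term = F.l1_loss(restored, clean)
    lpips_term = lpips_fn(restored, clean).mean()
    # Reversibility (physics) constraint: re-degrading the restored image via the
    # distilled degradation network should reproduce the degraded observation.
    rev_term = F.l1_loss(ddn(restored), degraded)
    return l1_term + lambda_lpips * lpips_term + LAMBDA_REV * rev_term
```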

2.5 Experimental Evaluation

VeilGen and DeVeiler are evaluated on data from large-aperture Single Lens (SL) and Metasurface-Refractive Lens (MRL) systems:

| Method | PSNR (dB, Screen-SL) | SSIM | LPIPS |
|---|---|---|---|
| SwinIR (aberration) | 18.18 | 0.686 | 0.298 |
| SwinIR + DiffDehaze | 19.31 | 0.642 | 0.347 |
| QDMR (domain adaptation) | 18.45 | 0.681 | 0.291 |
| DeVeiler (VeilGen) | 22.38 | 0.729 | 0.261 |

No-reference metrics (Realworld-SL): CLIPIQA=0.607, Q-Align=3.987, NIQE=4.448 (all best among tested baselines). Qualitative analysis confirms contrast recovery, color saturation restoration, and fine texture preservation.

Ablation studies show the necessity of both LOTGMP and the SD prior, with performance drops of ≈0.74 dB PSNR on their removal. Traditional CycleGAN- and haze-based generation perform markedly worse for paired-data synthesis (+0.74 dB PSNR for VeilGen).

2.6 Network Implementation and Mathematical Details

Key architectural properties:

  • LOTGMP: Two 3×3 convolutional layers (ReLU), time embedding MLP, two output heads (transmission, glare).
  • VGIM/VGCM: At each skip level, features are modulated as $F' = F \times z_{trans} + z_{glare}$ (VGIM) and demodulated as $F'' = (F' - z_{glare}) / z_{trans}$ (VGCM).
  • DDN: Five-layer CNN for direct pixelwise forward degradation.
  • Optimization: AdamW or Adam optimizers, progressive learning-rate decay, batch sizes 8–16; $\lambda_{rev} = 1.0$ in the restoration loss.
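
A minimal sketch of the VGIM/VGCM pair at one skip level, following the formulas above; resizing the latent maps to the feature resolution and the epsilon guard against division by zero are assumptions:

```python
# VGIM imposes F' = F * z_trans + z_glare; VGCM inverts it during decoding.
def vgim(features, z_trans, z_glare):
    """Modulate skip features with latent transmission/glare maps (forward)."""
    return features * z_trans + z_glare

def vgcm(features, z_trans, z_glare, eps=1e-6):
    """Demodulate features using the predicted maps (inverse of VGIM)."""
    return (features - z_glare) / (z_trans + eps)
```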

3. Applications and Results

  • Peri-Ocular Biometrics: VeilGen (Hassanat et al., 2021) advances peri-ocular recognition under extreme occlusion, demonstrating near-perfect performance in identity, gender, and age classification within the studied protocol.
  • Optical Deblurring and Dehazing: VeilGen (Qian et al., 21 Nov 2025) provides the first generative framework capable of simulating and inverting realistic compound degradations (optical aberration plus veiling glare) in compact lens systems, supporting both dataset generation and interpretability.

4. Comparative Analysis and Advantages

Compared to prior work:

  • Biometrics: VeilGen substantially exceeds hand-crafted feature systems for person identification and matches or surpasses gender and age baselines, while being uniquely evaluated on simultaneous expression recognition.
  • Veiling Glare Synthesis: CycleGAN and conventional haze-based data generation yield inferior synthetic pairs and degraded restoration when compared via full-reference and no-reference image quality metrics.
  • Restoration Quality: The bidirectional guidance of intermediate feature modulation by transmission/glare maps uniquely enables DeVeiler to outperform both blind and naive cascaded baselines.

5. Limitations and Future Directions

5.1 VeilGen Biometrics

  • Current data are “in the laboratory”—collected indoors, under controlled settings and near-frontal poses. This suggests diminished generalization to unconstrained “in-the-wild” cases with harsher lighting, more significant occlusion, or resolution variation.
  • The feature extractor is off-the-shelf and unfinetuned; adaptation or end-to-end training (e.g., via siamese or metric learning) could enhance robustness.
  • Eye-smile recognition is well below the other tasks (≈80% accuracy), potentially remedied by domain-specific peri-ocular expression networks or augmentation.

5.2 VeilGen Glare Synthesis

  • The current framework is validated primarily on large-aperture single-lens and metasurface-refractive-lens systems, using controlled screen captures and real-world scenes.
  • Data synthesis via VeilGen is dependent on the distribution of target degraded images for LOTGMP learning; collection of broader, more diverse compound degraded samples may further improve generalization.
  • Extension to additional physical degradations (beyond transmission and additive glare) and to end-to-end task-specific inference (e.g., jointly with downstream recognition) is proposed.

6. Broader Context and Significance

  • VeilGen (Hassanat et al., 2021) establishes a reproducible, feature-centric pipeline for challenging veiled biometrics, providing a baseline for future “privacy-by-occlusion” and partial-face recognition research.
  • VeilGen (Qian et al., 21 Nov 2025) introduces a physically grounded, diffusion-based paradigm for simulating and inverting optical degradations, bridging generative modeling with physical interpretability. The explicit modeling of latent transmission and glare maps, regularized by strong diffusion priors, establishes a blueprint for future vision systems targeting physically realistic compound image degradation and restoration.

Both frameworks are open-sourced with code and datasets to facilitate further research in their respective domains.
