LightIrisNet: Mobile Iris Segmentation
- LightIrisNet is a lightweight, MobileNetV3-based multi-task segmentation architecture optimized for real-time iris segmentation on commodity smartphones using visible light imaging.
- It integrates atrous spatial pyramid pooling and multi-task output heads to concurrently predict iris masks, pupil masks, boundaries, and geometric parameters even under challenging conditions.
- It achieves sub-1% EER on ISO/IEC-compliant datasets with on-device processing capabilities, ensuring robust and efficient biometric recognition in mobile environments.
LightIrisNet is a lightweight, MobileNetV3-based multi-task segmentation architecture developed to perform real-time and high-accuracy iris segmentation on commodity smartphones in the visible spectrum (VIS), enabling standardized, practical mobile iris recognition under strict image quality protocols. Designed to address the unique challenges posed by VIS imaging—including pigmentation variability, blur, glare, occlusion, and off-axis gaze—LightIrisNet is a core component in end-to-end pipelines that meet ISO/IEC 29794-6 standards and achieve sub-1% Equal Error Rate (EER) on new high-quality datasets.
1. Architectural Fundamentals and Design Choices
LightIrisNet adopts the MobileNetV3-Large backbone to minimize computational cost and memory footprint while retaining strong feature representations suited to mobile vision tasks. Key structural features include (a structural sketch follows this list):
- Encoder: MobileNetV3-Large (<10M parameters), leveraging depthwise separable convolutions, squeeze-and-excitation blocks, and hard-swish activation.
- Atrous Spatial Pyramid Pooling (ASPP): Multi-scale context module to enhance segmentation of fine iris/pupil boundaries, especially in low-contrast or reflective regions.
- Decoder/Heads: Up-convolutional layers, skip connections, and multi-task output branches arranged in a DeepLabv3+ style, enabling simultaneous prediction of:
- Iris mask
- Pupil mask
- Boundary map
- Signed distance transform (SDT)
- Normalized ellipse parameters
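A minimal PyTorch sketch of this layout is given below, assuming torchvision's MobileNetV3-Large as the encoder; the channel widths, dilation rates, and head shapes are illustrative assumptions rather than the released configuration, and the skip connections are omitted for brevity:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class ASPP(nn.Module):
    """Atrous Spatial Pyramid Pooling with assumed dilation rates."""
    def __init__(self, in_ch: int, out_ch: int, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False)
            for r in rates
        ])
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

class LightIrisNetSketch(nn.Module):
    """Illustrative multi-task head layout; not the released model."""
    def __init__(self):
        super().__init__()
        # MobileNetV3-Large feature extractor (960 output channels).
        self.encoder = torchvision.models.mobilenet_v3_large(weights=None).features
        self.aspp = ASPP(960, 128)
        self.iris_head = nn.Conv2d(128, 1, 1)      # iris mask logits
        self.pupil_head = nn.Conv2d(128, 1, 1)     # pupil mask logits
        self.boundary_head = nn.Conv2d(128, 1, 1)  # boundary map logits
        self.sdt_head = nn.Conv2d(128, 1, 1)       # signed distance transform
        # Normalized ellipse parameters (cx, cy, a, b, theta) per structure.
        self.ellipse_head = nn.Linear(128, 10)     # 5 params x {iris, pupil}

    def forward(self, x):
        h, w = x.shape[-2:]
        f = self.aspp(self.encoder(x))
        up = lambda t: F.interpolate(t, size=(h, w), mode="bilinear",
                                     align_corners=False)
        pooled = f.mean(dim=(2, 3))  # global pooling for the geometric head
        return {
            "iris": up(self.iris_head(f)),
            "pupil": up(self.pupil_head(f)),
            "boundary": up(self.boundary_head(f)),
            "sdt": up(self.sdt_head(f)),
            "ellipses": self.ellipse_head(pooled),
        }
```

Sharing one ASPP-refined feature map across all heads keeps the multi-task overhead negligible relative to a single-task segmenter of the same backbone.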
Data augmentation simulates real-world VIS degradations: random spatial crops, lighting perturbations, geometric jitter, and limbus erosion. Ellipse supervision regularizes geometric consistency and compensates for ambiguous boundary conditions.
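A sketch of such an augmentation stack using torchvision transforms; the parameter ranges are assumptions, and limbus erosion would need a custom mask-aware transform, so it is only indicated:

```python
import torchvision.transforms as T

# Illustrative VIS-degradation augmentations; in segmentation training the
# geometric transforms must be applied identically to the image and its masks.
vis_augment = T.Compose([
    T.RandomResizedCrop(320, scale=(0.8, 1.0)),          # random spatial crops
    T.ColorJitter(brightness=0.4, contrast=0.4),         # lighting perturbations
    T.RandomAffine(degrees=10, translate=(0.05, 0.05)),  # geometric jitter
    # Limbus erosion: a custom transform that occludes a band around the
    # predicted limbic boundary (not reproduced here).
])
```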
2. Training Strategy and Multi-Task Loss Formulation
The optimization protocol integrates auxiliary losses across all output branches, combined in a weighted sum (sketched after this list):
- Region (mask) losses: Binary cross-entropy and Tversky loss; latter addresses class imbalance between iris/pupil and background.
- Boundary loss/SDT: Penalizes mask roughness, enforces sharp edges, and ensures topological correctness.
- Ellipse loss: L2 penalty for deviation from true iris/pupil ellipses.
- Adaptive weighting: Empirically balanced for stable convergence and optimal trade-off between region fidelity and geometric regularity.
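A compact PyTorch sketch of this weighted objective, reusing the output keys from the architecture sketch above; the Tversky parameters and the loss weights are placeholders standing in for the paper's empirically balanced values:

```python
import torch
import torch.nn.functional as F

def tversky_loss(logits, target, alpha=0.7, beta=0.3, eps=1e-6):
    """Tversky loss; alpha/beta trade off false positives vs. false
    negatives to counter class imbalance (values assumed)."""
    p = torch.sigmoid(logits)
    tp = (p * target).sum()
    fp = (p * (1 - target)).sum()
    fn = ((1 - p) * target).sum()
    return 1 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)

def multitask_loss(out, gt, w=(1.0, 0.5, 0.5, 0.1)):
    """Weighted sum over the output branches; `out` and `gt` are dicts
    keyed as in the model sketch, `w` is a placeholder weighting."""
    region = sum(
        F.binary_cross_entropy_with_logits(out[k], gt[k]) + tversky_loss(out[k], gt[k])
        for k in ("iris", "pupil")
    )
    boundary = F.binary_cross_entropy_with_logits(out["boundary"], gt["boundary"])
    sdt = F.l1_loss(out["sdt"], gt["sdt"])              # SDT regression term
    ellipse = F.mse_loss(out["ellipses"], gt["ellipses"])  # L2 ellipse penalty
    return w[0] * region + w[1] * boundary + w[2] * sdt + w[3] * ellipse
```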
Training uses the AdamW optimizer with a cosine-decayed learning rate, gradient clipping, a five-epoch warmup, and frozen batch-norm layers for encoder stability. Mixed-precision backpropagation further accelerates convergence.
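The sketch below wires these pieces together in PyTorch; since the text does not give the initial learning rate or epoch budget, those values are assumed placeholders, and the model/loss names refer to the earlier sketches:

```python
import math
import torch

model = LightIrisNetSketch()               # from the architecture sketch above
for m in model.encoder.modules():          # freeze encoder batch norm
    if isinstance(m, torch.nn.BatchNorm2d):
        m.eval()                           # re-apply after any model.train()
        m.requires_grad_(False)

# lr and epochs are illustrative assumptions, not the paper's settings.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)
epochs, warmup = 100, 5

def lr_scale(epoch: int) -> float:
    """Linear warmup for five epochs, then cosine decay to zero."""
    if epoch < warmup:
        return (epoch + 1) / warmup
    t = (epoch - warmup) / max(1, epochs - warmup)
    return 0.5 * (1 + math.cos(math.pi * t))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_scale)  # step per epoch
scaler = torch.cuda.amp.GradScaler()       # mixed-precision training

# Per batch inside the training loop:
#   with torch.cuda.amp.autocast():
#       loss = multitask_loss(model(images), targets)
#   scaler.scale(loss).backward()
#   scaler.unscale_(optimizer)
#   torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
#   scaler.step(optimizer); scaler.update(); optimizer.zero_grad()
```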
3. Dataset Protocol and ISO-Compliant Acquisition
LightIrisNet is trained and validated on 17,120 images sourced from UBIRIS.v1/v2, MICHE, and CUVIRIS, the latter a new dataset acquired via a real-time, ISO/IEC 29794-6-compliant Android application employing autofocus, a sharpness metric (variance of Laplacian), eye framing, and automated feedback. CUVIRIS contains 752 images from 47 subjects with diverse pigmentation, captured under controlled indoor lighting. Hardware validation covers both the Samsung Galaxy S21 Ultra and the Google Pixel 6.
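The variance-of-Laplacian sharpness check is cheap enough to run on every preview frame during capture; a minimal OpenCV sketch, where the acceptance threshold is an illustrative assumption rather than the app's configured value:

```python
import cv2

def sharpness_score(bgr_image) -> float:
    """Variance-of-Laplacian focus measure: higher means sharper."""
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    return float(cv2.Laplacian(gray, cv2.CV_64F).var())

def frame_is_sharp(bgr_image, threshold: float = 100.0) -> bool:
    # Hypothetical capture gate; threshold is an assumed placeholder.
    return sharpness_score(bgr_image) >= threshold
```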
Data Split
The training/validation/testing split is by subject (80% train, 10% validation, 10% test) to ensure no identity leakage between partitions. Each sample is annotated with ground-truth masks, geometry priors, and SDTs.
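A minimal sketch of such a subject-disjoint split; the (subject_id, path) sample format is an assumption for illustration:

```python
import random

def subject_disjoint_split(samples, train=0.8, val=0.1, seed=0):
    """Split by subject ID so no identity appears in two partitions.
    `samples` is a list of (subject_id, path) pairs (assumed format)."""
    subjects = sorted({s for s, _ in samples})
    random.Random(seed).shuffle(subjects)
    n = len(subjects)
    cut1, cut2 = int(train * n), int((train + val) * n)
    bucket = {s: ("train" if i < cut1 else "val" if i < cut2 else "test")
              for i, s in enumerate(subjects)}
    out = {"train": [], "val": [], "test": []}
    for s, path in samples:
        out[bucket[s]].append(path)
    return out
```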
4. Segmentation Performance and Biometric Evaluation Metrics
LightIrisNet achieves state-of-the-art segmentation quality across all evaluation partitions (the Dice metric is sketched after this list):
- Iris mask Dice: 0.954 (CUVIRIS), mean ~0.94 across all datasets
- Pupil mask Dice: 0.937 (CUVIRIS), mean ~0.92
- Boundary error: minimal, matching or outperforming dense segmentation competitors
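For reference, Dice is the overlap score 2|A∩B| / (|A| + |B|) between predicted and ground-truth masks; a one-function NumPy sketch:

```python
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-6) -> float:
    """Dice coefficient over two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    return float((2.0 * np.logical_and(pred, gt).sum() + eps)
                 / (pred.sum() + gt.sum() + eps))
```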
For biometric matching, the pipeline outputs normalized iris strips using Daugman's rubber-sheet transformation (sketched after the matcher list), with occluded regions masked out. Verification is performed using two matchers:
- OSIRIS: Log-Gabor feature encoding, circular Hamming distance
- IrisFormer: Transformer architecture on patch embeddings, robust to translation, rotation, and occlusion
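A NumPy sketch of rubber-sheet normalization; circular pupil/iris boundaries are assumed here to keep the sketch short, whereas the full pipeline would use the predicted ellipse parameters:

```python
import numpy as np

def rubber_sheet(img, pupil_xy, pupil_r, iris_xy, iris_r,
                 n_radial=64, n_angular=256):
    """Daugman rubber-sheet mapping: sample the annulus between the pupil
    and iris boundaries onto a fixed-size polar strip."""
    thetas = np.linspace(0, 2 * np.pi, n_angular, endpoint=False)
    rs = np.linspace(0, 1, n_radial)
    strip = np.zeros((n_radial, n_angular), dtype=img.dtype)
    for j, t in enumerate(thetas):
        # Boundary points on the pupil and iris circles along ray theta.
        px = pupil_xy[0] + pupil_r * np.cos(t)
        py = pupil_xy[1] + pupil_r * np.sin(t)
        ix = iris_xy[0] + iris_r * np.cos(t)
        iy = iris_xy[1] + iris_r * np.sin(t)
        for i, r in enumerate(rs):
            # Linear interpolation between the two boundaries.
            x = int(round((1 - r) * px + r * ix))
            y = int(round((1 - r) * py + r * iy))
            if 0 <= y < img.shape[0] and 0 <= x < img.shape[1]:
                strip[i, j] = img[y, x]
    return strip
```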
Recognition metrics (an EER computation sketch follows this list):
- CUVIRIS EER (OSIRIS): 0.76%
- CUVIRIS EER (IrisFormer): 0.057%
- True Accept Rate at FAR=0.01: 97.9% (OSIRIS)
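EER is the operating point where the false accept and false reject rates are equal. A NumPy sketch over genuine/impostor score arrays, assuming similarity scores where higher means more similar (for Hamming distances, as with OSIRIS, negate the scores first):

```python
import numpy as np

def eer(genuine: np.ndarray, impostor: np.ndarray) -> float:
    """Equal Error Rate: threshold sweep where FRR crosses FAR."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    frr = np.array([(genuine < t).mean() for t in thresholds])
    far = np.array([(impostor >= t).mean() for t in thresholds])
    idx = np.argmin(np.abs(frr - far))  # closest crossing point
    return float((frr[idx] + far[idx]) / 2)
```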
Comparisons with prior methods on UBIRIS and MICHE (DeepIrisNet2 and CNN baselines) demonstrate marked EER improvements when LightIrisNet segmentation is supplied to the matchers, especially transformer-based ones.
5. Real-Time On-Device Execution and Practical Deployment
LightIrisNet is engineered for full on-device processing on contemporary Android hardware (a TFLite invocation sketch follows this list):
- Model size: <10M parameters
- Inference time: ~25 ms per frame on Samsung Galaxy S21 Ultra (real-time operation)
- Pipeline stages: eye detection (YOLOv3-Tiny, TFLite), segmentation (LightIrisNet), normalization, feature extraction, matching—all feasible on mobile CPUs and NPUs
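On Android the segmentation stage would typically run through the TFLite runtime; a minimal Python sketch of that invocation, where the model filename and float32 input are assumptions (a quantized deployment would use integer tensors):

```python
import numpy as np
import tensorflow as tf

# Hypothetical exported model path; the released artifact may differ.
interpreter = tf.lite.Interpreter(model_path="lightirisnet.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
outs = interpreter.get_output_details()

frame = np.zeros(inp["shape"], dtype=np.float32)  # stand-in camera frame
interpreter.set_tensor(inp["index"], frame)
interpreter.invoke()
masks = [interpreter.get_tensor(o["index"]) for o in outs]  # per-head outputs
```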
Integration with the acquisition app enforces ISO/IEC 29794-6 compliance during capture. The entire pipeline sustains robust throughput (~8 FPS for detection and quality feedback, <0.2 s per captured sample), supporting user-facing biometric systems on smartphones.
6. Comparison with Prior Lightweight and Heavyweight CNN Segmenters
In direct benchmarking against baseline architectures (VGG-16, DenseNet, DeepIrisNet2, and handcrafted methods), LightIrisNet demonstrates superior segmentation quality (Dice, IoU) and enables the lowest EERs when paired with advanced feature extractors:
| Dataset | Method | EER (%) |
|---|---|---|
| UBIRIS.v2 | SCNN | 5.6 |
| UBIRIS.v2 | IrisFormer | 5.1 |
| MICHE-I | CNN | 5–7 |
| CUVIRIS | OSIRIS | 0.76 |
| CUVIRIS | IrisFormer | 0.057 |
This suggests that high-quality segmentation from lightweight networks can dramatically close the gap with transformer or handcrafted pipelines, provided acquisition protocols are standardized and geometric priors enforced.
7. Implications, Open Resources, and Standards Compliance
LightIrisNet establishes a reproducible reference point for VIS iris recognition under rigorously controlled acquisition and open-source protocols. By balancing segmentation acuity, computational efficiency, and standardization, it enables practical deployments in field, personal, and embedded biometrics. The architecture, trained weights, acquisition app, and CUVIRIS dataset subset are released for benchmarking and further research.
A plausible implication is that integrating geometric supervision (ellipse, SDT), multi-task learning, and real-time MobileNetV3 backbones under ISO/IEC protocols will continue to improve both recognition accuracy and deployability in unconstrained mobile environments. The LightIrisNet architecture may serve as a blueprint for future standardized, efficient biometric segmentation systems.