BREA-Depth: Anatomical Monocular Depth Estimation
- BREA-Depth is a deep learning framework for monocular depth estimation in bronchoscopic imagery that integrates airway anatomical priors to enhance 3D airway reconstruction.
- It leverages a dual encoder–decoder architecture with CycleGAN-based domain adaptation and an airway structure awareness loss to enforce depth consistency and anatomical realism.
- Evaluations on synthetic and real datasets show improved structural preservation and performance metrics like LocalAccu and DepthCon compared to baseline models.
BREA-Depth is a deep learning framework for monocular depth estimation in bronchoscopic imagery that explicitly integrates airway-specific anatomical priors to improve structural accuracy in 3D airway reconstruction. It targets two deficiencies of standard depth foundation models: their lack of anatomical awareness and their tendency to overfit local textures. To address them, BREA-Depth augments an unsupervised domain adaptation pipeline with airway geometry priors, cycle-consistent translation, and an airway structure-aware loss that enforces depth consistency within airway lumens (Zhang et al., 15 Sep 2025).
1. Model Architecture and Data Flow
BREA-Depth extends unpaired CycleGAN adaptation to leverage domain knowledge in bronchoscopic settings. The framework consists of dual encoder–decoder pathways for translation between synthetic and real domains:
- Synthetic domain: Pairs of synthetic bronchoscopic images and perfect depth maps rendered from a Blender-based 3D airway model incorporating anatomical branching parameters.
- Real domain: Real bronchoscopic images, labeled with pseudo-depth from a prior depth foundation model (DepthAnything).
The architecture uses two U-Net Transformer–like encoder–decoder branches:
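- Syn2Real: maps synthetic image and depth pairs into the real domain
- Real2Syn: maps real image and pseudo-depth pairs into the synthetic domain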
A shared PatchGAN discriminator enforces consistency and realism on both the RGB and depth channel outputs.
At inference time, a real frame can be passed through Real2Syn for denoising and synthetic-style depth, or through Syn2Real for a refined depth adapted to real bronchoscopic appearance. Outputs are depth (or disparity) maps for downstream back-projection and 3D reconstruction.
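The back-projection step mentioned above can be illustrated with a minimal sketch. It assumes an ideal pinhole camera with hypothetical intrinsics (`fx`, `fy`, `cx`, `cy`) and a depth map already in consistent units. Real bronchoscopes typically have wide-angle distortion, and predicted depth may be defined only up to scale, so this is an illustration of the geometry rather than the pipeline used by the authors.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Lift an (H, W) depth map to an (H*W, 3) point cloud in camera coordinates."""
    h, w = depth.shape
    # Pixel grids: u indexes columns (image x), v indexes rows (image y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# Toy example: a flat surface 10 units in front of the camera.
# The intrinsics below are hypothetical values chosen for illustration.
depth = np.full((4, 6), 10.0)
points = backproject_depth(depth, fx=100.0, fy=100.0, cx=2.5, cy=1.5)
```

Each row of `points` is one 3D point; stacking such clouds across frames would additionally require the camera pose for every frame.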
2. Loss Functions and Mathematical Formulation
BREA-Depth optimizes a composite objective that combines adversarial, cycle-consistency, identity, and airway structure losses:
- Adversarial Loss ($\mathcal{L}_{\text{adv}}$): Applied to both the synthetic-to-real and real-to-synthetic branches; the discriminator separates true from generated joint (RGB, depth) tuples.
- Cycle-Consistency Loss ($\mathcal{L}_{\text{cyc}}$): Enforces bidirectional translation consistency of both color and depth information.
- Identity Loss ($\mathcal{L}_{\text{id}}$): Penalizes unnecessary changes when reconstructing within-domain samples.
- Airway Structure Awareness Loss ($\mathcal{L}_{\text{air}}$): Imposes anatomical priors by enforcing that predicted depth inside the airway lumen exceeds (is deeper than) that outside, encouraging correct gradients at airway bifurcations and preserving lumen continuity.
The full objective is a weighted sum of these terms, $\mathcal{L} = \lambda_{\text{adv}}\mathcal{L}_{\text{adv}} + \lambda_{\text{cyc}}\mathcal{L}_{\text{cyc}} + \lambda_{\text{id}}\mathcal{L}_{\text{id}} + \lambda_{\text{air}}\mathcal{L}_{\text{air}}$, where the scalar weights $\lambda$ are standard hyperparameters.
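As a rough illustration of how such a composite objective is assembled, the sketch below computes cycle-consistency and identity terms for two toy generators and combines them with placeholder adversarial and airway terms. The L1 norm is the customary CycleGAN choice and is an assumption here, as are the function names and every weight value; none of them are taken from the source.

```python
import numpy as np

def l1(a, b):
    """Mean absolute difference between two arrays."""
    return float(np.mean(np.abs(a - b)))

def cycle_loss(x_syn, x_real, syn2real, real2syn):
    # Translate to the other domain and back, then compare with the input.
    return (l1(real2syn(syn2real(x_syn)), x_syn)
            + l1(syn2real(real2syn(x_real)), x_real))

def identity_loss(x_syn, x_real, syn2real, real2syn):
    # A generator fed a sample already in its target domain should leave it unchanged.
    return l1(syn2real(x_real), x_real) + l1(real2syn(x_syn), x_syn)

def total_loss(terms, weights):
    """Weighted sum of named loss terms."""
    return sum(weights[name] * value for name, value in terms.items())

# Toy stand-ins for the two generators: exact inverses of each other.
syn2real = lambda x: x + 0.1
real2syn = lambda x: x - 0.1

rng = np.random.default_rng(0)
x_syn = rng.random((4, 8, 8))   # RGB and depth stacked as 4 channels
x_real = rng.random((4, 8, 8))

terms = {
    "adv": 0.5,  # placeholder; a real adversarial term needs a discriminator
    "cyc": cycle_loss(x_syn, x_real, syn2real, real2syn),
    "id": identity_loss(x_syn, x_real, syn2real, real2syn),
    "air": 0.0,  # placeholder for the airway structure term
}
weights = {"adv": 1.0, "cyc": 10.0, "id": 5.0, "air": 1.0}  # hypothetical values
loss = total_loss(terms, weights)
```

Because the toy generators invert each other exactly, the cycle term vanishes while the identity term does not, which shows that the two terms constrain different behavior.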
3. Airway Structure Awareness Loss and Anatomical Realism
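Airway structure awareness loss is computed using a binary mask $M$ (obtained by simple grayscale thresholding of the input image) that localizes the airway lumen. The loss enforces, via a hinge loss, that the mean depth prediction inside the lumen ($\bar{d}_{\text{in}}$) is significantly larger than the mean outside it ($\bar{d}_{\text{out}}$). A form consistent with this description is $\mathcal{L}_{\text{air}} = \max\bigl(0,\; m - (\bar{d}_{\text{in}} - \bar{d}_{\text{out}})\bigr)$, where
$\bar{d}_{\text{in}} = \frac{\sum_p M_p \hat{d}_p}{\sum_p M_p}$ and $\bar{d}_{\text{out}} = \frac{\sum_p (1 - M_p)\,\hat{d}_p}{\sum_p (1 - M_p)}$, with $\hat{d}_p$ the predicted depth at pixel $p$ and $m \ge 0$ an assumed margin.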
This loss directly encodes the anatomical expectation that the airway lumen should be the deepest visible region, promoting smooth depth gradients across bifurcations and robustifying output to ambiguous cues or low illumination.
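A minimal NumPy sketch of this idea follows. It assumes that the lumen appears as the darkest image region, that the mask is obtained with a hypothetical fixed threshold, and that the hinge uses a hypothetical margin; the helper names, the threshold, and the margin are all illustrative choices. A training version would use differentiable tensor operations in place of NumPy.

```python
import numpy as np

def lumen_mask(gray, threshold=0.2):
    """Binary lumen mask: pixels darker than `threshold` (gray scaled to [0, 1])."""
    return gray < threshold

def airway_structure_loss(depth, mask, margin=0.1):
    """Hinge penalty that is zero once mean lumen depth exceeds mean wall depth by `margin`."""
    if not mask.any() or mask.all():
        return 0.0  # no lumen or no wall visible, so there is nothing to compare
    mean_in = depth[mask].mean()
    mean_out = depth[~mask].mean()
    return float(max(0.0, margin - (mean_in - mean_out)))

# Toy 4x4 frame: dark 2x2 lumen in the centre, bright walls around it.
gray = np.full((4, 4), 0.8)
gray[1:3, 1:3] = 0.05
mask = lumen_mask(gray)

good_depth = np.where(mask, 1.0, 0.3)   # lumen deeper than walls
flat_depth = np.full((4, 4), 0.3)       # no depth contrast at all
bad_depth = np.where(mask, 0.2, 0.5)    # lumen predicted closer than walls

loss_good = airway_structure_loss(good_depth, mask)
loss_flat = airway_structure_loss(flat_depth, mask)
loss_bad = airway_structure_loss(bad_depth, mask)
```

The penalty grows as the predicted ordering between lumen and wall depth degrades, and it vanishes once the lumen is deeper than the walls by at least the margin.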
4. Evaluation Metrics: Quantifying Structural Consistency
BREA-Depth introduces two airway-specific metrics for structural evaluation:
- Lowest Depth Localization Accuracy (LocalAccu): Fraction of pixels with minimum predicted depth lying within the ground-truth airway lumen mask.
- Depth Contrast Consistency (DepthCon): Z-score of separation between mean depth inside the lumen and outside it, standardized by outside-lumen depth variance; large negative values indicate anatomically plausible predictions.
These metrics complement traditional depth measures (AbsRel, RMSE, etc.) by quantifying preservation of airway geometry.
| Metric | Definition / Use |
|---|---|
| LocalAccu | Min depth pixels' overlap with ground-truth lumen |
| DepthCon | Z-scored depth contrast (lumen vs. non-lumen) |
These evaluations demonstrate BREA-Depth's anatomical fidelity, with improvements over foundation models or ablations (removal of CycleGAN or of the airway structure awareness loss) reflected in both airway-specific and classical metrics.
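The two airway-specific metrics can be sketched per frame as follows. The sketch follows the wording of the definitions above, in which the lumen is expected to carry the lowest predicted values (as with inverse-depth style outputs) and a strongly negative z-score is the desirable outcome. The `fraction` parameter, the use of the standard deviation as the normalizer, and the function names are assumptions; how per-frame scores are aggregated into a dataset-level figure is not specified here.

```python
import numpy as np

def local_accu(pred, lumen_mask, fraction=0.01):
    """Share of the lowest-valued `fraction` of predicted pixels that lie inside the lumen mask."""
    k = max(1, int(round(fraction * pred.size)))
    lowest = np.argsort(pred, axis=None)[:k]  # flat indices of the k smallest predictions
    return float(lumen_mask.ravel()[lowest].mean())

def depth_con(pred, lumen_mask):
    """Z-score of the mean lumen value relative to the non-lumen distribution."""
    inside, outside = pred[lumen_mask], pred[~lumen_mask]
    return float((inside.mean() - outside.mean()) / (outside.std() + 1e-8))

# Toy 4x4 prediction: the central 2x2 lumen holds the lowest values.
pred = np.array([
    [0.8, 1.0, 0.8, 1.0],
    [1.0, 0.1, 0.1, 0.8],
    [0.8, 0.1, 0.1, 1.0],
    [1.0, 0.8, 1.0, 0.8],
])
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True

accu = local_accu(pred, mask, fraction=0.25)  # the 4 lowest pixels
z = depth_con(pred, mask)
```

In this toy frame all four lowest-valued pixels fall inside the lumen, and the lumen mean sits roughly eight standard deviations below the wall mean.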
5. Datasets, Training, and Implementation
BREA-Depth employs multiple sources:
- Synthetic: 9,500 image+depth pairs from anatomically parameterized Blender simulations.
- Real Bronchoscopy: 55,000 commercial bronchoscope frames (pseudo-depth from DepthAnything); ex-vivo annotated dataset with 3,437 manually labeled frames.
- Phantom Benchmark: 16 video sequences (39,599 frames) from a bronchial phantom with CT-derived ground truth.
Training is implemented in PyTorch on an RTX 3080 GPU, with batch size 2 and the Adam optimizer, for 30 epochs. Data preprocessing standardizes spatial resolution and applies randomized augmentation. Model and data resources are publicly released to support replication.
6. Comparative Performance and Results
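Quantitative analysis on ex-vivo and phantom datasets shows that BREA-Depth achieves the highest LocalAccu and δ(1.25) and the lowest AbsRel and RMSE among the compared methods, while its DepthCon is far above that of DepthAnything and slightly below that of 3cGAN: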
| Method | DepthCon (↑) | LocalAccu (↑) | AbsRel (↓) | RMSE (↓) | δ(1.25) (↑) |
|---|---|---|---|---|---|
| BREA-Depth | 97.27% | 62.36% | 0.23 | 12.26 | 70.64% |
| DepthAnything | 70.55% | 45.64% | 0.24 | 12.98 | 64.25% |
| 3cGAN | 99.27% | 57.00% | 0.33 | 15.67 | 57.84% |
| Ablation (no CycleGAN) | 68.36% | 25.36% | — | — | — |
Qualitative results illustrate that BREA-Depth retains lumen demarcation and depth continuity across complex bifurcating regions. Classic metrics underreport this structural improvement because the ground truth is sparse.
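The effect of ground-truth sparsity on the classic metrics can be seen in a short sketch of their usual definitions, evaluated only where ground truth exists. The `valid` mask convention (ground truth greater than zero) and the function name are assumptions, and any scale-alignment step applied before evaluation is omitted because the text does not describe one.

```python
import numpy as np

def depth_metrics(pred, gt, valid):
    """AbsRel, RMSE and the delta < 1.25 accuracy over pixels that have ground truth."""
    p, g = pred[valid], gt[valid]
    abs_rel = np.mean(np.abs(p - g) / g)
    rmse = np.sqrt(np.mean((p - g) ** 2))
    ratio = np.maximum(p / g, g / p)  # symmetric ratio, always >= 1 for positive depths
    delta1 = np.mean(ratio < 1.25)
    return {"AbsRel": float(abs_rel), "RMSE": float(rmse), "delta_1.25": float(delta1)}

# Toy example: one pixel has no ground truth (encoded as 0) and is ignored.
gt = np.array([[10.0, 20.0], [0.0, 40.0]])
pred = np.array([[11.0, 20.0], [5.0, 60.0]])
valid = gt > 0

metrics = depth_metrics(pred, gt, valid)
```

The prediction at the pixel without ground truth has no influence on any of the three numbers, so structural errors in uncovered regions go unmeasured by these metrics.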
7. Extensions and Future Directions
- Acquisition of in vivo–grade or higher-fidelity ground truth for more robust validation.
- Integration with visual odometry for simultaneous localization and mapping (SLAM) in bronchoscopic navigation.
- Augmentation with multi-task heads for airway landmark detection to enhance structural constraints.
- Exploration of diffusion or transformer-only generative models for increased stability and realism.
- Incorporation of model uncertainty quantification to signal unreliable region predictions during clinical usage.
BREA-Depth represents a domain-adapted, anatomically aware approach for monocular depth estimation in bronchoscopy, coupling generative adaptation with explicit preservation of global airway structure (Zhang et al., 15 Sep 2025).