DCO-YOLO Pipeline Detection Framework
- The paper introduces DCO-YOLO, an enhanced YOLOv11-based framework integrating DySample, CGLU, and OutlookAttention to improve detection of small-scale features in GPR images.
- It leverages a novel 3D-DIoU matching algorithm to fuse B-, C-, and D-scan detections, enabling coherent 3D reconstructions of underground pipelines in urban settings.
- Empirical results show significant improvements in precision, recall, and mAP metrics, while achieving efficient inference speeds (~75 FPS on RTX 3060) for practical deployment.
The DCO-YOLO framework is a lightweight, deep-learning-based object detection system designed for robust recognition and spatial localization of underground pipelines from multi-view ground-penetrating radar (GPR) images. Centered on the YOLOv11 architecture, DCO-YOLO introduces three tightly integrated modules—DySample, CGLU, and OutlookAttention—that collectively enhance feature extraction for small-scale and complex targets characteristic of subsurface pipeline environments. The framework further incorporates a 3D-DIoU spatial matching algorithm for effective fusion of B-scan, C-scan, and D-scan detections into coherent 3D pipeline reconstructions, significantly improving detection accuracy and robustness in urban pipeline recognition contexts (Lv et al., 24 Dec 2025).
1. Framework Architecture and Component Integration
DCO-YOLO extends YOLOv11 (input resolution 640×640×3) by injecting three primary modules at backbone, neck, and head stages:
- DySample replaces all traditional upsampling layers, enabling learnable, dynamic resampling grids.
- CGLU (Channel Gating Linear Unit) replaces the SE-attention branch in C2PSA blocks, providing dynamic channel-wise feature modulation.
- OutlookAttention is embedded within selected C3k2 modules (C3k2_OA=True), introducing localized attention over sliding windows for spatially sensitive context aggregation.
The high-level dataflow follows: Input → Conv Stem/Backbone (C3k2+OutlookAttention, SPPF) → C2PSA_CGLU → DySample Upsample → Neck → Detection Head (CIoU and classification losses).
A representative detection path sequence: ... → SPPF → C2PSA_CGLU → DySample → C3k2_OA(True) → detect → ...
This architectural reconfiguration systematically reinforces the extraction and synthesis of edge and texture cues that are crucial for detecting small-scale pipeline signatures in GPR imagery (Lv et al., 24 Dec 2025).
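As a quick orientation, the sketch below summarizes where the three modules sit in a YOLOv11-style layout as a plain Python dictionary; the stage and block names are illustrative shorthand for the blocks listed above, not the authors' configuration file.

```python
# Hypothetical stage summary of the DCO-YOLO layout described above.
# Block names are shorthand: "C3k2_OA" marks C3k2 blocks with OutlookAttention
# enabled, "C2PSA_CGLU" marks C2PSA blocks whose SE branch is replaced by CGLU.
DCO_YOLO_STAGES = {
    "backbone": ["ConvStem", "C3k2", "C3k2_OA", "SPPF"],
    "neck": ["C2PSA_CGLU", "DySample", "C3k2_OA"],
    "head": ["Detect"],  # CIoU + classification losses
}
```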
2. Module-Specific Mathematical Formulation
2.1 DySample
DySample introduces dynamic, data-dependent upsampling grids. For an input feature map $X \in \mathbb{R}^{C \times H \times W}$ and upsampling factor $s$, per-pixel sampling offsets are predicted from the feature map itself:

$$O = 0.5\,\sigma(W_1 X) \odot (W_2 X),$$

where $W_1, W_2$ are $1 \times 1$ convolutions, $\sigma$ is the sigmoid function, and $\odot$ denotes element-wise multiplication. The offsets are added to the regular interpolation grid $G$ to form the sampling set

$$S = G + O.$$

An upsampled feature $X' \in \mathbb{R}^{C \times sH \times sW}$ is then obtained by bilinear sampling at the shifted positions:

$$X' = \mathrm{grid\_sample}(X,\, S).$$
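A minimal PyTorch sketch of this formulation is shown below. The class name `DySampleSketch`, the use of two 1×1 convolutions for offset and scope, and the coordinate normalization are illustrative assumptions rather than the paper's implementation; only the offset-plus-grid resampling structure follows the equations above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DySampleSketch(nn.Module):
    """Sketch of content-aware dynamic upsampling: predict per-pixel offsets O,
    add them to a regular sampling grid G, and read the input out with
    grid_sample at the shifted positions S = G + O."""

    def __init__(self, channels: int, scale: int = 2):
        super().__init__()
        self.scale = scale
        # two point-wise convolutions: raw offsets and a sigmoid-gated scope
        self.offset = nn.Conv2d(channels, 2 * scale * scale, kernel_size=1)
        self.scope = nn.Conv2d(channels, 2 * scale * scale, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        s = self.scale
        # O = 0.5 * sigma(W1 X) ⊙ (W2 X), rearranged to the target resolution
        o = 0.5 * torch.sigmoid(self.scope(x)) * self.offset(x)   # (B, 2*s*s, H, W)
        o = F.pixel_shuffle(o, s).permute(0, 2, 3, 1)             # (B, sH, sW, 2)
        o = o / o.new_tensor([w, h])                              # pixel -> normalized units
        # regular grid G in normalized [-1, 1] coordinates at the target resolution
        ys = torch.linspace(-1, 1, s * h, device=x.device, dtype=x.dtype)
        xs = torch.linspace(-1, 1, s * w, device=x.device, dtype=x.dtype)
        gy, gx = torch.meshgrid(ys, xs, indexing="ij")
        grid = torch.stack((gx, gy), dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
        # X' = grid_sample(X, S) with S = G + O
        return F.grid_sample(x, grid + o, mode="bilinear", align_corners=True)
```

For example, `DySampleSketch(256)(torch.randn(1, 256, 40, 40))` returns a `(1, 256, 80, 80)` tensor with the default scale of 2.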
2.2 CGLU (Channel Gating Linear Unit)
CGLU processes an input feature map $X$ via parallel value and gate branches:
- Value: $V = W_v X$
- Gate: $G = \sigma(W_g X)$

The final output applies element-wise channel gating followed by a projection:

$$Y = W_o\,(V \odot G),$$

where $W_v$, $W_g$, and $W_o$ are learnable projections and $\sigma$ is the sigmoid activation.
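A compact PyTorch sketch of this gating pattern follows; the 1×1 convolution parameterization and the hidden width are assumptions, and the sketch does not reproduce how CGLU is wired into the C2PSA block.

```python
import torch
import torch.nn as nn

class CGLUSketch(nn.Module):
    """Sketch of a channel-gated linear unit: a value branch modulated
    element-wise by a sigmoid gate, followed by an output projection."""

    def __init__(self, channels: int, hidden=None):
        super().__init__()
        hidden = hidden or channels
        self.value = nn.Conv2d(channels, hidden, kernel_size=1)   # V = W_v X
        self.gate = nn.Conv2d(channels, hidden, kernel_size=1)    # pre-activation gate W_g X
        self.proj = nn.Conv2d(hidden, channels, kernel_size=1)    # output projection W_o

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        v = self.value(x)
        g = torch.sigmoid(self.gate(x))   # G = sigma(W_g X)
        return self.proj(v * g)           # Y = W_o (V ⊙ G)
```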
2.3 OutlookAttention
Project the input $X$ to values and attention scores:

$$V = X W_V, \qquad A = X W_A.$$

For each sliding window of size $K \times K$ centered at $(i, j)$, the scores are reshaped into $A_{i,j} \in \mathbb{R}^{K^2 \times K^2}$, normalized by a softmax, and used to re-weight the values within the window:

$$Y_{i,j} = \sum_{(m,n)\,\in\,\mathcal{N}_K(i,j)} \mathrm{Softmax}(A_{i,j})_{(m,n)}\, V_{m,n},$$

with overlapping window outputs summed (folded) back onto the spatial map.
This operator aggregates spatial information effectively over local windows, targeting geometric ambiguities in radar reflections (Lv et al., 24 Dec 2025).
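The single-head PyTorch sketch below illustrates this sliding-window aggregation with `unfold`/`fold`; multi-head splitting, attention-map pooling, and the exact placement inside C3k2 are omitted, and all class and parameter names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OutlookAttentionSketch(nn.Module):
    """Single-head sketch of outlook attention over K x K sliding windows:
    per-pixel attention scores are predicted directly from the feature map
    and used to re-weight the values inside each local window."""

    def __init__(self, channels: int, k: int = 3):
        super().__init__()
        self.k = k
        self.v_proj = nn.Conv2d(channels, channels, kernel_size=1)      # value projection W_V
        self.a_proj = nn.Conv2d(channels, k * k * k * k, kernel_size=1) # K^2 x K^2 scores per pixel (W_A)
        self.out = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        k = self.k
        v = self.v_proj(x)
        # gather the K*K values around every pixel: (B, C, K*K, H*W)
        v_unf = F.unfold(v, kernel_size=k, padding=k // 2).view(b, c, k * k, h * w)
        # per-pixel attention over each window, softmax across the K*K input slots
        a = self.a_proj(x).view(b, k * k, k * k, h * w).softmax(dim=2)
        # weighted aggregation of values for every output slot in the window
        y = torch.einsum("bcmn,bkmn->bckn", v_unf, a)
        # fold overlapping windows back onto the spatial map (summing overlaps)
        y = F.fold(y.reshape(b, c * k * k, h * w), output_size=(h, w),
                   kernel_size=k, padding=k // 2)
        return self.out(y)
```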
3. 3D-DIoU Matching and Multi-View Fusion
DCO-YOLO applies detection independently to each GPR view (B-scan, C-scan, D-scan), followed by 3D spatial matching via an extended DIoU metric:
$$\mathrm{DIoU}_{3D}(A, B) = \mathrm{IoU}_{3D}(A, B) - \frac{\rho^2(c_A, c_B)}{d^2},$$

where $\rho(c_A, c_B)$ is the Euclidean distance between box centers and $d$ is the diagonal of their smallest enclosing cuboid.
Candidate boxes across views are fused into a 3D pipeline if the B-scan box agrees with both other views:

$$\mathrm{DIoU}_{3D}(B_{3D}, C_{3D}) > 0.4 \quad \text{and} \quad \mathrm{DIoU}_{3D}(B_{3D}, D_{3D}) > 0.4.$$
This multi-view strategy, integrated with geometric constraints and center distance penalties, resolves false splits and ambiguous localizations inherent in single-view GPR analysis (Lv et al., 24 Dec 2025).
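A minimal NumPy sketch of the metric for axis-aligned cuboids encoded as (x1, y1, z1, x2, y2, z2) is given below; the function name, box encoding, and epsilon terms are illustrative assumptions.

```python
import numpy as np

def diou_3d(box_a: np.ndarray, box_b: np.ndarray) -> float:
    """3D-DIoU between two axis-aligned cuboids (x1, y1, z1, x2, y2, z2):
    volumetric IoU minus a center-distance penalty normalized by the squared
    diagonal of the smallest enclosing cuboid."""
    # intersection and union volumes
    lo = np.maximum(box_a[:3], box_b[:3])
    hi = np.minimum(box_a[3:], box_b[3:])
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    vol_a = np.prod(box_a[3:] - box_a[:3])
    vol_b = np.prod(box_b[3:] - box_b[:3])
    iou = inter / (vol_a + vol_b - inter + 1e-9)
    # squared Euclidean distance between box centers (rho^2)
    ca = (box_a[:3] + box_a[3:]) / 2.0
    cb = (box_b[:3] + box_b[3:]) / 2.0
    rho2 = np.sum((ca - cb) ** 2)
    # squared diagonal of the smallest enclosing cuboid (d^2)
    enc_lo = np.minimum(box_a[:3], box_b[:3])
    enc_hi = np.maximum(box_a[3:], box_b[3:])
    d2 = np.sum((enc_hi - enc_lo) ** 2) + 1e-9
    return float(iou - rho2 / d2)
```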
4. Training Regimen and Loss Design
Detection employs YOLOv11’s standard heads with:
- Localization: CIoU-based regression loss $\mathcal{L}_{\mathrm{CIoU}}$
- Classification: $\mathcal{L}_{\mathrm{cls}}$ (binary cross-entropy or focal)

Total detection loss:

$$\mathcal{L}_{\mathrm{det}} = \lambda_{\mathrm{box}}\,\mathcal{L}_{\mathrm{CIoU}} + \lambda_{\mathrm{cls}}\,\mathcal{L}_{\mathrm{cls}}.$$
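As an illustration, the sketch below assembles such a combined loss from torchvision's CIoU loss and binary cross-entropy over already-matched prediction/target pairs; the λ weights and the target-assignment logic are placeholders, since the source does not report them.

```python
import torch
import torch.nn.functional as F
from torchvision.ops import complete_box_iou_loss

def detection_loss(pred_boxes, tgt_boxes, pred_logits, tgt_labels,
                   lambda_box: float = 1.0, lambda_cls: float = 1.0) -> torch.Tensor:
    """Sketch of the total detection loss: CIoU regression plus binary
    cross-entropy classification, combined with placeholder weights."""
    # boxes in (x1, y1, x2, y2) format, one row per matched prediction/target pair
    l_box = complete_box_iou_loss(pred_boxes, tgt_boxes, reduction="mean")
    # pred_logits and tgt_labels share the same shape (per-class binary targets)
    l_cls = F.binary_cross_entropy_with_logits(pred_logits, tgt_labels)
    return lambda_box * l_box + lambda_cls * l_cls
```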
Training uses the Adam optimizer, an initial learning rate of 0.01, and batch size 32, for up to 140 epochs with early stopping (the final configuration uses 120 epochs). Data augmentation specifics are not detailed in the source, and no explicit attention regularization term is applied (Lv et al., 24 Dec 2025).
5. Empirical Evaluation and Ablation
Comprehensive ablation assesses the incremental impact of each module:
| Model Variant | Precision (%) | Recall (%) | mAP@0.5 (%) | mAP@0.5–0.95 (%) |
|---|---|---|---|---|
| YOLOv11 baseline | 94.2 | 91.2 | 95.8 | 67.7 |
| + DySample | 94.7 (+0.5) | 91.8 (+0.6) | 96.1 (+0.3) | 68.4 (+0.7) |
| + DySample + CGLU | 95.3 (+1.1) | 92.3 (+1.1) | 96.3 (+0.5) | 68.8 (+1.1) |
| + DySample + OutlookAttention | 95.7 (+1.5) | 91.9 (+0.7) | 96.2 (+0.4) | 69.5 (+1.8) |
| DCO-YOLO (all modules) | 96.2 (+2.0) | 93.3 (+2.1) | 96.7 (+0.9) | 71.1 (+3.5) |
In complex urban scenarios, DCO-YOLO achieves 96.2% precision, 93.3% recall, and 96.7% mAP@0.5, outperforming the baseline by 2.0, 2.1, and 0.9 percentage points, respectively. The ablation demonstrates a synergistic gain when all modules are combined, and Grad-CAM++ visualizations indicate improved geometric focus (Lv et al., 24 Dec 2025).
6. Inference Strategy, Efficiency, and Application Scope
Pipeline reconstruction follows a deterministic multi-view post-processing procedure:
```
Input: triplet of images (I_B, I_C, I_D)
1. Preprocess (ISDFT, background removal, low-pass, gain)
2. Run DCO-YOLO detection on each view:
   D_B = detect(I_B); D_C = detect(I_C); D_D = detect(I_D)
3. For each candidate box triplet (b, c, d): lift to 3D (B_3D, C_3D, D_3D)
   - Compute diou_BC = DIoU_3D(B_3D, C_3D) and diou_BD = DIoU_3D(B_3D, D_3D)
   - If both > 0.4: fuse, classify, and estimate burial depth
4. Output fused pipelines with type and depth
```
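A small Python sketch of the consensus test in step 3 is shown below, reusing the `diou_3d` function sketched in Section 3; the box lifting, the averaging-based merge, and the data structures are illustrative assumptions, while the dual DIoU_3D > 0.4 criterion follows the listing above.

```python
import numpy as np

def fuse_views(boxes_b, boxes_c, boxes_d, thr: float = 0.4):
    """Fuse 3D-lifted detections from the B-, C-, and D-scan views.
    A B-scan box is kept only if it finds partners in both other views
    whose 3D-DIoU with it exceeds the threshold (0.4 in the source)."""
    fused = []
    for b3d in boxes_b:
        for c3d in boxes_c:
            if diou_3d(b3d, c3d) <= thr:
                continue
            for d3d in boxes_d:
                if diou_3d(b3d, d3d) > thr:
                    # merge the three consistent cuboids (simple average as a placeholder)
                    fused.append(np.mean(np.stack([b3d, c3d, d3d]), axis=0))
    return fused
```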
Empirical results indicate efficient inference (~75 FPS on an RTX 3060) and a substantial reduction in the data volume to be processed (5.6% of the raw 3D data). The framework is well suited to urban infrastructure mapping, with its three-view fusion mitigating the ambiguities of individual GPR scans (Lv et al., 24 Dec 2025).
7. Context, Significance, and Limitations
DCO-YOLO embodies a physically-grounded, data-efficient strategy for intelligent pipeline mapping, integrating deep neural feature enhancements with domain-specific measurement constraints. The cross-dimensional mechanisms (DySample, CGLU, OutlookAttention) are directly validated on real-world GPR acquisitions, providing empirical evidence for improved edge and geometry extraction essential to pipeline localization. The reliance on multi-view consensus and 3D spatial association distinguishes DCO-YOLO from canonical 2D object detectors in subsurface sensing applications.
Detailed information beyond module integration (e.g., augmentation methods, exact loss weights, or codebase specifics) is not provided in the source (Lv et al., 24 Dec 2025). The approach assumes availability of quality B/C/D-scan GPR imagery and may require retraining or tuning for generalized geophysical settings.