DRPCA-Net: Make Robust PCA Great Again for Infrared Small Target Detection (2507.09541v1)

Published 13 Jul 2025 in cs.CV

Abstract: Infrared small target detection plays a vital role in remote sensing, industrial monitoring, and various civilian applications. Despite recent progress powered by deep learning, many end-to-end convolutional models tend to pursue performance by stacking increasingly complex architectures, often at the expense of interpretability, parameter efficiency, and generalization. These models typically overlook the intrinsic sparsity prior of infrared small targets--an essential cue that can be explicitly modeled for both performance and efficiency gains. To address this, we revisit the model-based paradigm of Robust Principal Component Analysis (RPCA) and propose Dynamic RPCA Network (DRPCA-Net), a novel deep unfolding network that integrates the sparsity-aware prior into a learnable architecture. Unlike conventional deep unfolding methods that rely on static, globally learned parameters, DRPCA-Net introduces a dynamic unfolding mechanism via a lightweight hypernetwork. This design enables the model to adaptively generate iteration-wise parameters conditioned on the input scene, thereby enhancing its robustness and generalization across diverse backgrounds. Furthermore, we design a Dynamic Residual Group (DRG) module to better capture contextual variations within the background, leading to more accurate low-rank estimation and improved separation of small targets. Extensive experiments on multiple public infrared datasets demonstrate that DRPCA-Net significantly outperforms existing state-of-the-art methods in detection accuracy. Code is available at https://github.com/GrokCV/DRPCA-Net.

Summary

The paper introduces DRPCA-Net, a novel network that unfolds the RPCA optimization process into a staged deep learning model for infrared target detection.
It integrates dynamic parameter generation and spatial attention to enhance interpretability and adaptability to varying infrared scenes.
Experimental evaluations on benchmark datasets demonstrate superior detection accuracy with notable improvements in mIoU and F1 scores.

DRPCA-Net: Make Robust PCA Great Again for Infrared Small Target Detection

The paper "DRPCA-Net: Make Robust PCA Great Again for Infrared Small Target Detection" (2507.09541) proposes a novel network architecture, DRPCA-Net, which adapts the traditional Robust Principal Component Analysis (RPCA) method for the infrared small target detection domain. By integrating model-based optimization with deep learning techniques, DRPCA-Net offers enhanced interpretability alongside superior detection performance.

Deep Unfolding Architecture

Dynamic RPCA Network (DRPCA-Net) Overview

DRPCA-Net unfolds the RPCA optimization process into a layer-wise neural network, preserving RPCA's interpretability while adding learnable components for adaptability. The network comprises $K$ stages, each containing three modules, the Latent Background Encoder Module (LBEM), the Dynamic Target Extraction Module (DTEM), and the Dynamic Image Reconstruction Module (DIRM), along with a Parameter Generator.
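For context, the classical RPCA model that deep unfolding starts from can be written in its standard convex form. The symbols $D$ (input image), $B$ (low-rank background), $T$ (sparse target component), and $\lambda$ (trade-off weight) are chosen here for illustration and may not match the paper's notation:

$$
\min_{B,\,T} \; \|B\|_* + \lambda \|T\|_1 \quad \text{subject to} \quad D = B + T
$$

Here $\|B\|_*$ is the nuclear norm (the sum of singular values), which encourages a low-rank background, and $\|T\|_1$ is the entrywise $\ell_1$ norm, which encourages sparse targets. In an unfolded network, each of the $K$ stages plays the role of one iteration of a solver for this problem.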

Latent Background Encoder Module (LBEM)

The LBEM estimates the low-rank background by applying a corrective mapping to the input residual, thereby eliminating the need for computationally intensive matrix operations often involved in RPCA, such as singular value decomposition (SVD).
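To make the cost that LBEM avoids concrete, the sketch below shows the singular value thresholding step that classical RPCA solvers typically use for the low-rank update. It is a generic NumPy illustration, not code from the paper, and the function name, matrix sizes, and threshold `tau` are assumptions chosen for the example.

```python
import numpy as np


def singular_value_threshold(matrix: np.ndarray, tau: float) -> np.ndarray:
    """Proximal operator of the nuclear norm: shrink every singular value by tau."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # small singular values become exactly zero
    return (u * s_shrunk) @ vt           # rebuild the matrix from the shrunk spectrum


rng = np.random.default_rng(0)
# Synthetic rank-2 "background" plus weak dense noise (illustrative data only).
background = rng.standard_normal((32, 2)) @ rng.standard_normal((2, 32))
noisy = background + 0.01 * rng.standard_normal((32, 32))

low_rank = singular_value_threshold(noisy, tau=0.5)

# Count singular values that survive the shrinkage.
effective_rank = int(np.sum(np.linalg.svd(low_rank, compute_uv=False) > 1e-8))
print("effective rank after thresholding:", effective_rank)
```

A full SVD is needed at every iteration in this classical scheme, which is the per-iteration cost that a learned mapping sidesteps.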

Dynamic Target Extraction Module (DTEM)

DTEM replaces the fixed parameters of the target-refinement step with dynamically generated ones. A lightweight hypernetwork generates stage-specific parameters conditioned on the input, so the model adapts its behavior to the characteristics of each scene, which improves robustness and generalization.
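The idea of an input-conditioned parameter can be illustrated with the soft-thresholding operator that classical RPCA uses for its sparse update. In the minimal sketch below, `tiny_hypernetwork` is a hypothetical stand-in (two global image statistics passed through one linear map and a softplus), and the weights, bias, and toy image are assumptions; the paper's actual hypernetwork and the parameters it generates may differ.

```python
import numpy as np


def soft_threshold(x: np.ndarray, theta: float) -> np.ndarray:
    """Proximal operator of the l1 norm: shrink magnitudes by theta, zeroing small values."""
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)


def tiny_hypernetwork(image: np.ndarray, weights: np.ndarray, bias: float) -> float:
    """Hypothetical stand-in: map global image statistics to a positive threshold."""
    stats = np.array([image.mean(), image.std()])
    return float(np.log1p(np.exp(stats @ weights + bias)))  # softplus keeps theta > 0


def extract_targets(image, background, weights, bias):
    """Threshold the residual with a value predicted from the input scene."""
    theta = tiny_hypernetwork(image, weights, bias)
    return soft_threshold(image - background, theta), theta


# Toy scene: flat background with a single bright pixel acting as the target.
image = np.full((16, 16), 0.2)
image[8, 8] = 1.0
background = np.full((16, 16), 0.2)

# Illustrative (untrained) hypernetwork parameters.
weights = np.array([0.5, 2.0])
bias = -2.0

targets, theta = extract_targets(image, background, weights, bias)
print("predicted threshold:", theta)
print("non-zero target pixels:", np.count_nonzero(targets))
```

With a static unfolding scheme, `theta` would be a single learned constant per stage; here it changes whenever the input statistics change.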

Dynamic Image Reconstruction Module (DIRM)

DIRM goes beyond simple component aggregation by using the proposed Dynamic Residual Group (DRG) module, which integrates residual learning with dynamic spatial attention to handle spatially varying background patterns.
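As a rough illustration of combining a residual connection with input-dependent spatial attention, the toy block below gates a per-pixel correction with an attention map computed from the input itself. Everything in it is an assumption made for the sketch: the 3x3 mean filter stands in for learned convolutions, and the `gain` parameter and function names are hypothetical. It is not the DRG module described in the paper.

```python
import numpy as np


def box_filter3(x: np.ndarray) -> np.ndarray:
    """3x3 mean filter with edge padding, standing in for a learned convolution."""
    padded = np.pad(x, 1, mode="edge")
    h, w = x.shape
    return sum(padded[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0


def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))


def residual_attention_block(x: np.ndarray, gain: float = 4.0) -> np.ndarray:
    """Toy residual block whose correction is gated by a per-pixel attention map."""
    features = box_filter3(x)                                  # placeholder feature extractor
    attention = sigmoid(gain * (features - features.mean()))   # values in (0, 1), input-dependent
    return x + attention * (features - x)                      # skip connection plus gated correction


scene = np.full((16, 16), 0.2)
scene[8, 8] = 1.0
refined = residual_attention_block(scene)
print(refined.shape)
```

Because the attention map is recomputed from each input, different regions of different scenes receive different amounts of correction, which is the general property the text refers to as handling spatially varying background patterns.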

Figure 1: Overall structure of DRPCA-Net. The network is composed of K stages, and the structure of each stage is identical.

Experimental Evaluation

Performance Metrics on Benchmark Datasets

DRPCA-Net was evaluated on four public infrared datasets: SIRST V1, NUDT-SIRST, SIRST-Aug, and IRSTD-1K. It reports state-of-the-art detection accuracy, with higher mIoU and F1 scores than both traditional methods and contemporary deep learning models. These results, together with its computational efficiency, reflect the network's ability to adapt dynamically to varying scenes.
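For readers unfamiliar with the two metrics, the sketch below computes pixel-level IoU and F1 for one pair of binary masks. The function name and toy masks are illustrative, and the exact averaging protocol behind the reported mIoU is not specified in this summary, so treat the per-image mean mentioned afterwards as one common convention rather than the paper's definition.

```python
import numpy as np


def iou_and_f1(pred: np.ndarray, target: np.ndarray) -> tuple[float, float]:
    """Pixel-level IoU and F1 for a pair of binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    tp = np.logical_and(pred, target).sum()    # target pixels correctly detected
    fp = np.logical_and(pred, ~target).sum()   # background pixels flagged as target
    fn = np.logical_and(~pred, target).sum()   # target pixels that were missed
    union = tp + fp + fn
    iou = tp / union if union else 1.0         # two empty masks count as a perfect match
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0
    return float(iou), float(f1)


# Toy example: a 4-pixel target, of which 3 pixels are detected, plus 1 false alarm.
target = np.zeros((8, 8), dtype=int)
target[2:4, 2:4] = 1
pred = target.copy()
pred[3, 3] = 0   # one missed target pixel
pred[0, 0] = 1   # one false-positive pixel

iou, f1 = iou_and_f1(pred, target)
print(iou, f1)  # 0.6 0.75
```

Averaging the per-image IoU values over a test set gives one form of mIoU; some benchmarks instead accumulate intersections and unions over the whole set before dividing.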

Figure 2: Comparative ROC curve analysis of detection methods on NUDT-SIRST dataset.

Success and Failure Cases

Qualitative analysis shows that DRPCA-Net accurately delineates small targets against noisy backgrounds and in complex scenes. However, its assumptions of sparse targets and low-rank backgrounds can occasionally produce false positives in highly cluttered environments.

Figure 3: Examples of failure cases on the IRSTD-1K dataset. From left to right: original infrared image, DRPCA-Net prediction, and ground truth.

Conclusion

DRPCA-Net combines the mathematical foundation of RPCA with the flexibility of neural networks, offering a robust approach to infrared small target detection. Its dynamic parameter generation and feature refinement yield an efficient model that reports state-of-the-art results across multiple datasets. Future work could improve its adaptability to scenes that violate the model's sparsity and low-rank assumptions.