Depth-Sensitive Soft Suppression with RGB-D Inter-Modal Stylization Flow for Domain Generalization Semantic Segmentation (2505.07050v1)

Published 11 May 2025 in cs.CV

Abstract: Unsupervised Domain Adaptation (UDA) aims to align source and target domain distributions to close the domain gap, but it still struggles to obtain target data. Fortunately, Domain Generalization (DG) excels without the need for any target data. Recent works show that depth maps contribute to improved generalization in UDA tasks, but they ignore the noise and holes in depth maps caused by device and environmental factors, and thus fail to learn domain-invariant representations sufficiently and effectively. Although high-sensitivity region suppression has shown promising results in learning domain-invariant features, existing methods cannot be directly applied to depth maps because of their unique characteristics. Hence, we propose a novel framework, Depth-Sensitive Soft Suppression with RGB-D inter-modal stylization flow (DSSS), which focuses on learning domain-invariant features from depth maps for DG semantic segmentation. Specifically, we propose the RGB-D inter-modal stylization flow to generate stylized depth maps for sensitivity detection, using RGB information as the stylization source. Then, a class-wise soft spatial sensitivity suppression is designed to identify and emphasize non-sensitive depth features that contain more domain-invariant information. Furthermore, an RGB-D soft alignment loss is proposed to ensure that the stylized depth maps align with only part of the RGB features while still retaining unique depth information. To the best of our knowledge, DSSS is the first framework to integrate RGB and depth information in the multi-class DG semantic segmentation task. Extensive experiments over multiple backbone networks show that our framework achieves remarkable performance improvements.

Summary

Semantic segmentation is a pivotal computer vision task that classifies each pixel in an image. Yet the domain discrepancy between training and testing data persists, especially when models are exposed to unseen domains. Domain Generalization (DG), unlike Unsupervised Domain Adaptation (UDA), offers a promising approach by improving generalization without requiring any target domain data. Recent work has shown that depth maps can help stabilize performance across varied domains, but the noise and inconsistencies inherent in depth maps remain problematic. Against this backdrop, the paper presents Depth-Sensitive Soft Suppression with RGB-D Inter-Modal Stylization Flow (DSSS), a framework designed to extract domain-invariant features effectively for the DG semantic segmentation task.

The proposed DSSS framework addresses several challenges evident in depth map utilization:

  1. Depth Map Challenges: Depth maps carry innate noise due to hardware limitations and environmental factors. Existing suppression techniques often disregard these artifacts, treating depth data as uniformly domain-invariant, which leads to inefficient feature extraction.
  2. Sensitivity Suppression Techniques: Previous sensitivity suppression methods focused primarily on RGB image characteristics, often employing operations that can distort depth information. Traditional methods also rely on binary thresholding, which collapses continuous sensitivity variations into a hard mask and discards critical information (contrast the soft weighting in the sketch after this list).
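
The contrast between hard thresholding and soft suppression can be made concrete with a small sketch. The following PyTorch-style snippet is a minimal illustration, not the paper's implementation: the sensitivity map, the threshold, and the temperature tau are hypothetical placeholders.

```python
import torch

def binary_suppression(feat, sensitivity, thresh=0.5):
    """Hard suppression: zero features wherever sensitivity exceeds a threshold.
    Continuous sensitivity variation is collapsed into a 0/1 mask."""
    mask = (sensitivity < thresh).float()       # (B, 1, H, W) binary mask
    return feat * mask

def soft_suppression(feat, sensitivity, tau=0.1):
    """Soft suppression: down-weight features continuously, so mildly sensitive
    regions are attenuated rather than discarded outright."""
    weight = torch.sigmoid(-(sensitivity - sensitivity.mean()) / tau)  # values in (0, 1)
    return feat * weight

# Toy usage with hypothetical shapes.
feat = torch.randn(2, 64, 32, 32)        # depth features (B, C, H, W)
sensitivity = torch.rand(2, 1, 32, 32)   # per-pixel sensitivity map in [0, 1]
hard_out = binary_suppression(feat, sensitivity)
soft_out = soft_suppression(feat, sensitivity)
```

The soft variant keeps mildly sensitive regions partially active instead of zeroing them out, which is the behavior the class-wise suppression described below relies on.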

DSSS proposes three key innovations:

  1. RGB-D Inter-Modal Stylization Flow: This stylization mechanism generates diverse stylized depth maps using RGB information as the style source. Rather than performing simple data augmentation, the flow increases depth feature diversity without relying on auxiliary datasets and avoids the distortion associated with standard augmentation techniques (a statistic-transfer reading of this idea is sketched after this list).
  2. Class-wise Soft Spatial Sensitivity Suppression: Moving away from channel-based sensitivity, this suppression operates at the spatial level, identifying non-sensitive depth regions that carry domain-invariant properties. The class-wise, soft weighting (in the spirit of the soft-suppression sketch above) localizes domain-invariant learning more finely than global, binary methods.
  3. RGB-D Soft Alignment Loss: To preserve modality-specific features while aligning RGB and stylized depth features, this loss ensures the stylized depth maps align with only part of the RGB characteristics, aiding the extraction of comprehensive domain-invariant information (also illustrated in the sketch after this list).
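
Read at a high level, the stylization flow amounts to transferring feature statistics from RGB to depth, and the soft alignment loss amounts to a partial pull of stylized depth features toward RGB features. The sketch below illustrates that reading only; it is not the authors' implementation, and the function names, the mixing coefficient alpha, and the margin factor lam are assumptions introduced for illustration.

```python
import torch
import torch.nn.functional as F

def channel_stats(x, eps=1e-5):
    """Per-channel spatial mean and std of a feature map of shape (B, C, H, W)."""
    mu = x.mean(dim=(2, 3), keepdim=True)
    sigma = x.std(dim=(2, 3), keepdim=True) + eps
    return mu, sigma

def stylize_depth_with_rgb(depth_feat, rgb_feat, alpha=0.5):
    """AdaIN-style inter-modal stylization (hypothetical reading): re-normalize depth
    features with a blend of depth and RGB channel statistics, so RGB serves as the
    style source while the depth content is preserved."""
    mu_d, sig_d = channel_stats(depth_feat)
    mu_r, sig_r = channel_stats(rgb_feat)
    mu_mix = alpha * mu_r + (1 - alpha) * mu_d
    sig_mix = alpha * sig_r + (1 - alpha) * sig_d
    return (depth_feat - mu_d) / sig_d * sig_mix + mu_mix

def soft_alignment_loss(stylized_depth_feat, rgb_feat, lam=0.5):
    """Partial (soft) alignment (hypothetical reading): penalize the distance between
    stylized depth and RGB features only beyond a data-dependent margin, so
    modality-specific depth cues are not forced to match RGB exactly."""
    dist = F.mse_loss(stylized_depth_feat, rgb_feat, reduction='none').mean(dim=1)
    margin = lam * dist.detach().mean()
    return F.relu(dist - margin).mean()

# Toy usage with hypothetical feature shapes.
rgb_feat = torch.randn(2, 64, 32, 32)
depth_feat = torch.randn(2, 64, 32, 32)
stylized = stylize_depth_with_rgb(depth_feat, rgb_feat, alpha=0.6)
loss = soft_alignment_loss(stylized, rgb_feat)
```

Blending statistics with alpha below 1 and relaxing the alignment with a margin are the two knobs in this sketch that keep depth-specific information from being overwritten, mirroring the partial-alignment intent described in item 3.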

The results of extensive experiments indicate substantial performance gains in semantic segmentation tasks across diverse unseen target domains. The DSSS framework demonstrates notable efficacy in road segmentation and object boundary detection, particularly under complex traffic scenarios and low-light conditions. These improvements underscore the importance of refining depth feature extraction processes through sensitivity evaluation and stylization.

Future research may explore potential applications of DSSS across broader AI-driven domains and settings, possibly integrating additional sensor modalities for enhanced robustness. Moreover, comparative studies into alternative stylization and suppression methodologies could further enhance DG performance, establishing new benchmarks in the semantic segmentation landscape. The implications for both theoretical development and practical deployment in real-world applications are significant, paving the way for systems capable of adapting to the dynamic challenges posed by varied real-world environments.
