FSDR: Frequency Space Domain Randomization for Domain Generalization (2103.02370v1)

Published 3 Mar 2021 in cs.CV

Abstract: Domain generalization aims to learn a generalizable model from a known source domain for various unknown target domains. It has been studied widely by domain randomization that transfers source images to different styles in spatial space for learning domain-agnostic features. However, most existing randomization uses GANs that often lack of controls and even alter semantic structures of images undesirably. Inspired by the idea of JPEG that converts spatial images into multiple frequency components (FCs), we propose Frequency Space Domain Randomization (FSDR) that randomizes images in frequency space by keeping domain-invariant FCs (DIFs) and randomizing domain-variant FCs (DVFs) only. FSDR has two unique features: 1) it decomposes images into DIFs and DVFs which allows explicit access and manipulation of them and more controllable randomization; 2) it has minimal effects on semantic structures of images and domain-invariant features. We examined domain variance and invariance property of FCs statistically and designed a network that can identify and fuse DIFs and DVFs dynamically through iterative learning. Extensive experiments over multiple domain generalizable segmentation tasks show that FSDR achieves superior segmentation and its performance is even on par with domain adaptation methods that access target data in training.

Citations (193)

Summary

Frequency Space Domain Randomization for Domain Generalization

The paper “FSDR: Frequency Space Domain Randomization for Domain Generalization” addresses domain generalization: training a model on a known source domain so that it performs well on unseen target domains. The authors propose Frequency Space Domain Randomization (FSDR), which decomposes images into frequency components and randomizes only the components that vary across domains, improving generalization while leaving semantic content largely intact.

Methodology

At the core of FSDR is the transformation of images from spatial space to frequency space, using the Discrete Cosine Transform (DCT) to decompose each image into multiple frequency components (FCs). The method separates these into domain-invariant frequency components (DIFs) and domain-variant frequency components (DVFs), and randomizes only the DVFs. This separation gives finer control over the randomization and keeps semantic structures, which are treated as domain-invariant, largely unaffected.
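The decomposition described above can be illustrated with a small, standard-library-only sketch. It is not the authors' implementation: the function names, the grouping of coefficients into bands by the index sum `u + v`, and the multiplicative noise used as the "randomization" are all assumptions chosen for brevity. It only shows the general idea of transforming a block with the DCT, perturbing a chosen set of bands (standing in for DVFs), and leaving the remaining bands (standing in for DIFs) untouched.

```python
import math
import random

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def dct2(block):
    """2D DCT of a square block: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct2(coeffs):
    """Inverse 2D DCT of a square block: C^T * coeffs * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def randomize_bands(block, variant_bands, rng, strength=0.5):
    """Rescale only coefficients whose band index (u + v) is in variant_bands.

    Hypothetical stand-in for DVF randomization; all other bands are kept
    exactly as they are.
    """
    coeffs = dct2(block)
    n = len(coeffs)
    for u in range(n):
        for v in range(n):
            if (u + v) in variant_bands:
                coeffs[u][v] *= 1.0 + rng.uniform(-strength, strength)
    return idct2(coeffs)

# Example: treat the three highest bands of a 4x4 block as "domain-variant".
block = [[float((3 * r + 5 * c) % 7) for c in range(4)] for r in range(4)]
randomized = randomize_bands(block, {4, 5, 6}, random.Random(0))
```

Because the DCT basis here is orthonormal, coefficients outside `variant_bands` are unchanged after the round trip, which is the property the DIF/DVF split relies on. Which bands are actually domain-variant is exactly what FSDR-SA and FSDR-SL are meant to determine.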

The authors implement two variants of FSDR: Spectrum Analysis based (FSDR-SA) and Spectrum Learning based (FSDR-SL). FSDR-SA determines DIFs and DVFs statically through empirical analysis, while FSDR-SL identifies them dynamically, iteratively refining the split based on prediction entropy. Both variants randomize source-domain images by histogram matching against reference images, applied only to the identified DVFs. Because this is a domain generalization setting, no target-domain data is used during training.
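As a rough illustration of the two ingredients named above, the following sketch shows a simple rank-based histogram matching and the Shannon entropy of a prediction. Both functions are hypothetical helpers written for this summary, not the paper's code: the matching operates on flat lists of values (for example, the values of one DVF), and how FSDR-SL actually uses entropy to update its DIF/DVF split is not specified here.

```python
import math

def match_histogram(source, reference):
    """Give source values the reference's distribution, preserving their order.

    The smallest source value is mapped to the smallest reference value, the
    largest to the largest, and values in between to the nearest quantile.
    """
    ref_sorted = sorted(reference)
    order = sorted(range(len(source)), key=lambda i: source[i])
    n, m = len(source), len(ref_sorted)
    matched = [0.0] * n
    for rank, idx in enumerate(order):
        # Position of this rank within the sorted reference values.
        pos = rank * (m - 1) / (n - 1) if n > 1 else 0
        matched[idx] = ref_sorted[round(pos)]
    return matched

def prediction_entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Values of one (assumed) domain-variant component, and a reference to match.
dvf_values = [3.0, 1.0, 2.0]
reference_values = [10.0, 30.0, 20.0]
matched = match_histogram(dvf_values, reference_values)

# A confident prediction has low entropy; a uniform one has the maximum.
confident = prediction_entropy([1.0, 0.0, 0.0, 0.0])
uncertain = prediction_entropy([0.25, 0.25, 0.25, 0.25])
```

The matching keeps the relative ordering of the source values while replacing their magnitudes with those of the reference, which is why it changes appearance statistics without rearranging structure.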

Experimental Results

FSDR is evaluated on semantic segmentation across multiple domains, using the synthetic datasets GTA5 and SYNTHIA as source domains and the real datasets Cityscapes, BDD, and Mapillary as target domains. The results show improved segmentation accuracy over baseline models, which supports the importance of preserving DIFs during randomization. FSDR-SL also outperforms FSDR-SA, which the summary attributes to its adaptive identification of DIFs and DVFs. Notably, FSDR performs comparably to domain adaptation techniques, which access target-domain data during training.

Implications and Future Directions

FSDR advances domain generalization by introducing a technique that reduces unwanted semantic alterations while improving how well models transfer to unseen domains. Its ability to distinguish DIFs from DVFs dynamically and randomize only the latter is a key strength that could carry over to other computer vision applications. Because the framework does not depend on a particular model, it should also be straightforward to integrate with existing systems.

From a theoretical standpoint, the method opens up possibilities for using frequency space representations in other machine learning contexts, potentially informing the development of robust models that generalize from diverse data sources. Future research could refine the spectrum learning mechanism, investigate alternative frequency transformations, or apply the FSDR framework to additional tasks such as object detection, as hinted at in preliminary tests.

Overall, FSDR positions itself as a promising approach in the quest for generalizable AI models, paving the way for more reliable and adaptable solutions in real-world scenarios where data shifts and domain discrepancies are prevalent.
