Condition-Invariant Semantic Segmentation (2305.17349v4)
Abstract: Adapting semantic segmentation networks to different visual conditions is vital for robust perception in autonomous cars and robots. However, previous work has shown that most feature-level adaptation methods, which employ adversarial training and are validated on synthetic-to-real adaptation, provide only marginal gains in condition-level adaptation and are outperformed by simple pixel-level adaptation via stylization. Motivated by these findings, we propose to leverage stylization for feature-level adaptation: a novel feature invariance loss aligns the internal features that the network's encoder extracts from the original and the stylized view of each input image. This encourages the encoder to extract features that are already invariant to the style of the input, allowing the decoder to focus on parsing these features rather than on further abstracting away the specific style of the input. We implement our method, named Condition-Invariant Semantic Segmentation (CISS), on the current state-of-the-art domain adaptation architecture and achieve outstanding results on condition-level adaptation. In particular, CISS sets the new state of the art on the popular daytime-to-nighttime Cityscapes$\to$Dark Zurich benchmark and achieves the second-best performance on the normal-to-adverse Cityscapes$\to$ACDC benchmark. CISS is also shown to generalize well to domains unseen during training, such as BDD100K-night and ACDC-night. Code is publicly available at https://github.com/SysCV/CISS.
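The core mechanism in the abstract, comparing encoder features of an image and its stylized view under an invariance loss, can be sketched in a few lines. The toy encoder (average pooling), the channel-statistics stylization, and the mean-squared loss below are illustrative stand-ins chosen for brevity, not the actual CISS architecture or its stylization pipeline:

```python
import numpy as np

def stylize(image, ref):
    # Channel-wise statistics transfer (in the spirit of color-transfer
    # stylization): match the image's per-channel mean/std to the reference.
    src_mu, src_sd = image.mean(axis=(0, 1)), image.std(axis=(0, 1)) + 1e-6
    ref_mu, ref_sd = ref.mean(axis=(0, 1)), ref.std(axis=(0, 1)) + 1e-6
    return (image - src_mu) / src_sd * ref_sd + ref_mu

def toy_encoder(image):
    # Stand-in for the segmentation network's encoder: 2x2 average pooling.
    h, w, c = image.shape
    return image.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def invariance_loss(feat_a, feat_b):
    # Mean squared distance between the two feature maps; minimizing it
    # pushes the encoder toward style-invariant features.
    return float(np.mean((feat_a - feat_b) ** 2))

rng = np.random.default_rng(0)
img = rng.random((4, 4, 3))   # "source" image
ref = rng.random((4, 4, 3))   # style reference (e.g., target-domain image)

loss = invariance_loss(toy_encoder(img), toy_encoder(stylize(img, ref)))
```

In training, this loss term would be added to the usual segmentation objective so that gradients penalize any feature difference the encoder exhibits between the two views of the same image.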