Multimedia Systems

19-07-2023 | Regular Paper

Pull and concentrate: improving unsupervised semantic segmentation adaptation with cross- and intra-domain consistencies

Authors: Jian-Wei Zhang, Yifan Sun, Wei Chen

Published in: Multimedia Systems | Issue 5/2023

Abstract

Unsupervised domain adaptation (UDA) is an important solution to the cross-domain problem in semantic segmentation. Existing segmentation UDA methods mainly treat domain shift as the major challenge. This paper takes a different viewpoint and decomposes the cross-domain problem into two negative factors. Specifically, we find that, apart from domain shift, the dispersed within-class distribution on the target domain is another factor that compromises cross-domain segmentation, and that this neglected dispersion is a challenge as crucial as domain shift itself. To address the two factors jointly, we propose a “Pull-and-Concentrate” (PuCo) method comprising two consistencies: (1) a cross-domain consistency “pulls” the source and target domain distributions (of the same class) close to each other, based on a novel statistical style transfer; (2) an intra-domain consistency “concentrates” the within-class distribution on the target domain through a new unsupervised teacher-student method. Both consistencies are robust to (or insulated from) pseudo-label noise. This advantage allows PuCo to bring consistent improvement over a battery of pseudo-label-based UDA methods. For example, on GTA5 to Cityscapes and SYNTHIA to Cityscapes, PuCo achieves 60.3% and 57.2% mean IoU, respectively. Code is available at https://github.com/Jarvis73/PuCo.
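The abstract does not spell out either consistency, so the following is only a rough sketch of two generic building blocks that methods of this kind commonly rely on. It is not the PuCo implementation. It assumes that "statistical style transfer" can be approximated by matching per-channel feature means and standard deviations (as in adaptive instance normalization), and that the teacher in the teacher-student scheme is an exponential moving average of the student (as in mean-teacher training). The function names `match_channel_stats` and `ema_update`, the flat-list feature layout, and the default `momentum` are all hypothetical, and the sketch ignores the per-class handling that the abstract mentions.

```python
from statistics import fmean, pstdev


def match_channel_stats(content, style, eps=1e-5):
    """Re-normalize each content channel to the mean/std of the matching style channel.

    Both arguments are lists of channels; each channel is a flat list of floats.
    This is a generic AdaIN-style statistic match, assumed here for illustration.
    """
    out = []
    for c_chan, s_chan in zip(content, style):
        c_mu, c_sigma = fmean(c_chan), pstdev(c_chan) + eps
        s_mu, s_sigma = fmean(s_chan), pstdev(s_chan) + eps
        # Whiten with the content statistics, then re-color with the style statistics.
        out.append([(x - c_mu) / c_sigma * s_sigma + s_mu for x in c_chan])
    return out


def ema_update(teacher, student, momentum=0.99):
    """Move teacher weights toward student weights (mean-teacher style EMA)."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


# Hypothetical usage: one source-feature channel restyled with target statistics.
source_feat = [[0.0, 1.0, 2.0, 3.0]]
target_feat = [[10.0, 12.0, 14.0, 16.0]]
restyled = match_channel_stats(source_feat, target_feat)

# Hypothetical usage: teacher weights follow the student with momentum 0.5.
new_teacher = ema_update([1.0, 2.0], [3.0, 4.0], momentum=0.5)
```

In this toy setting the restyled source channel takes on the target channel's mean and spread while keeping its own ordering of values, and the teacher update lands halfway between the old teacher and the student because the momentum is 0.5.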



Title: Pull and concentrate: improving unsupervised semantic segmentation adaptation with cross- and intra-domain consistencies
Authors: Jian-Wei Zhang, Yifan Sun, Wei Chen
Publication date: 19-07-2023
Publisher: Springer Berlin Heidelberg
Published in: Multimedia Systems / Issue 5/2023
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI: https://doi.org/10.1007/s00530-023-01131-9
