
01-12-2023

Low Rank Adaptation for Stable Domain Adaptation of Vision Transformers

Authors: N. Filatov, M. Kindulov

Published in: Optical Memory and Neural Networks | Special Issue 2/2023

Abstract

Unsupervised domain adaptation plays a crucial role in semantic segmentation tasks due to the high cost of annotating data. Existing approaches often rely on large transformer models and momentum networks to stabilize and improve the self-training process. In this study, we investigate the applicability of low-rank adaptation (LoRA) to domain adaptation in computer vision. Our focus is on the unsupervised domain adaptation task of semantic segmentation, which requires adapting models from a synthetic dataset (GTA5) to a real-world dataset (Cityscapes). We employ the Swin Transformer as the feature extractor and the TransDA domain adaptation framework. Through experiments, we demonstrate that LoRA effectively stabilizes the self-training process, achieving training dynamics similar to those of the exponential moving average (EMA) mechanism. Moreover, LoRA provides metrics comparable to EMA under the same limited computation budget. In the GTA5 → Cityscapes experiments, the adaptation pipeline with LoRA achieves an mIoU of 0.515, slightly surpassing the EMA baseline's mIoU of 0.513, while also offering an 11% speedup in training time and savings in video memory. These results highlight LoRA as a promising approach for domain adaptation in computer vision, offering a viable alternative to momentum networks that also saves computational resources.
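
The mechanism behind this result can be illustrated with a short, self-contained PyTorch sketch: the pretrained projection weights of the backbone are frozen and only a small set of low-rank residual parameters is trained, which constrains how far the adapted model can drift during self-training. This is a minimal illustration rather than the authors' implementation; the class name LoRALinear, the rank r, the scaling alpha, and the attribute names in the usage comment are assumptions.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen linear projection with a trainable low-rank residual:
    y = W0 x + (alpha / r) * B A x, where only A and B are trained."""

    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 4.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # A: r x d_in
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))        # B: d_out x r, zero-initialized
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus low-rank update; because B starts at zero, the wrapped
        # layer behaves exactly like the pretrained layer at the start of adaptation.
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)

# Hypothetical usage on a transformer block's attention projection (the attribute
# path block.attn.qkv is illustrative and depends on the Swin implementation used):
# block.attn.qkv = LoRALinear(block.attn.qkv, r=4, alpha=4.0)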


Literature
1.
Cheng, B., Schwing, A., and Kirillov, A., Per-pixel classification is not all you need for semantic segmentation, Adv. Neural Inf. Process. Syst., 2021, vol. 34, pp. 17864–17875.
2.
Lialin, V., Deshpande, V., and Rumshisky, A., Scaling down to scale up: A guide to parameter-efficient fine-tuning, arXiv preprint arXiv:2303.15647, 2023.
3.
Hu, E.J. et al., LoRA: Low-rank adaptation of large language models, arXiv preprint arXiv:2106.09685, 2021.
4.
Chen, M., Zheng, Z., Yang, Y., and Chua, T.-S., PiPa: Pixel- and patch-wise self-supervised learning for domain adaptative semantic segmentation, arXiv preprint arXiv:2211.07609, 2022.
5.
Xie, B., Li, S., Li, M., Liu, C.H., Huang, G., and Wang, G., SePiCo: Semantic-guided pixel contrast for domain adaptive semantic segmentation, IEEE Trans. Pattern Anal. Mach. Intell., 2023.
6.
Khan, S., Naseer, M., Hayat, M., Zamir, S.W., Khan, F.S., and Shah, M., Transformers in vision: A survey, ACM Comput. Surv. (CSUR), 2022, vol. 54, no. 10s, pp. 1–41.
7.
Vaswani, A. et al., Attention is all you need, Adv. Neural Inf. Process. Syst., 2017, vol. 30.
8.
Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S., End-to-end object detection with transformers, in Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part I, Springer, 2020, pp. 213–229.
9.
Dosovitskiy, A. et al., An image is worth 16x16 words: Transformers for image recognition at scale, arXiv preprint arXiv:2010.11929, 2020.
10.
Liu, Z. et al., Swin transformer: Hierarchical vision transformer using shifted windows, in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 10012–10022.
11.
Wang, H., Shen, T., Zhang, W., Duan, L.-Y., and Mei, T., Classes matter: A fine-grained adversarial approach to cross-domain semantic segmentation, in Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XIV, Springer, 2020, pp. 642–659.
12.
Zhang, P., Zhang, B., Zhang, T., Chen, D., Wang, Y., and Wen, F., Prototypical pseudo label denoising and target structure learning for domain adaptive semantic segmentation, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 12414–12424.
13.
Guan, L. and Yuan, X., Iterative loop learning combining self-training and active learning for domain adaptive semantic segmentation, arXiv preprint arXiv:2301.13361, 2023.
14.
Hoyer, L., Dai, D., Wang, H., and van Gool, L., MIC: Masked image consistency for context-enhanced domain adaptation, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 11721–11732.
15.
Liu, H. et al., Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning, Adv. Neural Inf. Process. Syst., 2022, vol. 35, pp. 1950–1965.
16.
Lester, B., Al-Rfou, R., and Constant, N., The power of scale for parameter-efficient prompt tuning, arXiv preprint arXiv:2104.08691, 2021.
17.
Chen, R. et al., Smoothing matters: Momentum transformer for domain adaptive semantic segmentation, arXiv preprint arXiv:2203.07988, 2022.
18.
Richter, S.R., Vineet, V., Roth, S., and Koltun, V., Playing for data: Ground truth from computer games, in Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part II, Springer, 2016, pp. 102–118.
19.
Cordts, M. et al., The cityscapes dataset for semantic urban scene understanding, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 3213–3223.
20.
Xiao, T., Liu, Y., Zhou, B., Jiang, Y., and Sun, J., Unified perceptual parsing for scene understanding, in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 418–434. https://openaccess.thecvf.com/content_ECCV_2018/html/Tete_Xiao_Unified_Perceptual_Parsing_ECCV_2018_paper.html
Metadata
Title
Low Rank Adaptation for Stable Domain Adaptation of Vision Transformers
Authors
N. Filatov
M. Kindulov
Publication date
01-12-2023
Publisher
Pleiades Publishing
Published in
Optical Memory and Neural Networks / Special Issue 2/2023
Print ISSN: 1060-992X
Electronic ISSN: 1934-7898
DOI
https://doi.org/10.3103/S1060992X2306005X
