Published in: Optical Memory and Neural Networks, Special Issue 2/2023

01.12.2023

Low Rank Adaptation for Stable Domain Adaptation of Vision Transformers

Authors: N. Filatov, M. Kindulov



Abstract

Unsupervised domain adaptation plays a crucial role in semantic segmentation because annotating data is expensive. Existing approaches often rely on large transformer models and momentum networks to stabilize and improve the self-training process. In this study, we investigate the applicability of low-rank adaptation (LoRA) to domain adaptation in computer vision. Our focus is the unsupervised domain adaptation task of semantic segmentation, which requires adapting models from a synthetic dataset (GTA5) to a real-world dataset (Cityscapes). We employ the Swin Transformer as the feature extractor and the TransDA domain adaptation framework. Through experiments, we demonstrate that LoRA effectively stabilizes the self-training process, achieving training dynamics similar to those of the exponential moving average (EMA) mechanism. Moreover, LoRA provides metrics comparable to EMA under the same limited computation budget. In GTA5 → Cityscapes experiments, the adaptation pipeline with LoRA achieves a mIoU of 0.515, slightly surpassing the EMA baseline’s mIoU of 0.513, while also offering an 11% speedup in training time and savings in video memory. These results highlight LoRA as a promising approach for domain adaptation in computer vision, offering a viable alternative to momentum networks that also saves computational resources.
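The two stabilization mechanisms the abstract contrasts can be sketched in a few lines. Below is a minimal, hypothetical PyTorch sketch (not the authors' implementation): a LoRA wrapper that freezes a pretrained linear layer and trains only a low-rank update, and an EMA teacher update of the kind used in momentum networks. Names such as `LoRALinear` and `ema_update` and all hyperparameter values are illustrative assumptions.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B (A x). Only A and B receive gradients."""

    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # keep pretrained weights fixed
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # base output plus scaled low-rank correction
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T


def ema_update(teacher: nn.Module, student: nn.Module, momentum: float = 0.999) -> None:
    """Momentum-network alternative: teacher weights track the student
    as an exponential moving average; no gradients flow into the teacher."""
    with torch.no_grad():
        for pt, ps in zip(teacher.parameters(), student.parameters()):
            pt.mul_(momentum).add_(ps, alpha=1.0 - momentum)
```

Because `B` starts at zero, the wrapped layer initially reproduces the pretrained output exactly, which gives one intuition for why LoRA can stabilize self-training in a way reminiscent of a slowly moving EMA teacher, while training far fewer parameters and keeping only one copy of the backbone in memory.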


Metadata
Title
Low Rank Adaptation for Stable Domain Adaptation of Vision Transformers
Authors
N. Filatov
M. Kindulov
Publication date
01.12.2023
Publisher
Pleiades Publishing
Published in
Optical Memory and Neural Networks / Special Issue 2/2023
Print ISSN: 1060-992X
Electronic ISSN: 1934-7898
DOI
https://doi.org/10.3103/S1060992X2306005X
