2025 | OriginalPaper | Chapter

TTT-MIM: Test-Time Training with Masked Image Modeling for Denoising Distribution Shifts

Authors: Youssef Mansour, Xuyang Zhong, Serdar Caglar, Reinhard Heckel

Published in: Computer Vision – ECCV 2024

Publisher: Springer Nature Switzerland

Abstract

Neural networks trained end-to-end give state-of-the-art performance for image denoising. However, when applied to images outside the training distribution, performance often degrades significantly. In this work, we propose a test-time training (TTT) method based on masked image modeling (MIM) to improve denoising performance on out-of-distribution images. The method, termed TTT-MIM, consists of a training stage and a test-time adaptation stage. During training, we minimize a standard supervised loss together with a self-supervised loss that reconstructs masked image patches. At test time, we minimize only the self-supervised loss to fine-tune the network to a single noisy image. Experiments show that our method improves performance under natural distribution shifts; in particular, it adapts well to real-world camera and microscope noise. A competitor to our train-and-finetune approach is a zero-shot denoiser that does not rely on training data. Compared to state-of-the-art zero-shot denoisers, however, our method shows superior performance and is much faster, suggesting that training followed by fine-tuning on the test instance is a more efficient approach to image denoising than zero-shot methods in setups where little to no data is available. Our code is available at https://github.com/MLI-lab/TTT_Denoising.
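To make the two-stage procedure described above concrete, the following is a minimal PyTorch sketch, not the authors' implementation: the backbone (TinyDenoiser), the patch-masking helper (random_patch_mask), the masking ratio, the loss weight lam, and all optimizer settings are illustrative assumptions; only the overall structure of the objectives follows the abstract.

```python
# Minimal sketch of the two-stage idea from the abstract (not the authors' code).
# Architecture, masking ratio, loss weighting, and optimizer settings are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def random_patch_mask(x, patch=16, ratio=0.5):
    """Zero out a random subset of non-overlapping patches; return masked image and mask."""
    b, c, h, w = x.shape
    mask = (torch.rand(b, 1, h // patch, w // patch, device=x.device) < ratio).float()
    mask = F.interpolate(mask, size=(h, w), mode="nearest")  # 1 = masked pixel
    return x * (1.0 - mask), mask


class TinyDenoiser(nn.Module):
    """Stand-in CNN denoiser; the paper would use a stronger backbone."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 3, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)


def training_step(model, noisy, clean, opt, lam=1.0):
    """Training stage: supervised denoising loss plus masked-patch reconstruction loss."""
    opt.zero_grad()
    sup_loss = F.mse_loss(model(noisy), clean)        # supervised loss on paired data
    masked, mask = random_patch_mask(noisy)
    rec = model(masked)
    mim_loss = F.mse_loss(rec * mask, noisy * mask)   # reconstruct only the masked patches
    loss = sup_loss + lam * mim_loss
    loss.backward()
    opt.step()
    return loss.item()


def test_time_adapt(model, noisy_image, steps=20, lr=1e-4):
    """Adaptation stage: fine-tune on a single noisy test image with the self-supervised loss only."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        masked, mask = random_patch_mask(noisy_image)
        rec = model(masked)
        loss = F.mse_loss(rec * mask, noisy_image * mask)
        loss.backward()
        opt.step()
    with torch.no_grad():
        return model(noisy_image)                     # denoised estimate after adaptation


if __name__ == "__main__":
    model = TinyDenoiser()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    noisy, clean = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
    training_step(model, noisy, clean, opt)
    out = test_time_adapt(model, torch.rand(1, 3, 64, 64))
    print(out.shape)
```

The key design point mirrored here is that at test time only the masked-reconstruction loss is minimized, since no clean target exists for the single out-of-distribution noisy image.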

Metadata
Title
TTT-MIM: Test-Time Training with Masked Image Modeling for Denoising Distribution Shifts
Authors
Youssef Mansour
Xuyang Zhong
Serdar Caglar
Reinhard Heckel
Copyright Year
2025
DOI
https://doi.org/10.1007/978-3-031-73254-6_20
