Skip to main content
Top

2018 | OriginalPaper | Chapter

Spatio-Temporal Transformer Network for Video Restoration

Authors : Tae Hyun Kim, Mehdi S. M. Sajjadi, Michael Hirsch, Bernhard Schölkopf

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

State-of-the-art video restoration methods integrate optical flow estimation networks to utilize temporal information. However, these networks typically consider only a pair of consecutive frames and hence are not capable of capturing long-range temporal dependencies and fall short of establishing correspondences across several timesteps. To alleviate these problems, we propose a novel Spatio-temporal Transformer Network (STTN) which handles multiple frames at once and thereby manages to mitigate the common nuisance of occlusions in optical flow estimation. Our proposed STTN comprises a module that estimates optical flow in both space and time and a resampling layer that selectively warps target frames using the estimated flow. In our experiments, we demonstrate the efficiency of the proposed network and show state-of-the-art restoration results in video super-resolution and video deblurring.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Liu, C., Sun, D.: A Bayesian approach to adaptive video super resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2011) Liu, C., Sun, D.: A Bayesian approach to adaptive video super resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2011)
2.
go back to reference Kim, T.H., Lee, K.M.: Generalized video deblurring for dynamic scenes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015) Kim, T.H., Lee, K.M.: Generalized video deblurring for dynamic scenes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
4.
go back to reference Dosovitskiy, A., et al.: FlowNet: learning optical flow with convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015) Dosovitskiy, A., et al.: FlowNet: learning optical flow with convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015)
5.
go back to reference Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., Brox, T.: FlowNet 2.0: evolution of optical flow estimation with deep networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017) Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., Brox, T.: FlowNet 2.0: evolution of optical flow estimation with deep networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
6.
go back to reference Ranjan, A., Black, M.J.: Optical flow estimation using a spatial pyramid network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017) Ranjan, A., Black, M.J.: Optical flow estimation using a spatial pyramid network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
7.
go back to reference Ren, Z., Yan, J., Ni, B., Liu, B., Yang, X., Zha, H.: Unsupervised deep learning for optical flow estimation. In: Association for the Advancement of Artificial Intelligence (AAAI) (2017) Ren, Z., Yan, J., Ni, B., Liu, B., Yang, X., Zha, H.: Unsupervised deep learning for optical flow estimation. In: Association for the Advancement of Artificial Intelligence (AAAI) (2017)
8.
go back to reference Meister, S., Hur, J., Roth, S.: UnFlow: unsupervised learning of optical flow with a bidirectional census loss. In: AAAI (2017) Meister, S., Hur, J., Roth, S.: UnFlow: unsupervised learning of optical flow with a bidirectional census loss. In: AAAI (2017)
9.
go back to reference Dosovitskiy, A., Springenberg, J.T., Brox, T.: Learning to generate chairs with convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015) Dosovitskiy, A., Springenberg, J.T., Brox, T.: Learning to generate chairs with convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
10.
go back to reference Mayer, N., et al.: A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) Mayer, N., et al.: A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
11.
go back to reference Ahmadi, A., Patras, I.: Unsupervised convolutional neural networks for motion estimation. In: IEEE International Conference on Image Processing (ICIP) (2016) Ahmadi, A., Patras, I.: Unsupervised convolutional neural networks for motion estimation. In: IEEE International Conference on Image Processing (ICIP) (2016)
13.
go back to reference Jaderberg, M., Simonyan, K., Zisserman, A., Kavukcuoglu, K.: Spatial transformer networks. In: Advances in Neural Information Processing Systems (NIPS) (2015) Jaderberg, M., Simonyan, K., Zisserman, A., Kavukcuoglu, K.: Spatial transformer networks. In: Advances in Neural Information Processing Systems (NIPS) (2015)
14.
15.
go back to reference Werlberger, M., Pock, T., Bischof, H.: Motion estimation with non-local total variation regularization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010) Werlberger, M., Pock, T., Bischof, H.: Motion estimation with non-local total variation regularization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010)
16.
go back to reference Lee, K.J., Kwon, D., Yun, I.D., Lee, S.U.: Optical flow estimation with adaptive convolution Kernel prior on discrete framework. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010) Lee, K.J., Kwon, D., Yun, I.D., Lee, S.U.: Optical flow estimation with adaptive convolution Kernel prior on discrete framework. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010)
17.
go back to reference Sun, D., Roth, S., Black, M.J.: Secrets of optical flow estimation and their principles. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010) Sun, D., Roth, S., Black, M.J.: Secrets of optical flow estimation and their principles. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010)
18.
go back to reference Xu, L., Jia, J., Matsushita, Y.: Motion detail preserving optical flow estimation. IEEE Trans. Pattern Anal. Mach. Intell. (PAMI) 34(9), 1744–1757 (2012)CrossRef Xu, L., Jia, J., Matsushita, Y.: Motion detail preserving optical flow estimation. IEEE Trans. Pattern Anal. Mach. Intell. (PAMI) 34(9), 1744–1757 (2012)CrossRef
19.
go back to reference Kim, T.H., Lee, H.S., Lee, K.M.: Optical flow via locally adaptive fusion of complementary data costs. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2013) Kim, T.H., Lee, H.S., Lee, K.M.: Optical flow via locally adaptive fusion of complementary data costs. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2013)
20.
go back to reference Volz, S., Bruhn, A., Valgaerts, L., Zimmer, H.: Modeling temporal coherence for optical flow. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2011) Volz, S., Bruhn, A., Valgaerts, L., Zimmer, H.: Modeling temporal coherence for optical flow. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2011)
21.
go back to reference Baker, S., Scharstein, D., Lewis, J., Roth, S., Black, M.J., Szeliski, R.: A database and evaluation methodology for optical flow. Int. J.Comput. Vis. (IJCV) 92(1), 1–31 (2011)CrossRef Baker, S., Scharstein, D., Lewis, J., Roth, S., Black, M.J., Szeliski, R.: A database and evaluation methodology for optical flow. Int. J.Comput. Vis. (IJCV) 92(1), 1–31 (2011)CrossRef
23.
go back to reference Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012) Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012)
24.
go back to reference Xu, J., Ranftl, R., Koltun, V.: Accurate optical flow via direct cost volume processing. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017) Xu, J., Ranftl, R., Koltun, V.: Accurate optical flow via direct cost volume processing. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
26.
go back to reference Revaud, J., Weinzaepfel, P., Harchaoui, Z., Schmid, C.: Deepmatching: hierarchical deformable dense matching. Int. J. Comput. Vis. (IJCV) 120(3), 300–323 (2016)MathSciNetCrossRef Revaud, J., Weinzaepfel, P., Harchaoui, Z., Schmid, C.: Deepmatching: hierarchical deformable dense matching. Int. J. Comput. Vis. (IJCV) 120(3), 300–323 (2016)MathSciNetCrossRef
27.
go back to reference Bailer, C., Taetz, B., Stricker, D.: Flow fields: dense correspondence fields for highly accurate large displacement optical flow estimation. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015) Bailer, C., Taetz, B., Stricker, D.: Flow fields: dense correspondence fields for highly accurate large displacement optical flow estimation. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015)
29.
go back to reference Kappeler, A., Yoo, S., Dai, Q., Katsaggelos, A.K.: Video super-resolution with convolutional neural networks. IEEE Trans. Comput. Imaging 2(2), 109–122 (2016)MathSciNetCrossRef Kappeler, A., Yoo, S., Dai, Q., Katsaggelos, A.K.: Video super-resolution with convolutional neural networks. IEEE Trans. Comput. Imaging 2(2), 109–122 (2016)MathSciNetCrossRef
30.
go back to reference Caballero, J., Ledig, C., Aitken, A., Acosta, A., Totz, J.: Real-time video super-resolution with spatio-temporal networks and motion compensation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017) Caballero, J., Ledig, C., Aitken, A., Acosta, A., Totz, J.: Real-time video super-resolution with spatio-temporal networks and motion compensation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
31.
go back to reference Makansi, O., Ilg, E., Brox, T.: End-to-end learning of video super-resolution with motion compensation. In: Proceedings of the German Conference on Pattern Recognition (GCPR) (2017) Makansi, O., Ilg, E., Brox, T.: End-to-end learning of video super-resolution with motion compensation. In: Proceedings of the German Conference on Pattern Recognition (GCPR) (2017)
32.
go back to reference Tao, X., Gao, H., Liao, R., Wang, J., Jia, J.: Detail-revealing deep video super-resolution. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017) Tao, X., Gao, H., Liao, R., Wang, J., Jia, J.: Detail-revealing deep video super-resolution. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017)
33.
go back to reference Liu, D., et al.: Robust video super-resolution with learned temporal dynamics. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017) Liu, D., et al.: Robust video super-resolution with learned temporal dynamics. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
34.
go back to reference Sajjadi, M.S.M., Vemulapalli, R., Brown, M.: Frame-recurrent video super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Sajjadi, M.S.M., Vemulapalli, R., Brown, M.: Frame-recurrent video super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
35.
go back to reference Cai, J.F., Ji, H., Liu, C., Shen, Z.: Blind motion deblurring using multiple images. J. Comput. Phys. 228(14), 5057–5071 (2009)MathSciNetCrossRef Cai, J.F., Ji, H., Liu, C., Shen, Z.: Blind motion deblurring using multiple images. J. Comput. Phys. 228(14), 5057–5071 (2009)MathSciNetCrossRef
36.
go back to reference Zhang, H., Wipf, D., Zhang, Y.: Multi-image blind deblurring using a coupled adaptive sparse prior. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2013) Zhang, H., Wipf, D., Zhang, Y.: Multi-image blind deblurring using a coupled adaptive sparse prior. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2013)
37.
go back to reference Cho, S., Cho, H., Tai, Y.W., Lee, S.: Registration based non-uniform motion deblurring. Comput. Graph. Forum 31(7), 2183–2192 (2012)CrossRef Cho, S., Cho, H., Tai, Y.W., Lee, S.: Registration based non-uniform motion deblurring. Comput. Graph. Forum 31(7), 2183–2192 (2012)CrossRef
38.
go back to reference Zhang, H., Carin, L.: Multi-shot imaging: joint alignment, deblurring and resolution-enhancement. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014) Zhang, H., Carin, L.: Multi-shot imaging: joint alignment, deblurring and resolution-enhancement. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014)
39.
go back to reference Zhang, H., Yang, J.: Intra-frame deblurring by leveraging inter-frame camera motion. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015) Zhang, H., Yang, J.: Intra-frame deblurring by leveraging inter-frame camera motion. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
40.
go back to reference Li, Y., Kang, S.B., Joshi, N., Seitz, S.M., Huttenlocher, D.P.: Generating sharp panoramas from motion-blurred videos. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010) Li, Y., Kang, S.B., Joshi, N., Seitz, S.M., Huttenlocher, D.P.: Generating sharp panoramas from motion-blurred videos. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010)
42.
go back to reference Kim, T.H., Nah, S., Lee, K.M.: Dynamic video deblurring using a locally adaptive linear blur model. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) (2018) Kim, T.H., Nah, S., Lee, K.M.: Dynamic video deblurring using a locally adaptive linear blur model. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) (2018)
43.
go back to reference Kim, T.H., Lee, K.M.: Segmentation-free dynamic scene deblurring. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014) Kim, T.H., Lee, K.M.: Segmentation-free dynamic scene deblurring. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014)
44.
go back to reference Nah, S., Kim, T.H., Lee, K.M.: Deep multi-scale convolutional neural network for dynamic scene deblurring. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017) Nah, S., Kim, T.H., Lee, K.M.: Deep multi-scale convolutional neural network for dynamic scene deblurring. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
45.
go back to reference Su, S., Delbracio, M., Wang, J., Sapiro, G., Heidrich, W., Wang, O.: Deep video deblurring. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017) Su, S., Delbracio, M., Wang, J., Sapiro, G., Heidrich, W., Wang, O.: Deep video deblurring. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
46.
go back to reference Wieschollek, P., Hirsch, M., Schölkopf, B., Lensch, H.P.: Learning blind motion deblurring. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017) Wieschollek, P., Hirsch, M., Schölkopf, B., Lensch, H.P.: Learning blind motion deblurring. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017)
47.
go back to reference Kim, T.H., Lee, K.M., Schölkopf, B., Hirsch, M.: Online video deblurring via dynamic temporal blending network. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017) Kim, T.H., Lee, K.M., Schölkopf, B., Hirsch, M.: Online video deblurring via dynamic temporal blending network. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017)
49.
go back to reference Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning (2015) Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning (2015)
50.
go back to reference Sajjadi, M.S.M., Schölkopf, B., Hirsch, M.: EnhanceNet: single image super-resolution through automated texture synthesis. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017) Sajjadi, M.S.M., Schölkopf, B., Hirsch, M.: EnhanceNet: single image super-resolution through automated texture synthesis. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017)
51.
go back to reference Kim, J., Kwon Lee, J., Mu Lee, K.: Accurate image super-resolution using very deep convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) Kim, J., Kwon Lee, J., Mu Lee, K.: Accurate image super-resolution using very deep convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
52.
go back to reference Pérez, J.S., Meinhardt-Llopis, E., Facciolo, G.: TV-L1 optical flow estimation. Image Process. On Line (IPOL) 3, 137–150 (2013)CrossRef Pérez, J.S., Meinhardt-Llopis, E., Facciolo, G.: TV-L1 optical flow estimation. Image Process. On Line (IPOL) 3, 137–150 (2013)CrossRef
Metadata
Title
Spatio-Temporal Transformer Network for Video Restoration
Authors
Tae Hyun Kim
Mehdi S. M. Sajjadi
Michael Hirsch
Bernhard Schölkopf
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-01219-9_7

Premium Partner