2021 | OriginalPaper | Chapter

Adaptive Future Frame Prediction with Ensemble Network

Authors: Wonjik Kim, Masayuki Tanaka, Masatoshi Okutomi, Yoko Sasaki

Published in: Pattern Recognition. ICPR International Workshops and Challenges

Publisher: Springer International Publishing

Abstract

Future frame prediction in videos is a challenging problem because videos include complicated movements and large appearance changes. Learning-based approaches to future frame prediction have been proposed in the literature. A common limitation of existing learning-based approaches is a mismatch between training data and test data. In the future frame prediction task, however, the ground truth can be obtained simply by waiting a few frames, which means the prediction model can be updated online during the test phase. We therefore propose an adaptive update framework for the future frame prediction task. The proposed framework consists of a pre-trained prediction network, a continuously updated prediction network, and a weight estimation network. We also show that our pre-trained prediction model achieves performance comparable to existing state-of-the-art approaches, and we demonstrate that our approach outperforms existing methods, especially for dynamically changing scenes.
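
The online-update idea in the abstract can be illustrated with a toy sketch. It is not the chapter's implementation: the networks themselves are not described on this page, so every name below is hypothetical. The two prediction networks are replaced by scalar-gain predictors (`GainPredictor`), and the weight estimation network is replaced by a simple sigmoid over the two predictors' most recent errors (`online_weight`). The sketch only shows the control flow: predict, wait for the true frame, score both predictors, then update the online one.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized frames (lists of floats)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


class GainPredictor:
    """Toy predictor: next frame = gain * current frame (hypothetical stand-in)."""

    def __init__(self, gain):
        self.gain = gain

    def predict(self, frame):
        return [self.gain * p for p in frame]

    def sgd_step(self, frame, target, lr):
        # Gradient of the MSE between gain * frame and target with respect to gain.
        grad = sum(2.0 * (self.gain * p - t) * p
                   for p, t in zip(frame, target)) / len(frame)
        self.gain -= lr * grad


def online_weight(err_fixed, err_online, temperature=0.01):
    """Blend weight for the online predictor (assumed rule, not a learned network)."""
    z = (err_fixed - err_online) / temperature
    z = max(-50.0, min(50.0, z))  # clamp to keep math.exp in range
    return 1.0 / (1.0 + math.exp(-z))


def run_adaptive(frames, fixed, online, lr=0.5):
    """Predict each next frame, then adapt once its ground truth has arrived."""
    err_fixed = err_online = 0.0
    losses = []
    for prev, cur in zip(frames, frames[1:]):
        p_fixed = fixed.predict(prev)
        p_online = online.predict(prev)
        w = online_weight(err_fixed, err_online)
        blended = [(1.0 - w) * a + w * b for a, b in zip(p_fixed, p_online)]
        # The true frame `cur` is now available: score, then update online only.
        losses.append(mse(blended, cur))
        err_fixed, err_online = mse(p_fixed, cur), mse(p_online, cur)
        online.sgd_step(prev, cur, lr)
    return losses


# Toy "test-time" video whose dynamics (gain 0.9) differ from the
# "pre-trained" gain of 1.0 that both predictors start from.
frames = [[1.0 * 0.9 ** k, 0.5 * 0.9 ** k] for k in range(30)]
fixed = GainPredictor(1.0)   # never updated
online = GainPredictor(1.0)  # updated after every frame
losses = run_adaptive(frames, fixed, online)
```

In this toy setting the fixed predictor keeps its original gain, while the online predictor drifts toward the dynamics of the test sequence, so the blend increasingly favours it. Keeping the fixed predictor in the ensemble is one plausible way to guard against an online model that has adapted badly, though the page does not state the authors' rationale.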

Metadata
Title
Adaptive Future Frame Prediction with Ensemble Network
Authors
Wonjik Kim
Masayuki Tanaka
Masatoshi Okutomi
Yoko Sasaki
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-68787-8_1
