Skip to main content
Erschienen in:
Buchtitelbild

2021 | OriginalPaper | Buchkapitel

Adaptive Future Frame Prediction with Ensemble Network

verfasst von : Wonjik Kim, Masayuki Tanaka, Masatoshi Okutomi, Yoko Sasaki

Erschienen in: Pattern Recognition. ICPR International Workshops and Challenges

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Future frame prediction in videos is a challenging problem because videos include complicated movements and large appearance changes. Learning-based future frame prediction approaches have been proposed in kinds of literature. A common limitation of the existing learning-based approaches is a mismatch of training data and test data. In the future frame prediction task, we can obtain the ground truth data by just waiting for a few frames. It means we can update the prediction model online in the test phase. Then, we propose an adaptive update framework for the future frame prediction task. The proposed adaptive updating framework consists of a pre-trained prediction network, a continuous-updating prediction network, and a weight estimation network. We also show that our pre-trained prediction model achieves comparable performance to the existing state-of-the-art approaches. We demonstrate that our approach outperforms existing methods especially for dynamically changing scenes.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Literatur
2.
Zurück zum Zitat Bovik, A.C.: The Essential Guide to Video Processing. Academic Press, Cambridge (2009) Bovik, A.C.: The Essential Guide to Video Processing. Academic Press, Cambridge (2009)
3.
Zurück zum Zitat Cordts, M., et al.: The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) Cordts, M., et al.: The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
4.
Zurück zum Zitat Ditzler, G., Rosen, G., Polikar, R.: Domain adaptation bounds for multiple expert systems under concept drift. In: 2014 International Joint Conference on Neural Networks (IJCNN), pp. 595–601. IEEE (2014) Ditzler, G., Rosen, G., Polikar, R.: Domain adaptation bounds for multiple expert systems under concept drift. In: 2014 International Joint Conference on Neural Networks (IJCNN), pp. 595–601. IEEE (2014)
5.
Zurück zum Zitat Dollár, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: a benchmark. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 304–311. IEEE (2009) Dollár, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: a benchmark. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 304–311. IEEE (2009)
6.
Zurück zum Zitat Gao, H., Xu, H., Cai, Q.Z., Wang, R., Yu, F., Darrell, T.: Disentangling propagation and generation for video prediction. In: The IEEE International Conference on Computer Vision (ICCV) (October 2019) Gao, H., Xu, H., Cai, Q.Z., Wang, R., Yu, F., Darrell, T.: Disentangling propagation and generation for video prediction. In: The IEEE International Conference on Computer Vision (ICCV) (October 2019)
7.
Zurück zum Zitat Geiger, A., Lenz, P., Stiller, C., Urtasun, R.: Vision meets robotics: the KITTI dataset. Int. J. Robot. Res. (IJRR) 32, 1231–1237 (2013)CrossRef Geiger, A., Lenz, P., Stiller, C., Urtasun, R.: Vision meets robotics: the KITTI dataset. Int. J. Robot. Res. (IJRR) 32, 1231–1237 (2013)CrossRef
8.
Zurück zum Zitat Ghazikhani, A., Monsefi, R., Yazdi, H.S.: Online neural network model for non-stationary and imbalanced data stream classification. Int. J. Mach. Learn. Cybern. 5(1), 51–62 (2014)CrossRef Ghazikhani, A., Monsefi, R., Yazdi, H.S.: Online neural network model for non-stationary and imbalanced data stream classification. Int. J. Mach. Learn. Cybern. 5(1), 51–62 (2014)CrossRef
9.
Zurück zum Zitat Kim, Y., Nam, S., Cho, I., Kim, S.J.: Unsupervised keypoint learning for guiding class-conditional video prediction. Adv. Neural Inf. Process. Syst. 32, 3814–3824 (2019) Kim, Y., Nam, S., Cho, I., Kim, S.J.: Unsupervised keypoint learning for guiding class-conditional video prediction. Adv. Neural Inf. Process. Syst. 32, 3814–3824 (2019)
11.
Zurück zum Zitat Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105 (2012) Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105 (2012)
12.
Zurück zum Zitat Kuncheva, L.I., Žliobaitė, I.: On the window size for classification in changing environments. Intell. Data Anal. 13(6), 861–872 (2009)CrossRef Kuncheva, L.I., Žliobaitė, I.: On the window size for classification in changing environments. Intell. Data Anal. 13(6), 861–872 (2009)CrossRef
13.
Zurück zum Zitat Liu, Z., Yeh, R.A., Tang, X., Liu, Y., Agarwala, A.: Video frame synthesis using deep voxel flow. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4463–4471 (2017) Liu, Z., Yeh, R.A., Tang, X., Liu, Y., Agarwala, A.: Video frame synthesis using deep voxel flow. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4463–4471 (2017)
14.
Zurück zum Zitat Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015) Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
15.
Zurück zum Zitat Lotter, W., Kreiman, G., Cox, D.: Deep predictive coding networks for video prediction and unsupervised learning. arXiv preprint arXiv:1605.08104 (2016) Lotter, W., Kreiman, G., Cox, D.: Deep predictive coding networks for video prediction and unsupervised learning. arXiv preprint arXiv:​1605.​08104 (2016)
16.
Zurück zum Zitat Menze, M., Geiger, A.: Object scene flow for autonomous vehicles. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3061–3070 (2015) Menze, M., Geiger, A.: Object scene flow for autonomous vehicles. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3061–3070 (2015)
17.
Zurück zum Zitat Minku, L.L., Yao, X.: DDD: a new ensemble approach for dealing with concept drift. IEEE Trans. Knowl. Data Eng. 24(4), 619–633 (2011)CrossRef Minku, L.L., Yao, X.: DDD: a new ensemble approach for dealing with concept drift. IEEE Trans. Knowl. Data Eng. 24(4), 619–633 (2011)CrossRef
19.
Zurück zum Zitat Pérez-Sánchez, B., Fontenla-Romero, O., Guijarro-Berdiñas, B.: A review of adaptive online learning for artificial neural networks. Artif. Intell. Rev. 49(2), 281–299 (2018)CrossRef Pérez-Sánchez, B., Fontenla-Romero, O., Guijarro-Berdiñas, B.: A review of adaptive online learning for artificial neural networks. Artif. Intell. Rev. 49(2), 281–299 (2018)CrossRef
21.
Zurück zum Zitat Soomro, K., Zamir, A.R., Shah, M.: Ucf101: a dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:1212.0402 (2012) Soomro, K., Zamir, A.R., Shah, M.: Ucf101: a dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:​1212.​0402 (2012)
22.
Zurück zum Zitat Tekalp, A.M., Tekalp, A.M.: Digital Video Processing, vol. 1. Prentice Hall PTR, Upper Saddle river (1995) Tekalp, A.M., Tekalp, A.M.: Digital Video Processing, vol. 1. Prentice Hall PTR, Upper Saddle river (1995)
23.
Zurück zum Zitat Wang, J., Wang, W., Gao, W.: Predicting diverse future frames with local transformation-guided masking. IEEE Trans. Circuits Syst. Video Technol. 29(12), 3531–3543 (2018)CrossRef Wang, J., Wang, W., Gao, W.: Predicting diverse future frames with local transformation-guided masking. IEEE Trans. Circuits Syst. Video Technol. 29(12), 3531–3543 (2018)CrossRef
24.
Zurück zum Zitat Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)CrossRef Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)CrossRef
25.
Zurück zum Zitat Xingjian, S., Chen, Z., Wang, H., Yeung, D.Y., Wong, W.K., Woo, W.C.: Convolutional LSTM network: a machine learning approach for precipitation nowcasting. In: Advances in Neural Information Processing Systems, pp. 802–810 (2015) Xingjian, S., Chen, Z., Wang, H., Yeung, D.Y., Wong, W.K., Woo, W.C.: Convolutional LSTM network: a machine learning approach for precipitation nowcasting. In: Advances in Neural Information Processing Systems, pp. 802–810 (2015)
26.
Zurück zum Zitat Yan, J., Qin, G., Zhao, R., Liang, Y., Xu, Q.: Mixpred: video prediction beyond optical flow. IEEE Access 7, 185654–185665 (2019)CrossRef Yan, J., Qin, G., Zhao, R., Liang, Y., Xu, Q.: Mixpred: video prediction beyond optical flow. IEEE Access 7, 185654–185665 (2019)CrossRef
27.
Zurück zum Zitat Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: CVPR (2018) Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: CVPR (2018)
Metadaten
Titel
Adaptive Future Frame Prediction with Ensemble Network
verfasst von
Wonjik Kim
Masayuki Tanaka
Masatoshi Okutomi
Yoko Sasaki
Copyright-Jahr
2021
DOI
https://doi.org/10.1007/978-3-030-68787-8_1