Published in: Optical Memory and Neural Networks 3/2023

01-09-2023

Machine Learning for Multiscale Video Coding

Author: M. V. Gashnikov

Abstract

This study applies machine learning algorithms to the multiscale coding of digital video sequences. A machine-learning-based digital image coder is generalized to the coding of video sequences. To this end, we propose an algorithm that uses linear regression to account for the dependency between videoframes. The generalized coder combines a multiscale representation of videoframes, three-dimensional neural network interpolation of the levels of that multiscale representation, and generative adversarial network replacement of homogeneous regions of a videoframe with synthetic video data. Block diagrams illustrate both the method for coding the entire video and the method for coding individual videoframes. A formal description of how the correlation between videoframes is taken into account is given. Numerical experiments are carried out on real video sequences. The experimental results suggest that the algorithm is promising for video coding and processing.
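
The abstract does not state the coder's equations, so the following is only a minimal sketch of the two ideas it names: using linear regression to exploit the dependency between consecutive videoframes, and building a multiscale representation by subsampling. It assumes a single global model `cur ≈ a * prev + b` fitted over all pixels and a pyramid formed by keeping every `step`-th sample. The function names (`fit_linear`, `residual`, `subsample`) and both assumptions are illustrative and are not taken from the paper, whose actual predictor and level structure may differ.

```python
def fit_linear(prev, cur):
    """Least-squares fit of cur ≈ a * prev + b over all pixels (assumed model)."""
    xs = [p for row in prev for p in row]
    ys = [c for row in cur for c in row]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # Flat previous frame: fall back to a constant predictor.
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var_x
    return a, mean_y - a * mean_x


def residual(prev, cur, a, b):
    """Prediction error that a coder would quantize and encode instead of cur."""
    return [[c - (a * p + b) for p, c in zip(row_p, row_c)]
            for row_p, row_c in zip(prev, cur)]


def subsample(frame, step):
    """One level of a multiscale representation: keep every step-th sample."""
    return [row[::step] for row in frame[::step]]
```

In this sketch, a frame that is an exact linear function of its predecessor yields an all-zero residual, so only the two coefficients would need to be transmitted. Calling `subsample` with steps 4, 2 and 1 gives progressively finer levels of the same frame.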

Metadata
Title
Machine Learning for Multiscale Video Coding
Author
M. V. Gashnikov
Publication date
01-09-2023
Publisher
Pleiades Publishing
Published in
Optical Memory and Neural Networks / Issue 3/2023
Print ISSN: 1060-992X
Electronic ISSN: 1934-7898
DOI
https://doi.org/10.3103/S1060992X23030037
