Published in: Optical Memory and Neural Networks 4/2023

1 December 2023

Video Codec Using Machine Learning Based on Parametric Orthogonal Filters

Author: M. V. Gashnikov


Abstract

This paper addresses video encoding with a machine-learning-based videoframe approximator. Neural networks and hierarchical classifiers are considered as the basis for such an approximator. Guided by a machine-learning-based hierarchical classifier, the approximator switches at each point of a videoframe between elementary approximators from a predefined set. Convolutional filters with parametric orthogonal kernels serve as the elementary approximators. An algorithm for optimizing the hierarchical classifier is presented; it relies on recursive recalculation of an entropy quality index, which closely approximates the encoded-data size. The approximator is intended for a video codec that uses nested representations of videoframes. Computational experiments on real video sequences indicate that the approximator with a hierarchical classifier and parametric orthogonal kernels noticeably reduces the size of the encoded data.
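The switching idea in the abstract can be illustrated with a minimal sketch. It is not the author's algorithm: the abstract does not give the filters, the classifier, or the exact entropy index, so everything below is an assumption chosen for illustration. Two simple neighbor predictors (`predict_left`, `predict_up`) stand in for the elementary approximators, a precomputed label map `classes` stands in for the classifier output, and the empirical Shannon entropy of the residuals stands in for the entropy quality index. The selection is greedy per class and makes no attempt at the recursive recalculation the abstract mentions.

```python
import numpy as np


def entropy_bits(residuals: np.ndarray) -> float:
    """Empirical Shannon entropy, in bits per sample, of integer residuals."""
    _, counts = np.unique(residuals, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def predict_left(frame: np.ndarray) -> np.ndarray:
    """Hypothetical elementary approximator: predict each pixel from its left neighbor."""
    pred = np.zeros_like(frame)
    pred[:, 1:] = frame[:, :-1]
    return pred


def predict_up(frame: np.ndarray) -> np.ndarray:
    """Hypothetical elementary approximator: predict each pixel from the pixel above."""
    pred = np.zeros_like(frame)
    pred[1:, :] = frame[:-1, :]
    return pred


def switch_by_class(frame, classes, approximators):
    """Greedily pick, for each class label, the approximator whose residuals
    have the lowest entropy, and assemble the switched residual frame."""
    preds = [approx(frame) for approx in approximators]
    choice = {}
    residual = np.zeros_like(frame)
    for label in np.unique(classes):
        mask = classes == label
        costs = [entropy_bits(frame[mask] - pred[mask]) for pred in preds]
        best = int(np.argmin(costs))
        choice[int(label)] = best
        residual[mask] = frame[mask] - preds[best][mask]
    return choice, residual


# Toy frame: the left half varies only along columns, the right half only along rows.
rows, cols = np.indices((8, 8))
frame = np.where(cols < 4, cols * cols, rows * rows).astype(np.int64)
classes = (cols >= 4).astype(int)  # assumed classifier output: 0 = left half, 1 = right half

choice, residual = switch_by_class(frame, classes, [predict_left, predict_up])
```

On this toy frame the per-class choice differs between the two halves, and the switched residuals have lower entropy than the residuals of either predictor applied to the whole frame, which is the effect the abstract attributes to switching between elementary approximators.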


Metadata
Title
Video Codec Using Machine Learning Based on Parametric Orthogonal Filters
Author
M. V. Gashnikov
Publication date
1 December 2023
Publisher
Pleiades Publishing
Published in
Optical Memory and Neural Networks / Issue 4/2023
Print ISSN: 1060-992X
Electronic ISSN: 1934-7898
DOI
https://doi.org/10.3103/S1060992X23040021
