Published in: Wireless Personal Communications 4/2014

1 February 2014

Online Dictionary Learning Based Intra-frame Video Coding

Authors: Yipeng Sun, Mai Xu, Xiaoming Tao, Jianhua Lu

Abstract

In this paper, we propose an online-learning-based intra-frame video coding approach that exploits the texture sparsity of natural images. The proposed method learns basic texture elements from previous frames with guaranteed convergence, yielding dictionaries that represent incoming frames more sparsely. Thanks to online learning, the proposed online dictionary learning based codec (ODL codec) needs to transmit fewer non-zero coefficients as more video frames are coded. These non-zero coefficients for image patches are then quantized and coded, together with dictionary synchronization. Experimental results show that the number of non-zero coefficients per frame decreases rapidly as more frames are encoded. Compared with off-line training, the ODL codec, which learns from the video on the fly, reduces computational complexity and converges quickly. Finally, the rate-distortion performance shows a PSNR improvement over K-SVD dictionary based compression and H.264/AVC for intra-frame video at low bit rates.
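The idea of coding each frame with the current dictionary and then refining the dictionary from that frame can be illustrated with a minimal NumPy sketch. This is not the authors' codec: the sparse coder is a plain orthogonal matching pursuit with an error threshold, the dictionary update is a block-coordinate step in the style of online dictionary learning with atoms renormalized to unit length, and the "frames" are synthetic patches. Quantization, entropy coding, and dictionary synchronization are omitted. All function names and parameters (`sparse_code`, `update_dictionary`, `encode_frames`, `make_frames`, `tol`) are hypothetical.

```python
import numpy as np

def sparse_code(D, x, tol, max_atoms):
    """Greedy OMP (assumed coder): add atoms until the residual norm is <= tol."""
    residual, support, sol = x.copy(), [], np.zeros(0)
    while np.linalg.norm(residual) > tol and len(support) < max_atoms:
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:          # no new atom helps; stop early
            break
        support.append(k)
        # Least-squares fit of x on the atoms selected so far
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    if support:
        coef[support] = sol
    return coef

def update_dictionary(D, A, B, eps=1e-12):
    """One block-coordinate pass over the atoms using accumulated statistics."""
    for j in range(D.shape[1]):
        if A[j, j] < eps:         # atom never used so far; leave it unchanged
            continue
        u = (B[:, j] - D @ A[:, j]) / A[j, j] + D[:, j]
        norm = np.linalg.norm(u)
        if norm > eps:
            D[:, j] = u / norm    # assumption: keep atoms at unit norm for OMP
    return D

def encode_frames(frames, n_atoms, tol, seed=0):
    """Code each frame with the current dictionary, then refine the dictionary."""
    rng = np.random.default_rng(seed)
    dim = frames[0].shape[0]
    D = rng.standard_normal((dim, n_atoms))
    D /= np.linalg.norm(D, axis=0)
    A = np.zeros((n_atoms, n_atoms))   # running sum of code outer products
    B = np.zeros((dim, n_atoms))       # running sum of patch-code outer products
    nnz_per_frame = []
    for X in frames:                   # X holds one patch per column
        codes = np.column_stack([sparse_code(D, x, tol, dim) for x in X.T])
        nnz_per_frame.append(int(np.count_nonzero(codes)))
        A += codes @ codes.T
        B += X @ codes.T
        D = update_dictionary(D, A, B)
    return D, nnz_per_frame

def make_frames(n_frames, n_patches, dim, n_atoms, sparsity, rng):
    """Synthetic stand-in for video patches drawn from a hidden sparse model."""
    truth = rng.standard_normal((dim, n_atoms))
    truth /= np.linalg.norm(truth, axis=0)
    frames = []
    for _ in range(n_frames):
        X = np.zeros((dim, n_patches))
        for i in range(n_patches):
            idx = rng.choice(n_atoms, size=sparsity, replace=False)
            X[:, i] = truth[:, idx] @ rng.standard_normal(sparsity)
        frames.append(X)
    return frames

if __name__ == "__main__":
    frames = make_frames(6, 100, 16, 32, 3, np.random.default_rng(1))
    _, nnz = encode_frames(frames, n_atoms=32, tol=0.1)
    print("non-zero coefficients per frame:", nnz)
```

Running the script prints the number of non-zero coefficients needed per frame at a fixed error threshold, which is the quantity the abstract reports as decreasing while more frames are encoded. Whether and how fast it drops in this toy setting depends on the synthetic data and the assumed parameters.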

Footnotes
1
The state-of-the-art K-SVD dictionary is learned off-line from a large amount of training data.
Metadata
Title
Online Dictionary Learning Based Intra-frame Video Coding
Authors
Yipeng Sun
Mai Xu
Xiaoming Tao
Jianhua Lu
Publication date
1 February 2014
Publisher
Springer US
Published in
Wireless Personal Communications / Issue 4/2014
Print ISSN: 0929-6212
Electronic ISSN: 1572-834X
DOI
https://doi.org/10.1007/s11277-013-1577-y