Published in: Neural Processing Letters, Issue 2/2019

14 May 2018

Stacked Marginal Time Warping for Temporal Alignment

Authors: Xiang Zhang, Liquan Nie, Long Lan, Xuhui Huang, Zhigang Luo



Abstract

Time warping is a popular technique for temporally aligning two sequences and has been successfully applied to temporal alignment tasks such as activity recognition. However, existing time warping methods suffer from limited representation ability because the alignment is performed either on the raw sequences or on projected lower-dimensional features. In this paper, we propose a stacked time warping (STW) framework that learns layer-wise representations for temporal alignment in a stacked structure. With this structure, STW offers more flexibility than existing methods while unifying them in a deep architecture. Based on the STW framework, we explore a stacked marginal time warping (SMTW) method that uses the marginalized stacked denoising autoencoder (mSDA) as the regularization term. This enables SMTW to marginalize out noise and to learn layer-wise non-linear representations with an efficient closed-form solution. Benefiting from the incorporation of mSDA, SMTW achieves better alignment performance while keeping time efficiency comparable to that of regular time warping methods. Experiments on both synthetic data and real-world human activity recognition datasets demonstrate that SMTW quantitatively outperforms state-of-the-art time warping methods.
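
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of classic dynamic time warping on two one-dimensional sequences. It is not the SMTW method of the paper: it omits the stacked representation learning and the mSDA regularizer entirely. The function name `dtw`, the squared-difference local cost, and the three-step neighbourhood are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def dtw(x: Sequence[float], y: Sequence[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the accumulated cost and warping path aligning x with y.

    Illustrative sketch only: squared difference as local cost, and the
    standard (diagonal, up, left) step pattern.
    """
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (x[i - 1] - y[j - 1]) ** 2
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # match both frames
                cost[i - 1][j],      # repeat the current frame of y
                cost[i][j - 1],      # repeat the current frame of x
            )

    # Backtrack from the end to recover one optimal warping path.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    total, path = dtw([0.0, 1.0, 2.0], [0.0, 1.0, 1.0, 2.0])
    print(total, path)
```

In this sketch the alignment operates directly on raw values, which is the setting the abstract describes as limited in representation ability; methods such as SMTW instead align learned features.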

Metadata
Title
Stacked Marginal Time Warping for Temporal Alignment
Authors
Xiang Zhang
Liquan Nie
Long Lan
Xuhui Huang
Zhigang Luo
Publication date
14 May 2018
Publisher
Springer US
Published in
Neural Processing Letters, Issue 2/2019
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-018-9834-4
