Published in: International Journal of Machine Learning and Cybernetics 9/2019

19.11.2018 | Original Article

Depth estimation from infrared video using local-feature-flow neural network

Authors: Shouchuan Wu, Haitao Zhao, Shaoyuan Sun


Abstract

Depth estimation is essential for infrared video processing. In this paper, a novel depth estimation method, called the local-feature-flow neural network (LFFNN), is proposed for generating a depth map for each frame of an infrared video. LFFNN extracts the local features of a frame together with inter-frame features, which are extracted from the corresponding regions of previous frames in the video. LFFNN is designed to extract the local feature flow of the infrared video, learning better depth-related features by propagating inter-frame features through three control gates as the video progresses. After feature extraction, a pixel-level classifier estimates the depth level of each pixel in the infrared video. The proposed approach achieves state-of-the-art depth estimation performance on the test dataset.
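The abstract describes gated propagation of inter-frame features: each frame's local features are fused with features carried over from the previous frame, with three control gates regulating the flow. The paper's exact gate formulation is not given here, so the following is only a minimal GRU-style sketch of the idea; the class name, gate names, and dimensions are illustrative assumptions, not the authors' definitions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class FeatureFlowCell:
    """Hypothetical sketch of a cell that fuses a frame's local features
    with features propagated from the previous frame, regulated by three
    gates (reset, update, output) in the spirit of the paper's three
    control gates. Gate names and weights are assumptions."""

    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        # one weight matrix per gate, acting on [current; previous] features
        self.Wr = rng.standard_normal((dim, 2 * dim)) * 0.1
        self.Wz = rng.standard_normal((dim, 2 * dim)) * 0.1
        self.Wo = rng.standard_normal((dim, 2 * dim)) * 0.1
        self.Wh = rng.standard_normal((dim, 2 * dim)) * 0.1

    def step(self, x_t, h_prev):
        xh = np.concatenate([x_t, h_prev])
        r = sigmoid(self.Wr @ xh)   # reset: how much history to reuse
        z = sigmoid(self.Wz @ xh)   # update: blend old vs candidate features
        o = sigmoid(self.Wo @ xh)   # output: scale the emitted features
        cand = np.tanh(self.Wh @ np.concatenate([x_t, r * h_prev]))
        h_t = z * h_prev + (1.0 - z) * cand
        return o * h_t, h_t

# Propagate features across a short sequence of frames.
dim = 8
cell = FeatureFlowCell(dim)
h = np.zeros(dim)
for t in range(5):
    x_t = np.full(dim, 0.1 * t)  # stand-in for one region's local features
    y, h = cell.step(x_t, h)
print(y.shape)  # (8,)
```

In the paper's setting, `x_t` would be the local features extracted from a region of the current infrared frame, and the emitted features `y` would feed the pixel-level depth classifier; here both are placeholders.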

Metadata
Title
Depth estimation from infrared video using local-feature-flow neural network
Authors
Shouchuan Wu
Haitao Zhao
Shaoyuan Sun
Publication date
19.11.2018
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 9/2019
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-018-0891-9
