
2018 | OriginalPaper | Book chapter

Temporal Semantic Motion Segmentation Using Spatio Temporal Optimization

Authors: Nazrul Haque, N. Dinesh Reddy, Madhava Krishna

Published in: Energy Minimization Methods in Computer Vision and Pattern Recognition

Publisher: Springer International Publishing


Abstract

Segmenting moving objects in a video sequence is a challenging problem and is critical to outdoor robotic navigation. While recent literature has focused on regularizing object labels over a sequence of frames, few approaches exploit spatio-temporal features for motion segmentation. In real-world dynamic scenes in particular, existing approaches fail to exploit temporal consistency when segmenting moving objects under large camera motion.
In this paper, we present an approach that exploits semantic information and temporal constraints in a joint framework for motion segmentation in video. We propose a formulation for inferring per-frame joint semantic and motion labels, using semantic potentials from a dilated CNN framework and motion potentials from depth and geometric constraints. We integrate these potentials into a 3D (space-time) fully connected CRF framework with overlapping/connected blocks. We solve for a feature-space embedding in the spatio-temporal domain by enforcing temporal constraints from optical flow and long-term tracks as a least-squares problem. We evaluate our approach on two outdoor driving benchmarks, the KITTI and Cityscapes datasets.
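
The abstract does not give the exact objective, so the following is only a minimal sketch of how temporal constraints could be posed as a least-squares problem over point features. It assumes each point has an initial feature (for example its x, y, t position), that optical flow or tracks supply index pairs of corresponding points, and that corresponding points should end up close in the embedding. The function name `optimize_features`, the `weight` parameter, and the quadratic form are illustrative assumptions, not the authors' formulation.

```python
import numpy as np


def optimize_features(initial, pairs, weight=1.0):
    """Hypothetical sketch: least-squares feature embedding.

    Minimizes  sum_i ||F_i - P_i||^2 + weight * sum_(i,j) ||F_i - F_j||^2
    where P holds the initial features and (i, j) are corresponding
    points (assumed to come from optical flow or long-term tracks).
    Setting the gradient to zero gives the linear system
    (I + weight * L) F = P, with L the graph Laplacian of the pairs.
    """
    initial = np.asarray(initial, dtype=float)
    n = initial.shape[0]

    # Build the graph Laplacian of the correspondence graph.
    laplacian = np.zeros((n, n))
    for i, j in pairs:
        laplacian[i, i] += 1.0
        laplacian[j, j] += 1.0
        laplacian[i, j] -= 1.0
        laplacian[j, i] -= 1.0

    # The system matrix is symmetric positive definite, so it is solvable.
    system = np.eye(n) + weight * laplacian
    return np.linalg.solve(system, initial)


# Toy usage: one point tracked over three frames, features are (x, y, t).
track = np.array([[10.0, 5.0, 0.0],
                  [14.0, 5.0, 1.0],
                  [18.0, 5.0, 2.0]])
embedded = optimize_features(track, pairs=[(0, 1), (1, 2)], weight=2.0)
print(embedded)
```

In this sketch a larger `weight` pulls corresponding points closer together in feature space, which is the property a space-time pairwise kernel would need in order to link the same object across frames. A dense solve is used only for brevity; a real video would call for a sparse solver.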


Metadata
Title
Temporal Semantic Motion Segmentation Using Spatio Temporal Optimization
Authors
Nazrul Haque
N. Dinesh Reddy
Madhava Krishna
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-78199-0_7