2016 | Original Paper | Book Chapter

Fast, Exact and Multi-scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs

Authors: Siddhartha Chandra, Iasonas Kokkinos

Published in: Computer Vision – ECCV 2016

Publisher: Springer International Publishing

Abstract

In this work we propose a structured prediction technique that combines the virtues of Gaussian Conditional Random Fields (G-CRF) with Deep Learning: (a) our structured prediction task has a unique global optimum that is obtained exactly from the solution of a linear system; (b) the gradients of our model parameters are computed analytically using closed-form expressions, in contrast to the memory-demanding contemporary deep structured prediction approaches [1, 2] that rely on back-propagation-through-time; (c) our pairwise terms do not have to be simple hand-crafted expressions, as in the line of works building on the DenseCRF [1, 3], but can rather be 'discovered' from data through deep architectures; and (d) our system can be trained in an end-to-end manner. Building on standard tools from numerical analysis, we develop very efficient algorithms for inference and learning, as well as a customized technique adapted to the semantic segmentation task. This efficiency allows us to explore more sophisticated architectures for structured prediction in deep learning: we introduce multi-resolution architectures to couple information across scales in a joint optimization framework, yielding systematic improvements. We demonstrate the utility of our approach on the challenging PASCAL VOC 2012 image segmentation benchmark, showing substantial improvements over strong baselines. We make all of our code and experiments available at https://github.com/siddharthachandra/gcrf.
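To make points (a) and (b) concrete, here is a minimal sketch of how inference can reduce to a linear solve. It assumes a quadratic energy of the form E(x) = 0.5 x^T (A + lam I) x - b^T x, with A symmetric positive semi-definite and lam > 0; the abstract does not spell out this form, and the names `gcrf_inference`, `grad_wrt_unary`, `A`, `b` and `lam` are hypothetical. Conjugate gradient is used here only as one standard numerical-analysis tool for such systems, not necessarily the solver the authors chose.

```python
# Sketch only: exact inference for a toy Gaussian CRF in pure Python.
# ASSUMPTION (not stated in the abstract): the energy is the quadratic
#   E(x) = 0.5 * x^T (A + lam*I) x - b^T x
# with A symmetric positive semi-definite and lam > 0, so its unique
# minimiser solves the linear system (A + lam*I) x = b.
# All function and variable names below are hypothetical.

def matvec(M, v):
    """Dense matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(M, b, tol=1e-10, max_iter=1000):
    """Solve M x = b for a symmetric positive-definite matrix M."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - M x for the initial x = 0
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Mp = matvec(M, p)
        alpha = rs_old / dot(p, Mp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * mi for ri, mi in zip(r, Mp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

def build_system(A, lam):
    """Return A + lam*I as a new dense matrix."""
    n = len(A)
    return [[A[i][j] + (lam if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def gcrf_inference(A, b, lam=1.0):
    """Exact minimiser of the assumed quadratic energy."""
    return conjugate_gradient(build_system(A, lam), b)

def grad_wrt_unary(A, grad_x, lam=1.0):
    """Closed-form gradient of a loss with respect to b.

    Because x = (A + lam*I)^{-1} b and the matrix is symmetric,
    dL/db = (A + lam*I)^{-1} dL/dx, i.e. one more linear solve.
    """
    return conjugate_gradient(build_system(A, lam), grad_x)

# Toy example: a 3-node chain whose pairwise matrix is a graph Laplacian,
# which encourages neighbouring nodes to take similar scores.
A = [[1.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 1.0]]
b = [1.0, 0.0, -1.0]         # unary scores
x = gcrf_inference(A, b, lam=1.0)
print(x)
```

For this toy chain the solve returns scores close to [0.5, 0.0, -0.5]. The backward pass in the sketch reuses the same system matrix, so under the stated assumption no unrolled iterations need to be stored, which is the contrast with back-propagation-through-time drawn in the abstract.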

References
1. Zheng, S., Jayasumana, S., Romera-Paredes, B., Vineet, V., Su, Z., Du, D., Huang, C., Torr, P.: Conditional random fields as recurrent neural networks. In: ICCV (2015)
2. Vemulapalli, R., Tuzel, O., Liu, M.Y., Chellappa, R.: Gaussian conditional random field network for semantic segmentation. In: CVPR, June 2016
3. Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Semantic image segmentation with deep convolutional nets and fully connected CRFs. arXiv preprint arXiv:1412.7062 (2014)
4. Farabet, C., Couprie, C., Najman, L., LeCun, Y.: Learning hierarchical features for scene labeling. PAMI 35, 1915–1929 (2013)
5. Mostajabi, M., Yadollahpour, P., Shakhnarovich, G.: Feedforward semantic segmentation with zoom-out features. In: CVPR (2015)
6. Hariharan, B., Arbeláez, P., Girshick, R., Malik, J.: Hypercolumns for object segmentation and fine-grained localization. In: CVPR (2015)
7. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: CVPR, pp. 3431–3440 (2015)
8. Farabet, C., Couprie, C., Najman, L., LeCun, Y.: Scene parsing with multiscale feature learning, purity trees, and optimal covers. In: ICML (2012)
9. Chen, L.C., Schwing, A.G., Yuille, A.L., Urtasun, R.: Learning deep structured models. In: ICML (2015)
10. Vemulapalli, R., Tuzel, O., Liu, M.: Deep Gaussian conditional random field network: a model-based deep network for discriminative denoising. In: CVPR (2016)
11. Ionescu, C., Vantzos, O., Sminchisescu, C.: Matrix backpropagation for deep networks with structured layers. In: ICCV (2015)
12. Krähenbühl, P., Koltun, V.: Efficient inference in fully connected CRFs with Gaussian edge potentials. In: NIPS (2011)
13. Couprie, C.: Multi-label energy minimization for object class segmentation. In: 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO), pp. 2233–2237. IEEE (2012)
14. Lin, G., Shen, C., Reid, I.D., van den Hengel, A.: Efficient piecewise training of deep structured models for semantic segmentation. In: CVPR (2016)
15. Liu, Z., Li, X., Luo, P., Loy, C.C., Tang, X.: Semantic image segmentation via deep parsing network. In: CVPR, pp. 1377–1385 (2015)
16. Tappen, M.F., Liu, C., Adelson, E.H., Freeman, W.T.: Learning Gaussian conditional random fields for low-level vision. In: CVPR (2007)
17. Jancsary, J., Nowozin, S., Sharp, T., Rother, C.: Regression tree fields - an efficient, non-parametric approach to image labeling problems. In: CVPR (2012)
18. Vu, T.H., Osokin, A., Laptev, I.: Context-aware CNNs for person head detection. In: ICCV, pp. 2893–2901 (2015)
20. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes in C, 2nd edn. Cambridge University Press, New York (1992)
21. Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins University Press, Baltimore (1996)
22. Grady, L.: Random walks for image segmentation. PAMI 28, 1768–1783 (2006)
23. Golub, G.H., Van Loan, C.F.: Matrix computations. 3(1–2), 510 (1996)
24. Rue, H., Held, L.: Gaussian Markov Random Fields: Theory and Applications. Monographs on Statistics and Applied Probability, vol. 104. Chapman & Hall, London (2005)
25. Wainwright, M.J., Jordan, M.I.: Graphical models, exponential families, and variational inference. Found. Trends Mach. Learn. 1(1–2), 136–138 (2008)
26. Chen, L., Yang, Y., Wang, J., Xu, W., Yuille, A.L.: Attention to scale: scale-aware semantic image segmentation. In: CVPR (2016)
27. Kokkinos, I.: Pushing the boundaries of boundary detection using deep learning. In: ICLR (2016)
28. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: ECCV (2014)
29. Chen, L.C., Papandreou, G., Murphy, K., Yuille, A.L.: Weakly- and semi-supervised learning of a deep convolutional network for semantic image segmentation. In: ICCV (2015)
30. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: ICLR (2015)
31. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
32. Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. arXiv:1606.00915 (2016)
33. Eigen, D., Fergus, R.: Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. In: ICCV, pp. 2650–2658 (2015)
34. Kokkinos, I.: Ubernet: a universal CNN for the joint treatment of low-, mid-, and high-level vision problems. In: POCV Workshop (2016)

Metadata
Title
Fast, Exact and Multi-scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs
Authors
Siddhartha Chandra
Iasonas Kokkinos
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-46478-7_25
