
2017 | OriginalPaper | Chapter

Joint Training of Generic CNN-CRF Models with Stochastic Optimization

Authors: A. Kirillov, D. Schlesinger, S. Zheng, B. Savchynskyy, P. H. S. Torr, C. Rother

Published in: Computer Vision – ACCV 2016

Publisher: Springer International Publishing

Abstract

We propose a new end-to-end CNN-CRF learning framework based on joint stochastic optimization with respect to the parameters of both the Convolutional Neural Network (CNN) and the Conditional Random Field (CRF). While stochastic gradient descent is a standard technique for CNN training, it has not yet been used for joint models. We show that our learning method is (i) general, i.e., it applies to arbitrary CNN and CRF architectures and potential functions; (ii) scalable, i.e., it has a low memory footprint and parallelizes straightforwardly on GPUs; and (iii) easy to implement. Additionally, the unified CNN-CRF optimization approach simplifies a potential hardware implementation. We empirically evaluate our method on the task of semantic labeling of body parts in depth images and show that it compares favorably to competing techniques.
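To make the idea of joint stochastic optimization concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy binary chain CRF in which a single weight `w` stands in for the CNN that produces unary scores and `v` is a Potts-style pairwise weight. The model expectation in the log-likelihood gradient is replaced by one sample obtained from a few Gibbs sweeps started at the ground truth (a contrastive-divergence-style choice made here for illustration). All names (`gibbs_sweep`, `features`, `make_example`, `train`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gibbs_sweep(y, x, w, v, rng):
    """One in-place Gibbs sweep over a binary chain CRF.

    Score of a labeling: sum_i w * x[i] * y[i] + v * #(equal neighbours).
    """
    n = len(y)
    for i in range(n):
        # Score of y[i] = 1 minus score of y[i] = 0, given the neighbours.
        delta = w * x[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                delta += v if y[j] == 1 else -v
        y[i] = 1 if rng.random() < sigmoid(delta) else 0
    return y


def features(y, x):
    """Sufficient statistics whose weights are w (unary) and v (pairwise)."""
    f_w = sum(xi * yi for xi, yi in zip(x, y))
    f_v = sum(1 for a, b in zip(y, y[1:]) if a == b)
    return f_w, f_v


def make_example(n, rng):
    """Synthetic data (assumption): two segments, noisy +/-1 observations."""
    boundary = rng.randrange(5, n - 5)
    y = [0] * boundary + [1] * (n - boundary)
    x = [(1.0 if yi else -1.0) + rng.gauss(0.0, 0.5) for yi in y]
    return x, y


def train(data, steps=2000, lr=0.01, sweeps=2, seed=0):
    """Stochastic gradient ascent on the log-likelihood in w and v jointly."""
    rng = random.Random(seed)
    w, v = 0.0, 0.0
    for _step in range(steps):
        x, y_true = data[rng.randrange(len(data))]
        # Approximate a model sample with a short Gibbs chain from the truth.
        y_sample = list(y_true)
        for _sweep in range(sweeps):
            gibbs_sweep(y_sample, x, w, v, rng)
        fw_t, fv_t = features(y_true, x)
        fw_s, fv_s = features(y_sample, x)
        # Gradient estimate: data statistics minus sampled model statistics.
        w += lr * (fw_t - fw_s)
        v += lr * (fv_t - fv_s)
    return w, v


if __name__ == "__main__":
    rng = random.Random(1)
    data = [make_example(20, rng) for _ in range(10)]
    print(train(data))
```

In this sketch both parameter groups receive the same kind of sampled gradient signal. In a real CNN-CRF model the unary update would presumably be backpropagated through the network rather than applied to a single weight.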

Footnotes
2
We use the commonly adopted terminology from the CNN literature for technical details, to allow reproducibility of our results.
 
Metadata
Title: Joint Training of Generic CNN-CRF Models with Stochastic Optimization
Authors: A. Kirillov, D. Schlesinger, S. Zheng, B. Savchynskyy, P. H. S. Torr, C. Rother
Copyright Year: 2017
DOI: https://doi.org/10.1007/978-3-319-54184-6_14
