
2017 | Original Paper | Book Chapter

Joint Training of Generic CNN-CRF Models with Stochastic Optimization

Authors: A. Kirillov, D. Schlesinger, S. Zheng, B. Savchynskyy, P. H. S. Torr, C. Rother

Published in: Computer Vision – ACCV 2016

Publisher: Springer International Publishing


Abstract

We propose a new CNN-CRF end-to-end learning framework based on joint stochastic optimization with respect to both Convolutional Neural Network (CNN) and Conditional Random Field (CRF) parameters. While stochastic gradient descent is a standard technique for CNN training, it has not previously been applied to such joint models. We show that our learning method is (i) general, i.e. it applies to arbitrary CNN and CRF architectures and potential functions; (ii) scalable, i.e. it has a low memory footprint and parallelizes straightforwardly on GPUs; and (iii) easy to implement. Additionally, the unified CNN-CRF optimization approach simplifies a potential hardware implementation. We empirically evaluate our method on the task of semantic labeling of body parts in depth images and show that it compares favorably to competing techniques.
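
The flavor of such a joint stochastic update can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch example (not the authors' implementation), assuming a 4-connected grid CRF whose unary potentials come from a small CNN and whose pairwise term is a single learnable Potts weight; the gradient of the negative log-likelihood is estimated stochastically by contrasting the energy of the ground-truth labeling with the energy of a labeling drawn by one Gibbs sweep (initialized at the ground truth, contrastive-divergence style). All sizes, names, and architectural choices here are illustrative.

    # Minimal sketch (not the authors' code): one joint SGD step for a CNN-CRF.
    # Assumptions: 4-connected grid CRF, CNN-predicted unaries, learnable Potts
    # pairwise weight, likelihood gradient estimated from a single Gibbs sweep.
    import torch
    import torch.nn as nn

    K = 5            # number of labels (hypothetical)
    H, W = 24, 24    # image size (hypothetical)

    cnn = nn.Sequential(                       # toy CNN producing per-pixel unary potentials
        nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, K, 3, padding=1),
    )
    pairwise = nn.Parameter(torch.tensor(1.0))   # Potts penalty for disagreeing neighbors
    opt = torch.optim.SGD(list(cnn.parameters()) + [pairwise], lr=1e-2)

    def energy(unary, labels, w):
        """E(y) = sum of chosen unary potentials + w * number of disagreeing neighbor pairs."""
        u = unary.gather(1, labels.unsqueeze(1)).sum()
        dis_h = (labels[:, :, 1:] != labels[:, :, :-1]).float().sum()
        dis_v = (labels[:, 1:, :] != labels[:, :-1, :]).float().sum()
        return u + w * (dis_h + dis_v)

    @torch.no_grad()
    def gibbs_sweep(unary, labels, w):
        """One in-place Gibbs sampling sweep over all pixels (sequential, for clarity)."""
        B, K_, H_, W_ = unary.shape
        for i in range(H_):
            for j in range(W_):
                e = unary[:, :, i, j].clone()                  # candidate-label energies (B, K)
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H_ and 0 <= nj < W_:
                        neigh = labels[:, ni, nj]
                        mask = torch.ones(B, K_)
                        mask[torch.arange(B), neigh] = 0.0     # no penalty where labels agree
                        e = e + w * mask
                probs = torch.softmax(-e, dim=1)               # Gibbs conditional at pixel (i, j)
                labels[:, i, j] = torch.multinomial(probs, 1).squeeze(1)
        return labels

    # One joint update: the stochastic NLL gradient is grad E(y_gt) - grad E(y_sample).
    x = torch.randn(1, 1, H, W)                  # toy input "depth image"
    y_gt = torch.randint(0, K, (1, H, W))        # toy ground-truth labeling

    unary = cnn(x)                               # CNN-predicted unary potentials
    y_sample = gibbs_sweep(unary.detach(), y_gt.clone(), pairwise.detach())

    loss = energy(unary, y_gt, pairwise) - energy(unary, y_sample, pairwise)
    opt.zero_grad()
    loss.backward()                              # gradient flows into both CNN and CRF parameters
    opt.step()

Because the sampled labeling enters only through its energy, the same update rule applies regardless of the CNN architecture or the form of the potentials, which is the sense in which such a scheme is generic; a practical implementation would use many more (possibly persistent) sampling sweeps and parallel updates on the GPU.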

Footnotes
2
For technical details we use the terminology commonly adopted in the CNN literature, so that our results can be reproduced.
 
Metadata
Title
Joint Training of Generic CNN-CRF Models with Stochastic Optimization
Authors
A. Kirillov
D. Schlesinger
S. Zheng
B. Savchynskyy
P. H. S. Torr
C. Rother
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-54184-6_14