Skip to main content
main-content
Top

Hint

Swipe to navigate through the articles of this issue

Published in: International Journal of Computer Vision 6/2020

22-01-2020

Discriminative Training of Conditional Random Fields with Probably Submodular Constraints

Authors: Maxim Berman, Matthew B. Blaschko

Published in: International Journal of Computer Vision | Issue 6/2020

Login to get access
share
SHARE

Abstract

Problems of segmentation, denoising, registration and 3D reconstruction are often addressed with the graph cut algorithm. However, solving an unconstrained graph cut problem is NP-hard. For tractable optimization, pairwise potentials have to fulfill the submodularity inequality. In our learning paradigm, pairwise potentials are created as the dot product of a learned vector w with positive feature vectors. In order to constrain such a model to remain tractable, previous approaches have enforced the weight vector to be positive for pairwise potentials in which the labels differ, and set pairwise potentials to zero in the case that the label remains the same. Such constraints are sufficient to guarantee that the resulting pairwise potentials satisfy the submodularity inequality. However, we show that such an approach unnecessarily restricts the capacity of the learned models. Guaranteeing submodularity for all possible inputs, no matter how improbable, reduces inference error to effectively zero, but increases model error. In contrast, we relax the requirement of guaranteed submodularity to solutions that are probably approximately submodular. We show that the conceptually simple strategy of enforcing submodularity on the training examples guarantees with low sample complexity that test images will also yield submodular pairwise potentials. Results are presented in the binary and muticlass settings, showing substantial improvement from the resulting increased model capacity.

To get access to this content you need the following product:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 69.000 Bücher
  • über 500 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Testen Sie jetzt 15 Tage kostenlos.

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 58.000 Bücher
  • über 300 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Testen Sie jetzt 15 Tage kostenlos.

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 50.000 Bücher
  • über 380 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




Testen Sie jetzt 15 Tage kostenlos.

Footnotes
1
In the following, we write \(\varvec{\phi }_u(x^k)\) and \(\varvec{\phi }_p(x^k,x^l)\) as a shorthand for \(\varvec{\phi }_u^{k}(x^k,x^l)\) and \(\varvec{\phi }_p^{k,l}(x^k,x^l)\).
 
Literature
go back to reference Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., & Susstrunk, S. (2012). SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2274–2282. CrossRef Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., & Susstrunk, S. (2012). SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2274–2282. CrossRef
go back to reference Anguelov, D., Taskar, B., Chatalbashev, V., Koller, D., Gupta, D., Heitz, G., et al. (2005). Discriminative learning of Markov random fields for segmentation of 3D scan data. In Proceedings of the IEEE conference on computer vision and pattern recognition (Vol. 2, pp. 169–176). Anguelov, D., Taskar, B., Chatalbashev, V., Koller, D., Gupta, D., Heitz, G., et al. (2005). Discriminative learning of Markov random fields for segmentation of 3D scan data. In Proceedings of the IEEE conference on computer vision and pattern recognition (Vol. 2, pp. 169–176).
go back to reference Bakır, G. H., Hofmann, T., Schölkopf, B., Smola, A. J., Taskar, B., & Vishwanathan, S. V. N. (2007). Predicting structured data. Cambridge: The MIT Press. CrossRef Bakır, G. H., Hofmann, T., Schölkopf, B., Smola, A. J., Taskar, B., & Vishwanathan, S. V. N. (2007). Predicting structured data. Cambridge: The MIT Press. CrossRef
go back to reference Barahona, F. (1982). On the computational complexity of ising spin glass models. Journal of Physics A: Mathematical and General, 15(10), 3241. MathSciNetCrossRef Barahona, F. (1982). On the computational complexity of ising spin glass models. Journal of Physics A: Mathematical and General, 15(10), 3241. MathSciNetCrossRef
go back to reference Bartlett, P. L., Jordan, M. I., & McAuliffe, J. D. (2006). Convexity, classification, and risk bounds. Journal of the American Statistical Association, 101(473), 138–156. MathSciNetCrossRef Bartlett, P. L., Jordan, M. I., & McAuliffe, J. D. (2006). Convexity, classification, and risk bounds. Journal of the American Statistical Association, 101(473), 138–156. MathSciNetCrossRef
go back to reference Billingsley, P. (1995). Probability and measure. New York: Wiley. MATH Billingsley, P. (1995). Probability and measure. New York: Wiley. MATH
go back to reference Boykov, Y., & Jolly, M. P. (2001). Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In ICCV (pp 105–112). Boykov, Y., & Jolly, M. P. (2001). Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In ICCV (pp 105–112).
go back to reference Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239. CrossRef Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239. CrossRef
go back to reference Chandra, S., & Kokkinos, I. (2016). Fast, exact and multi-scale inference for semantic image segmentation with deep Gaussian CRFs. In European conference on computer vision. Chandra, S., & Kokkinos, I. (2016). Fast, exact and multi-scale inference for semantic image segmentation with deep Gaussian CRFs. In European conference on computer vision.
go back to reference Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2017). Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834–848. CrossRef Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2017). Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834–848. CrossRef
go back to reference Chen, L. C., Schwing, A. G., Yuille, A. L., & Urtasun, R. (2015). Learning deep structured models. In Proceedings of ICML. Chen, L. C., Schwing, A. G., Yuille, A. L., & Urtasun, R. (2015). Learning deep structured models. In Proceedings of ICML.
go back to reference Cheng, H. C., & Varshney, A. (2017). Volume segmentation using convolutional neural networks with limited training data. In 2017 IEEE international conference on image processing (ICIP) (pp. 590–594). Cheng, H. C., & Varshney, A. (2017). Volume segmentation using convolutional neural networks with limited training data. In 2017 IEEE international conference on image processing (ICIP) (pp. 590–594).
go back to reference Elisseeff, A., & Weston, J. (2001). A kernel method for multi-labelled classification. In NIPS. Elisseeff, A., & Weston, J. (2001). A kernel method for multi-labelled classification. In NIPS.
go back to reference Felzenszwalb, P. F., Girshick, R. B., & McAllester, D. (2010). Cascade object detection with deformable part models. In CVPR. Felzenszwalb, P. F., Girshick, R. B., & McAllester, D. (2010). Cascade object detection with deformable part models. In CVPR.
go back to reference Finley, T., & Joachims, T. (2008). Training structural SVMs when exact inference is intractable. In ICML. Finley, T., & Joachims, T. (2008). Training structural SVMs when exact inference is intractable. In ICML.
go back to reference Gelasca, E. D., Byun, J., Obara, B., & Manjunath, B. (2008). Evaluation and benchmark for biological image segmentation. In 15th IEEE international conference on image processing, 2008. ICIP 2008 (pp. 1816–1819). IEEE. Gelasca, E. D., Byun, J., Obara, B., & Manjunath, B. (2008). Evaluation and benchmark for biological image segmentation. In 15th IEEE international conference on image processing, 2008. ICIP 2008 (pp. 1816–1819). IEEE.
go back to reference Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6), 721–741. CrossRef Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6), 721–741. CrossRef
go back to reference Gjorgjevikj, D., & Madjarov, G. (2011). Two stage classifier chain architecture for efficient pair-wise multi-label learning. In 2011 IEEE international workshop on machine learning for signal processing (pp 1–6). Gjorgjevikj, D., & Madjarov, G. (2011). Two stage classifier chain architecture for efficient pair-wise multi-label learning. In 2011 IEEE international workshop on machine learning for signal processing (pp 1–6).
go back to reference Haralick, R. M., Shanmugam, K. S., & Dinstein, I. (1973). Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics, 3, 610–621. CrossRef Haralick, R. M., Shanmugam, K. S., & Dinstein, I. (1973). Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics, 3, 610–621. CrossRef
go back to reference Kolmogorov, V. (2005). Convergent tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 1568–1583. CrossRef Kolmogorov, V. (2005). Convergent tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 1568–1583. CrossRef
go back to reference Kolmogorov, V., & Zabih, R. (2004). What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2), 147–159. CrossRef Kolmogorov, V., & Zabih, R. (2004). What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2), 147–159. CrossRef
go back to reference Lafferty, J. D., McCallum, A., & Pereira, F. C. N. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the eighteenth international conference on machine learning (pp 282–289). Lafferty, J. D., McCallum, A., & Pereira, F. C. N. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the eighteenth international conference on machine learning (pp 282–289).
go back to reference Leibe, B., Leonardis, A., & Schiele, B. (2004). Combined object categorization and segmentation with an implicit shape model. In Workshop on statistical learning in computer vision, ECCV. Leibe, B., Leonardis, A., & Schiele, B. (2004). Combined object categorization and segmentation with an implicit shape model. In Workshop on statistical learning in computer vision, ECCV.
go back to reference Lucchi, A., Márquez-Neila, P., Becker, C., Li, Y., Smith, K., Knott, G., et al. (2015). Learning structured models for segmentation of 2-d and 3-d imagery. IEEE Transactions on Medical Imaging, 34(5), 1096–1110. CrossRef Lucchi, A., Márquez-Neila, P., Becker, C., Li, Y., Smith, K., Knott, G., et al. (2015). Learning structured models for segmentation of 2-d and 3-d imagery. IEEE Transactions on Medical Imaging, 34(5), 1096–1110. CrossRef
go back to reference Lucchi, A., Smith, K., Achanta, R., Knott, G., & Fua, P. (2012). Supervoxel-based segmentation of mitochondria in EM image stacks with learned shape features. IEEE Transactions on Medical Imaging, 31(2), 474–486. CrossRef Lucchi, A., Smith, K., Achanta, R., Knott, G., & Fua, P. (2012). Supervoxel-based segmentation of mitochondria in EM image stacks with learned shape features. IEEE Transactions on Medical Imaging, 31(2), 474–486. CrossRef
go back to reference Magnus, J. R., & Neudecker, H. (1995). Matrix differential calculus with applications in statistics and econometrics. New York: Wiley. MATH Magnus, J. R., & Neudecker, H. (1995). Matrix differential calculus with applications in statistics and econometrics. New York: Wiley. MATH
go back to reference Metzen, J. H., Kumar, M. C., Brox, T., & Fischer, V. (2017). Universal adversarial perturbations against semantic image segmentation. In The IEEE international conference on computer vision (ICCV). Metzen, J. H., Kumar, M. C., Brox, T., & Fischer, V. (2017). Universal adversarial perturbations against semantic image segmentation. In The IEEE international conference on computer vision (ICCV).
go back to reference Müller, A. C., & Behnke, S. (2014). PyStruct—Learning structured prediction in python. Journal of Machine Learning Research, 15, 2055–2060. MathSciNetMATH Müller, A. C., & Behnke, S. (2014). PyStruct—Learning structured prediction in python. Journal of Machine Learning Research, 15, 2055–2060. MathSciNetMATH
go back to reference Nowozin, S., Gehler, P. V., & Lampert, C. H. (2010). On parameter learning in CRF-based approaches to object class image segmentation. ECCV, 6, 98–111. Nowozin, S., Gehler, P. V., & Lampert, C. H. (2010). On parameter learning in CRF-based approaches to object class image segmentation. ECCV, 6, 98–111.
go back to reference Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. In N. Navab, J. Hornegger, W. M. Wells, & A. F. Frangi (Eds.), Medical image computing and computer-assisted intervention (pp. 234–241). Berlin: Springer. Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. In N. Navab, J. Hornegger, W. M. Wells, & A. F. Frangi (Eds.), Medical image computing and computer-assisted intervention (pp. 234–241). Berlin: Springer.
go back to reference Rother, C., Kolmogorov, V., Lempitsky, V., & Szummer, M. (2007). Optimizing binary MRFs via extended roof duality. In 2007 IEEE conference on computer vision and pattern recognition (pp. 1–8). IEEE. Rother, C., Kolmogorov, V., Lempitsky, V., & Szummer, M. (2007). Optimizing binary MRFs via extended roof duality. In 2007 IEEE conference on computer vision and pattern recognition (pp. 1–8). IEEE.
go back to reference Rother, C., Kumar, S., Kolmogorov, V., & Blake, A. (2005). Digital tapestry [automatic image synthesis]. In IEEE computer society conference on computer vision and pattern recognition, 2005. CVPR 2005 (Vol. 1, pp. 589–596). IEEE. Rother, C., Kumar, S., Kolmogorov, V., & Blake, A. (2005). Digital tapestry [automatic image synthesis]. In IEEE computer society conference on computer vision and pattern recognition, 2005. CVPR 2005 (Vol. 1, pp. 589–596). IEEE.
go back to reference Szummer, M., Kohli, P., & Hoiem, D. (2008). Learning CRFs using graph cuts. In D. A. Forsyth, P. H. S. Torr, & A. Zisserman (Eds.) European conference on computer vision. Lecture notes in computer science (Vol. 5303, pp. 582–595). Berlin: Springer. Szummer, M., Kohli, P., & Hoiem, D. (2008). Learning CRFs using graph cuts. In D. A. Forsyth, P. H. S. Torr, & A. Zisserman (Eds.) European conference on computer vision. Lecture notes in computer science (Vol. 5303, pp. 582–595). Berlin: Springer.
go back to reference Tsochantaridis, I., Joachims, T., Hofmann, T., & Altun, Y. (2005). Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research, 6, 1453–1484. MathSciNetMATH Tsochantaridis, I., Joachims, T., Hofmann, T., & Altun, Y. (2005). Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research, 6, 1453–1484. MathSciNetMATH
go back to reference Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27(11), 1134–1142. CrossRef Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27(11), 1134–1142. CrossRef
go back to reference Vapnik, V. N. (1998). Statistical learning theory. New York: Wiley. MATH Vapnik, V. N. (1998). Statistical learning theory. New York: Wiley. MATH
go back to reference Zaremba, W., & Blaschko, M. B. (2016). Discriminative training of CRF models with probably submodular constraints. In IEEE winter conference on applications of computer vision. Zaremba, W., & Blaschko, M. B. (2016). Discriminative training of CRF models with probably submodular constraints. In IEEE winter conference on applications of computer vision.
Metadata
Title
Discriminative Training of Conditional Random Fields with Probably Submodular Constraints
Authors
Maxim Berman
Matthew B. Blaschko
Publication date
22-01-2020
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 6/2020
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-019-01277-y

Other articles of this Issue 6/2020

International Journal of Computer Vision 6/2020 Go to the issue

Premium Partner