Skip to main content
Erschienen in: International Journal of Computer Vision 6/2020

22.01.2020

Discriminative Training of Conditional Random Fields with Probably Submodular Constraints

verfasst von: Maxim Berman, Matthew B. Blaschko

Erschienen in: International Journal of Computer Vision | Ausgabe 6/2020

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Problems of segmentation, denoising, registration and 3D reconstruction are often addressed with the graph cut algorithm. However, solving an unconstrained graph cut problem is NP-hard. For tractable optimization, pairwise potentials have to fulfill the submodularity inequality. In our learning paradigm, pairwise potentials are created as the dot product of a learned vector w with positive feature vectors. In order to constrain such a model to remain tractable, previous approaches have enforced the weight vector to be positive for pairwise potentials in which the labels differ, and set pairwise potentials to zero in the case that the label remains the same. Such constraints are sufficient to guarantee that the resulting pairwise potentials satisfy the submodularity inequality. However, we show that such an approach unnecessarily restricts the capacity of the learned models. Guaranteeing submodularity for all possible inputs, no matter how improbable, reduces inference error to effectively zero, but increases model error. In contrast, we relax the requirement of guaranteed submodularity to solutions that are probably approximately submodular. We show that the conceptually simple strategy of enforcing submodularity on the training examples guarantees with low sample complexity that test images will also yield submodular pairwise potentials. Results are presented in the binary and muticlass settings, showing substantial improvement from the resulting increased model capacity.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Fußnoten
1
In the following, we write \(\varvec{\phi }_u(x^k)\) and \(\varvec{\phi }_p(x^k,x^l)\) as a shorthand for \(\varvec{\phi }_u^{k}(x^k,x^l)\) and \(\varvec{\phi }_p^{k,l}(x^k,x^l)\).
 
Literatur
Zurück zum Zitat Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., & Susstrunk, S. (2012). SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2274–2282.CrossRef Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., & Susstrunk, S. (2012). SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2274–2282.CrossRef
Zurück zum Zitat Anguelov, D., Taskar, B., Chatalbashev, V., Koller, D., Gupta, D., Heitz, G., et al. (2005). Discriminative learning of Markov random fields for segmentation of 3D scan data. In Proceedings of the IEEE conference on computer vision and pattern recognition (Vol. 2, pp. 169–176). Anguelov, D., Taskar, B., Chatalbashev, V., Koller, D., Gupta, D., Heitz, G., et al. (2005). Discriminative learning of Markov random fields for segmentation of 3D scan data. In Proceedings of the IEEE conference on computer vision and pattern recognition (Vol. 2, pp. 169–176).
Zurück zum Zitat Bakır, G. H., Hofmann, T., Schölkopf, B., Smola, A. J., Taskar, B., & Vishwanathan, S. V. N. (2007). Predicting structured data. Cambridge: The MIT Press.CrossRef Bakır, G. H., Hofmann, T., Schölkopf, B., Smola, A. J., Taskar, B., & Vishwanathan, S. V. N. (2007). Predicting structured data. Cambridge: The MIT Press.CrossRef
Zurück zum Zitat Barahona, F. (1982). On the computational complexity of ising spin glass models. Journal of Physics A: Mathematical and General, 15(10), 3241.MathSciNetCrossRef Barahona, F. (1982). On the computational complexity of ising spin glass models. Journal of Physics A: Mathematical and General, 15(10), 3241.MathSciNetCrossRef
Zurück zum Zitat Bartlett, P. L., Jordan, M. I., & McAuliffe, J. D. (2006). Convexity, classification, and risk bounds. Journal of the American Statistical Association, 101(473), 138–156.MathSciNetCrossRef Bartlett, P. L., Jordan, M. I., & McAuliffe, J. D. (2006). Convexity, classification, and risk bounds. Journal of the American Statistical Association, 101(473), 138–156.MathSciNetCrossRef
Zurück zum Zitat Billingsley, P. (1995). Probability and measure. New York: Wiley.MATH Billingsley, P. (1995). Probability and measure. New York: Wiley.MATH
Zurück zum Zitat Boykov, Y., & Jolly, M. P. (2001). Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In ICCV (pp 105–112). Boykov, Y., & Jolly, M. P. (2001). Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In ICCV (pp 105–112).
Zurück zum Zitat Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239.CrossRef Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239.CrossRef
Zurück zum Zitat Casser, V., Kang, K. C., Pfister, H., & Haehn, D. (2018). Fast mitochondria segmentation for connectomics. CoRR arXiv:1812.06024. Casser, V., Kang, K. C., Pfister, H., & Haehn, D. (2018). Fast mitochondria segmentation for connectomics. CoRR arXiv:​1812.​06024.
Zurück zum Zitat Chandra, S., & Kokkinos, I. (2016). Fast, exact and multi-scale inference for semantic image segmentation with deep Gaussian CRFs. In European conference on computer vision. Chandra, S., & Kokkinos, I. (2016). Fast, exact and multi-scale inference for semantic image segmentation with deep Gaussian CRFs. In European conference on computer vision.
Zurück zum Zitat Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2017). Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834–848.CrossRef Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2017). Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834–848.CrossRef
Zurück zum Zitat Chen, L. C., Schwing, A. G., Yuille, A. L., & Urtasun, R. (2015). Learning deep structured models. In Proceedings of ICML. Chen, L. C., Schwing, A. G., Yuille, A. L., & Urtasun, R. (2015). Learning deep structured models. In Proceedings of ICML.
Zurück zum Zitat Cheng, H. C., & Varshney, A. (2017). Volume segmentation using convolutional neural networks with limited training data. In 2017 IEEE international conference on image processing (ICIP) (pp. 590–594). Cheng, H. C., & Varshney, A. (2017). Volume segmentation using convolutional neural networks with limited training data. In 2017 IEEE international conference on image processing (ICIP) (pp. 590–594).
Zurück zum Zitat Elisseeff, A., & Weston, J. (2001). A kernel method for multi-labelled classification. In NIPS. Elisseeff, A., & Weston, J. (2001). A kernel method for multi-labelled classification. In NIPS.
Zurück zum Zitat Felzenszwalb, P. F., Girshick, R. B., & McAllester, D. (2010). Cascade object detection with deformable part models. In CVPR. Felzenszwalb, P. F., Girshick, R. B., & McAllester, D. (2010). Cascade object detection with deformable part models. In CVPR.
Zurück zum Zitat Finley, T., & Joachims, T. (2008). Training structural SVMs when exact inference is intractable. In ICML. Finley, T., & Joachims, T. (2008). Training structural SVMs when exact inference is intractable. In ICML.
Zurück zum Zitat Gelasca, E. D., Byun, J., Obara, B., & Manjunath, B. (2008). Evaluation and benchmark for biological image segmentation. In 15th IEEE international conference on image processing, 2008. ICIP 2008 (pp. 1816–1819). IEEE. Gelasca, E. D., Byun, J., Obara, B., & Manjunath, B. (2008). Evaluation and benchmark for biological image segmentation. In 15th IEEE international conference on image processing, 2008. ICIP 2008 (pp. 1816–1819). IEEE.
Zurück zum Zitat Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6), 721–741.CrossRef Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6), 721–741.CrossRef
Zurück zum Zitat Gjorgjevikj, D., & Madjarov, G. (2011). Two stage classifier chain architecture for efficient pair-wise multi-label learning. In 2011 IEEE international workshop on machine learning for signal processing (pp 1–6). Gjorgjevikj, D., & Madjarov, G. (2011). Two stage classifier chain architecture for efficient pair-wise multi-label learning. In 2011 IEEE international workshop on machine learning for signal processing (pp 1–6).
Zurück zum Zitat Haralick, R. M., Shanmugam, K. S., & Dinstein, I. (1973). Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics, 3, 610–621.CrossRef Haralick, R. M., Shanmugam, K. S., & Dinstein, I. (1973). Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics, 3, 610–621.CrossRef
Zurück zum Zitat Kolmogorov, V. (2005). Convergent tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 1568–1583.CrossRef Kolmogorov, V. (2005). Convergent tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 1568–1583.CrossRef
Zurück zum Zitat Kolmogorov, V., & Zabih, R. (2004). What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2), 147–159.CrossRef Kolmogorov, V., & Zabih, R. (2004). What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2), 147–159.CrossRef
Zurück zum Zitat Lafferty, J. D., McCallum, A., & Pereira, F. C. N. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the eighteenth international conference on machine learning (pp 282–289). Lafferty, J. D., McCallum, A., & Pereira, F. C. N. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the eighteenth international conference on machine learning (pp 282–289).
Zurück zum Zitat Leibe, B., Leonardis, A., & Schiele, B. (2004). Combined object categorization and segmentation with an implicit shape model. In Workshop on statistical learning in computer vision, ECCV. Leibe, B., Leonardis, A., & Schiele, B. (2004). Combined object categorization and segmentation with an implicit shape model. In Workshop on statistical learning in computer vision, ECCV.
Zurück zum Zitat Lucchi, A., Márquez-Neila, P., Becker, C., Li, Y., Smith, K., Knott, G., et al. (2015). Learning structured models for segmentation of 2-d and 3-d imagery. IEEE Transactions on Medical Imaging, 34(5), 1096–1110.CrossRef Lucchi, A., Márquez-Neila, P., Becker, C., Li, Y., Smith, K., Knott, G., et al. (2015). Learning structured models for segmentation of 2-d and 3-d imagery. IEEE Transactions on Medical Imaging, 34(5), 1096–1110.CrossRef
Zurück zum Zitat Lucchi, A., Smith, K., Achanta, R., Knott, G., & Fua, P. (2012). Supervoxel-based segmentation of mitochondria in EM image stacks with learned shape features. IEEE Transactions on Medical Imaging, 31(2), 474–486.CrossRef Lucchi, A., Smith, K., Achanta, R., Knott, G., & Fua, P. (2012). Supervoxel-based segmentation of mitochondria in EM image stacks with learned shape features. IEEE Transactions on Medical Imaging, 31(2), 474–486.CrossRef
Zurück zum Zitat Magnus, J. R., & Neudecker, H. (1995). Matrix differential calculus with applications in statistics and econometrics. New York: Wiley.MATH Magnus, J. R., & Neudecker, H. (1995). Matrix differential calculus with applications in statistics and econometrics. New York: Wiley.MATH
Zurück zum Zitat Metzen, J. H., Kumar, M. C., Brox, T., & Fischer, V. (2017). Universal adversarial perturbations against semantic image segmentation. In The IEEE international conference on computer vision (ICCV). Metzen, J. H., Kumar, M. C., Brox, T., & Fischer, V. (2017). Universal adversarial perturbations against semantic image segmentation. In The IEEE international conference on computer vision (ICCV).
Zurück zum Zitat Müller, A. C., & Behnke, S. (2014). PyStruct—Learning structured prediction in python. Journal of Machine Learning Research, 15, 2055–2060.MathSciNetMATH Müller, A. C., & Behnke, S. (2014). PyStruct—Learning structured prediction in python. Journal of Machine Learning Research, 15, 2055–2060.MathSciNetMATH
Zurück zum Zitat Nowozin, S., Gehler, P. V., & Lampert, C. H. (2010). On parameter learning in CRF-based approaches to object class image segmentation. ECCV, 6, 98–111. Nowozin, S., Gehler, P. V., & Lampert, C. H. (2010). On parameter learning in CRF-based approaches to object class image segmentation. ECCV, 6, 98–111.
Zurück zum Zitat Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. In N. Navab, J. Hornegger, W. M. Wells, & A. F. Frangi (Eds.), Medical image computing and computer-assisted intervention (pp. 234–241). Berlin: Springer. Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. In N. Navab, J. Hornegger, W. M. Wells, & A. F. Frangi (Eds.), Medical image computing and computer-assisted intervention (pp. 234–241). Berlin: Springer.
Zurück zum Zitat Rother, C., Kolmogorov, V., Lempitsky, V., & Szummer, M. (2007). Optimizing binary MRFs via extended roof duality. In 2007 IEEE conference on computer vision and pattern recognition (pp. 1–8). IEEE. Rother, C., Kolmogorov, V., Lempitsky, V., & Szummer, M. (2007). Optimizing binary MRFs via extended roof duality. In 2007 IEEE conference on computer vision and pattern recognition (pp. 1–8). IEEE.
Zurück zum Zitat Rother, C., Kumar, S., Kolmogorov, V., & Blake, A. (2005). Digital tapestry [automatic image synthesis]. In IEEE computer society conference on computer vision and pattern recognition, 2005. CVPR 2005 (Vol. 1, pp. 589–596). IEEE. Rother, C., Kumar, S., Kolmogorov, V., & Blake, A. (2005). Digital tapestry [automatic image synthesis]. In IEEE computer society conference on computer vision and pattern recognition, 2005. CVPR 2005 (Vol. 1, pp. 589–596). IEEE.
Zurück zum Zitat Szummer, M., Kohli, P., & Hoiem, D. (2008). Learning CRFs using graph cuts. In D. A. Forsyth, P. H. S. Torr, & A. Zisserman (Eds.) European conference on computer vision. Lecture notes in computer science (Vol. 5303, pp. 582–595). Berlin: Springer. Szummer, M., Kohli, P., & Hoiem, D. (2008). Learning CRFs using graph cuts. In D. A. Forsyth, P. H. S. Torr, & A. Zisserman (Eds.) European conference on computer vision. Lecture notes in computer science (Vol. 5303, pp. 582–595). Berlin: Springer.
Zurück zum Zitat Tsochantaridis, I., Joachims, T., Hofmann, T., & Altun, Y. (2005). Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research, 6, 1453–1484.MathSciNetMATH Tsochantaridis, I., Joachims, T., Hofmann, T., & Altun, Y. (2005). Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research, 6, 1453–1484.MathSciNetMATH
Zurück zum Zitat Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27(11), 1134–1142.CrossRef Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27(11), 1134–1142.CrossRef
Zurück zum Zitat Vapnik, V. N. (1998). Statistical learning theory. New York: Wiley.MATH Vapnik, V. N. (1998). Statistical learning theory. New York: Wiley.MATH
Zurück zum Zitat Zaremba, W., & Blaschko, M. B. (2016). Discriminative training of CRF models with probably submodular constraints. In IEEE winter conference on applications of computer vision. Zaremba, W., & Blaschko, M. B. (2016). Discriminative training of CRF models with probably submodular constraints. In IEEE winter conference on applications of computer vision.
Metadaten
Titel
Discriminative Training of Conditional Random Fields with Probably Submodular Constraints
verfasst von
Maxim Berman
Matthew B. Blaschko
Publikationsdatum
22.01.2020
Verlag
Springer US
Erschienen in
International Journal of Computer Vision / Ausgabe 6/2020
Print ISSN: 0920-5691
Elektronische ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-019-01277-y

Weitere Artikel der Ausgabe 6/2020

International Journal of Computer Vision 6/2020 Zur Ausgabe