nach oben

International Journal of Computer Vision

Erschienen in:

22.01.2020

Discriminative Training of Conditional Random Fields with Probably Submodular Constraints

verfasst von: Maxim Berman, Matthew B. Blaschko

Erschienen in: International Journal of Computer Vision | Ausgabe 6/2020

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Problems of segmentation, denoising, registration and 3D reconstruction are often addressed with the graph cut algorithm. However, solving an unconstrained graph cut problem is NP-hard. For tractable optimization, pairwise potentials have to fulfill the submodularity inequality. In our learning paradigm, pairwise potentials are created as the dot product of a learned vector w with positive feature vectors. In order to constrain such a model to remain tractable, previous approaches have enforced the weight vector to be positive for pairwise potentials in which the labels differ, and set pairwise potentials to zero in the case that the label remains the same. Such constraints are sufficient to guarantee that the resulting pairwise potentials satisfy the submodularity inequality. However, we show that such an approach unnecessarily restricts the capacity of the learned models. Guaranteeing submodularity for all possible inputs, no matter how improbable, reduces inference error to effectively zero, but increases model error. In contrast, we relax the requirement of guaranteed submodularity to solutions that are probably approximately submodular. We show that the conceptually simple strategy of enforcing submodularity on the training examples guarantees with low sample complexity that test images will also yield submodular pairwise potentials. Results are presented in the binary and muticlass settings, showing substantial improvement from the resulting increased model capacity.

Vorheriger Artikel Gated Fusion Network for Degraded Image Super Resolution

Nächster Artikel Weakly-Supervised Semantic Segmentation by Iterative Affinity Learning

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

In the following, we write \(\varvec{\phi }_u(x^k)\) and \(\varvec{\phi }_p(x^k,x^l)\) as a shorthand for \(\varvec{\phi }_u^{k}(x^k,x^l)\) and \(\varvec{\phi }_p^{k,l}(x^k,x^l)\).

Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., & Susstrunk, S. (2012). SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2274–2282.CrossRef

Anguelov, D., Taskar, B., Chatalbashev, V., Koller, D., Gupta, D., Heitz, G., et al. (2005). Discriminative learning of Markov random fields for segmentation of 3D scan data. In Proceedings of the IEEE conference on computer vision and pattern recognition (Vol. 2, pp. 169–176).

Bakır, G. H., Hofmann, T., Schölkopf, B., Smola, A. J., Taskar, B., & Vishwanathan, S. V. N. (2007). Predicting structured data. Cambridge: The MIT Press.CrossRef

Barahona, F. (1982). On the computational complexity of ising spin glass models. Journal of Physics A: Mathematical and General, 15(10), 3241.MathSciNetCrossRef

Bartlett, P. L., Jordan, M. I., & McAuliffe, J. D. (2006). Convexity, classification, and risk bounds. Journal of the American Statistical Association, 101(473), 138–156.MathSciNetCrossRef

Billingsley, P. (1995). Probability and measure. New York: Wiley.MATH

Boykov, Y., & Jolly, M. P. (2001). Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In ICCV (pp 105–112).

Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239.CrossRef

Casser, V., Kang, K. C., Pfister, H., & Haehn, D. (2018). Fast mitochondria segmentation for connectomics. CoRR arXiv:1812.06024.

Chandra, S., & Kokkinos, I. (2016). Fast, exact and multi-scale inference for semantic image segmentation with deep Gaussian CRFs. In European conference on computer vision.

Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2017). Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834–848.CrossRef

Chen, L. C., Schwing, A. G., Yuille, A. L., & Urtasun, R. (2015). Learning deep structured models. In Proceedings of ICML.

Cheng, H. C., & Varshney, A. (2017). Volume segmentation using convolutional neural networks with limited training data. In 2017 IEEE international conference on image processing (ICIP) (pp. 590–594).

Elisseeff, A., & Weston, J. (2001). A kernel method for multi-labelled classification. In NIPS.

Felzenszwalb, P. F., Girshick, R. B., & McAllester, D. (2010). Cascade object detection with deformable part models. In CVPR.

Finley, T., & Joachims, T. (2008). Training structural SVMs when exact inference is intractable. In ICML.

Gelasca, E. D., Byun, J., Obara, B., & Manjunath, B. (2008). Evaluation and benchmark for biological image segmentation. In 15th IEEE international conference on image processing, 2008. ICIP 2008 (pp. 1816–1819). IEEE.

Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6), 721–741.CrossRef

Gjorgjevikj, D., & Madjarov, G. (2011). Two stage classifier chain architecture for efficient pair-wise multi-label learning. In 2011 IEEE international workshop on machine learning for signal processing (pp 1–6).

Haralick, R. M., Shanmugam, K. S., & Dinstein, I. (1973). Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics, 3, 610–621.CrossRef

Har-Peled, S. (2011). On the expected complexity of random convex hulls. CoRR arXiv:1111.5340.

Joachims, T., Finley, T., & Yu, C. N. J. (2009). Cutting-plane training of structural SVMs. Machine Learning, 77(1), 27–59. https://doi.org/10.1007/s10994-009-5108-8.CrossRefMATH

Kolmogorov, V. (2005). Convergent tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 1568–1583.CrossRef

Kolmogorov, V., & Zabih, R. (2004). What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2), 147–159.CrossRef

Lafferty, J. D., McCallum, A., & Pereira, F. C. N. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the eighteenth international conference on machine learning (pp 282–289).

Leibe, B., Leonardis, A., & Schiele, B. (2004). Combined object categorization and segmentation with an implicit shape model. In Workshop on statistical learning in computer vision, ECCV.

Lucchi, A., Márquez-Neila, P., Becker, C., Li, Y., Smith, K., Knott, G., et al. (2015). Learning structured models for segmentation of 2-d and 3-d imagery. IEEE Transactions on Medical Imaging, 34(5), 1096–1110.CrossRef

Lucchi, A., Smith, K., Achanta, R., Knott, G., & Fua, P. (2012). Supervoxel-based segmentation of mitochondria in EM image stacks with learned shape features. IEEE Transactions on Medical Imaging, 31(2), 474–486.CrossRef

Magnus, J. R., & Neudecker, H. (1995). Matrix differential calculus with applications in statistics and econometrics. New York: Wiley.MATH

Metzen, J. H., Kumar, M. C., Brox, T., & Fischer, V. (2017). Universal adversarial perturbations against semantic image segmentation. In The IEEE international conference on computer vision (ICCV).

Müller, A. C., & Behnke, S. (2014). PyStruct—Learning structured prediction in python. Journal of Machine Learning Research, 15, 2055–2060.MathSciNetMATH

Nowozin, S., Gehler, P. V., & Lampert, C. H. (2010). On parameter learning in CRF-based approaches to object class image segmentation. ECCV, 6, 98–111.

Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. In N. Navab, J. Hornegger, W. M. Wells, & A. F. Frangi (Eds.), Medical image computing and computer-assisted intervention (pp. 234–241). Berlin: Springer.

Rother, C., Kolmogorov, V., Lempitsky, V., & Szummer, M. (2007). Optimizing binary MRFs via extended roof duality. In 2007 IEEE conference on computer vision and pattern recognition (pp. 1–8). IEEE.

Rother, C., Kumar, S., Kolmogorov, V., & Blake, A. (2005). Digital tapestry [automatic image synthesis]. In IEEE computer society conference on computer vision and pattern recognition, 2005. CVPR 2005 (Vol. 1, pp. 589–596). IEEE.

Szummer, M., Kohli, P., & Hoiem, D. (2008). Learning CRFs using graph cuts. In D. A. Forsyth, P. H. S. Torr, & A. Zisserman (Eds.) European conference on computer vision. Lecture notes in computer science (Vol. 5303, pp. 582–595). Berlin: Springer.

Tsochantaridis, I., Joachims, T., Hofmann, T., & Altun, Y. (2005). Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research, 6, 1453–1484.MathSciNetMATH

Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27(11), 1134–1142.CrossRef

Vapnik, V. N. (1998). Statistical learning theory. New York: Wiley.MATH

Zaremba, W., & Blaschko, M. B. (2016). Discriminative training of CRF models with probably submodular constraints. In IEEE winter conference on applications of computer vision.

Titel: Discriminative Training of Conditional Random Fields with Probably Submodular Constraints
verfasst von: Maxim Berman
Matthew B. Blaschko
Publikationsdatum: 22.01.2020
Verlag: Springer US
Erschienen in: International Journal of Computer Vision / Ausgabe 6/2020
Print ISSN: 0920-5691
Elektronische ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-019-01277-y

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Wirtschaft"

Springer Professional "Technik"

Weitere Artikel der Ausgabe 6/2020

Visual Social Relationship Recognition

Gated Fusion Network for Degraded Image Super Resolution

Bottom-Up Scene Text Detection with Markov Clustering Networks

Fine-Grained Person Re-identification

RGB-IR Person Re-identification by Cross-Modality Similarity Preservation

Scalable Person Re-Identification by Harmonious Attention