2015 | Original Paper | Book Chapter

Joint Visual Phrase Detection to Boost Scene Parsing

Authors: Keke Tang, Zhe Zhao, Xiaoping Chen

Published in: Advances in Visual Computing

Publisher: Springer International Publishing

Abstract

Scene parsing is a challenging problem that attracts increasing interest in fields such as computer vision and robotics. However, occluded or small objects, which are difficult to parse, are often ignored. To address these two problems, we integrate visual phrases, which have been shown to describe relationships between objects well, into a joint system. In this paper, we propose a joint model that integrates scene classification, object and visual phrase detection, and scene parsing. By encoding these tasks in a Conditional Random Field (CRF) model, all of them can be solved jointly. We evaluate our method on the MSRC-21 dataset. The experimental results demonstrate that our method achieves comparable, and in some cases superior, performance relative to state-of-the-art joint methods, especially when partially occluded or small objects are present.
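To illustrate what solving these tasks jointly in a CRF can mean, the following is a minimal toy sketch, not the authors' model. It assumes one scene variable, two segment labels, and one binary visual phrase variable ("car on road"). All names, scores, and the compatibility function are hypothetical, and inference is done by brute-force enumeration rather than by any method used in the chapter. Scores here are negative energies, so higher is better.

```python
from itertools import product

# Hypothetical label sets for a toy problem (not taken from the chapter).
SCENES = ["street", "field"]
LABELS = ["road", "grass", "car"]

# Hypothetical unary scores (negative energies): higher means more likely.
SCENE_SCORE = {"street": 0.2, "field": 0.3}
SEGMENT_SCORES = [
    {"road": 0.50, "grass": 0.40, "car": 0.10},  # segment 0: large region
    {"road": 0.30, "grass": 0.35, "car": 0.25},  # segment 1: small, ambiguous
]
PHRASE_SCORE = {0: 0.0, 1: 0.4}  # detector response for the phrase "car on road"


def compatibility(scene, labels, phrase):
    """Hypothetical higher-order terms that tie the variables together."""
    score = 0.0
    # Scene-label term: streets favour road/car, fields favour grass.
    if scene == "street":
        score += 0.3 * sum(label in ("road", "car") for label in labels)
    else:
        score += 0.3 * sum(label == "grass" for label in labels)
    # Phrase-label term: an active phrase must be supported by both labels.
    if phrase == 1:
        score += 0.5 if ("car" in labels and "road" in labels) else -1.0
    return score


def total_score(scene, labels, phrase):
    """Sum of all unary terms plus the compatibility terms."""
    unary = SCENE_SCORE[scene] + PHRASE_SCORE[phrase]
    unary += sum(scores[label] for scores, label in zip(SEGMENT_SCORES, labels))
    return unary + compatibility(scene, labels, phrase)


def map_independent():
    """Pick each variable from its own unary score, ignoring interactions."""
    scene = max(SCENES, key=SCENE_SCORE.get)
    labels = tuple(max(LABELS, key=scores.get) for scores in SEGMENT_SCORES)
    phrase = max(PHRASE_SCORE, key=PHRASE_SCORE.get)
    return scene, labels, phrase


def map_joint():
    """Exhaustively search all joint assignments for the best total score."""
    candidates = product(SCENES, product(LABELS, repeat=len(SEGMENT_SCORES)), (0, 1))
    return max(candidates, key=lambda assignment: total_score(*assignment))


if __name__ == "__main__":
    print("independent:", map_independent())
    print("joint:      ", map_joint())
```

In this toy setting, the independent decisions label the small segment as grass, whereas the joint assignment uses the phrase response to relabel it as car and switches the scene to street. Exhaustive search is only feasible because the example is tiny; realistic models need approximate inference.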

Metadata
Title
Joint Visual Phrase Detection to Boost Scene Parsing
Authors
Keke Tang
Zhe Zhao
Xiaoping Chen
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-27863-6_36