2020 | Original Paper | Book Chapter

Blended Grammar Network for Human Parsing

Authors: Xiaomei Zhang, Yingying Chen, Bingke Zhu, Jinqiao Wang, Ming Tang

Published in: Computer Vision – ECCV 2020

Publisher: Springer International Publishing

Abstract

Although human parsing has made great progress, it still faces a challenge: how to extract the whole foreground effectively from similar or cluttered scenes. In this paper, we propose a Blended Grammar Network (BGNet) to deal with this challenge. BGNet exploits the inherent hierarchical structure of the human body and the relationships among different human parts by means of grammar rules, applied in both a cascaded and a parallel manner. In this way, conspicuous parts, which are easily distinguished from the background, can amend the segmentation of inconspicuous ones, improving foreground extraction. We also design a Part-aware Convolutional Recurrent Neural Network (PCRNN) to pass the messages generated by the grammar rules. To train PCRNNs effectively, we present a blended grammar loss to supervise their training. We conduct extensive experiments to evaluate BGNet on the PASCAL-Person-Part, LIP, and PPSS datasets, on which BGNet obtains state-of-the-art performance.
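The core idea of the abstract — grammar rules passing messages so that conspicuous parts amend inconspicuous ones — can be illustrated with a minimal sketch. This is a hypothetical toy example, not the authors' implementation: the part names, scores, rule edges, and the `pass_messages` blending function are all illustrative stand-ins for the learned PCRNN message passing.

```python
# Hypothetical per-part confidence scores in [0, 1]: conspicuous parts
# (head, torso) score high; inconspicuous ones (legs) score low.
scores = {"head": 0.9, "torso": 0.8, "upper-leg": 0.4, "lower-leg": 0.2}

# Grammar rules as (parent, child) edges of the body hierarchy,
# applied in cascade from conspicuous parts downward.
rules = [("head", "torso"), ("torso", "upper-leg"), ("upper-leg", "lower-leg")]

def pass_messages(scores, rules, weight=0.5):
    """Blend each child's score with its (already updated) parent's score,
    so well-segmented parts amend poorly segmented neighbours -- a scalar
    stand-in for the feature-level messages a PCRNN would pass."""
    updated = dict(scores)
    for parent, child in rules:  # cascaded application along the hierarchy
        updated[child] = (1 - weight) * updated[child] + weight * updated[parent]
    return updated

print(pass_messages(scores, rules))
```

After message passing, the inconspicuous parts' confidences rise toward those of their conspicuous ancestors, which is the intuition behind the foreground-extraction improvement the abstract describes; in BGNet itself the messages are convolutional features rather than scalars.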

Metadata
Title
Blended Grammar Network for Human Parsing
Authors
Xiaomei Zhang
Yingying Chen
Bingke Zhu
Jinqiao Wang
Ming Tang
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-58586-0_12