2019 | OriginalPaper | Book chapter

Scene Recognition via Bi-enhanced Knowledge Space Learning

Authors: Jin Zhang, Bing-Kun Bao, Changsheng Xu

Published in: New Trends in Computer Technologies and Applications

Publisher: Springer Singapore

Abstract

Scene recognition is one of the hallmark tasks in computer vision, as it provides rich information beyond object recognition and action recognition. Scene images from the same class tend to include the same essential objects and relations; for example, images of a “wedding” scene usually show a bridegroom with a bride next to him. Following this observation, we introduce a novel idea to boost the accuracy of scene recognition by mining an essential scene sub-graph and learning a bi-enhanced knowledge space. The essential scene sub-graph describes the essential objects and their relations for each scene class. The learned knowledge space is bi-enhanced by a global representation of the entire image and a local representation of the corresponding essential scene sub-graph. Experimental results on a newly constructed dataset, Scene 30, demonstrate the effectiveness of the proposed method.
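The abstract does not say how the essential scene sub-graph is mined. As an illustration only, the sketch below assumes each image is annotated with a set of (subject, relation, object) triples and keeps the triples that recur in a large enough share of a class's images. The function name `mine_essential_triples`, the `min_support` threshold, and the toy annotations are hypothetical and are not taken from the chapter.

```python
from collections import Counter


def mine_essential_triples(scene_graphs, min_support=0.5):
    """Keep triples that occur in at least `min_support` of the images.

    scene_graphs: list of sets of (subject, relation, object) triples,
    one set per image, all from the same scene class.
    Frequency thresholding is an assumed stand-in, not the authors' method.
    """
    if not scene_graphs:
        return set()
    counts = Counter()
    for graph in scene_graphs:
        counts.update(set(graph))  # count each triple once per image
    threshold = min_support * len(scene_graphs)
    return {triple for triple, n in counts.items() if n >= threshold}


# Toy, made-up annotations for three "wedding" images.
wedding = [
    {("bride", "next to", "bridegroom"), ("bride", "holding", "bouquet")},
    {("bride", "next to", "bridegroom"), ("guest", "sitting on", "chair")},
    {("bride", "next to", "bridegroom"), ("bride", "holding", "bouquet")},
]

essential = mine_essential_triples(wedding, min_support=0.6)
```

With these toy inputs, the triple that appears in only one image is dropped. The two triples that recur across images form the class-level sub-graph, from which a local representation could then be computed.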

Metadata
Title
Scene Recognition via Bi-enhanced Knowledge Space Learning
Authors
Jin Zhang
Bing-Kun Bao
Changsheng Xu
Copyright year
2019
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-13-9190-3_23