
2021 | Original Paper | Book Chapter

Ensemble-Based Commercial Buildings Facades Photographs Classifier

Authors: Aleksei Samarin, Valentin Malykh

Published in: Analysis of Images, Social Networks and Texts

Publisher: Springer International Publishing


Abstract

We present an ensemble-based method for classifying photographs that contain patches with text. In particular, the proposed solution is suited to classifying images of commercial building facades by the type of services they provide. Our model is built on a heterogeneous ensemble that analyzes textual and visual features, as well as special visual descriptors for regions containing English text. The classifier demonstrates strong performance (0.71 \(F_1\) score against a 0.43 baseline). We also provide our own dataset of 3000 facade images with signboards, forming a complete classification benchmark.
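To make the pipeline described in the abstract concrete, below is a minimal sketch, in Python, of a heterogeneous ensemble that fuses a visual branch with a textual branch built on OCR output. This is not the authors' implementation: the ResNet-50 backbone, Tesseract OCR, TF-IDF text features, logistic-regression heads, and late probability averaging are all assumptions chosen for illustration.

```python
# Hypothetical sketch of a heterogeneous facade classifier: a visual CNN
# branch plus a textual OCR branch, fused by averaging class probabilities.
# All component choices here are assumptions, not the paper's method.

import numpy as np
import pytesseract
import torch
from PIL import Image
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from torchvision import models, transforms

# Visual branch: ImageNet-pretrained CNN used as a fixed feature extractor
# (requires torchvision >= 0.13 for the weights API).
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()  # drop the classification head
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def visual_features(image: Image.Image) -> np.ndarray:
    """2048-d embedding of the whole photograph."""
    with torch.no_grad():
        x = preprocess(image.convert("RGB")).unsqueeze(0)
        return backbone(x).squeeze(0).numpy()

def signboard_text(image: Image.Image) -> str:
    """Textual branch: OCR over the photograph. A scene-text detector
    could be applied first to crop signboard regions before OCR."""
    return pytesseract.image_to_string(image, lang="eng")

def fit_ensemble(images, labels):
    """Train one head per modality; return a fused predictor."""
    texts = [signboard_text(im) for im in images]
    vecs = TfidfVectorizer(min_df=1).fit(texts)
    X_text = vecs.transform(texts)
    X_vis = np.stack([visual_features(im) for im in images])

    clf_text = LogisticRegression(max_iter=1000).fit(X_text, labels)
    clf_vis = LogisticRegression(max_iter=1000).fit(X_vis, labels)

    def predict(image: Image.Image) -> str:
        p_text = clf_text.predict_proba(
            vecs.transform([signboard_text(image)]))
        p_vis = clf_vis.predict_proba(visual_features(image)[None, :])
        # Late fusion: average per-branch class probabilities.
        return clf_vis.classes_[int(np.argmax((p_text + p_vis) / 2))]

    return predict

# Usage (hypothetical paths and labels):
# predict = fit_ensemble(train_images, train_labels)
# print(predict(Image.open("facade.jpg")))
```

Late fusion by probability averaging is only one way to combine heterogeneous branches; stacking a meta-classifier over the concatenated branch outputs is a common alternative.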


Footnotes
1
We used https://Flickr.com and chose only images licensed for 'commercial use and modifications allowed'. The dataset is available at https://github.com/madrugado/commercial-facades-dataset.
 
Metadata
Title
Ensemble-Based Commercial Buildings Facades Photographs Classifier
Authors
Aleksei Samarin
Valentin Malykh
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-72610-2_19
