Pattern Recognition and Image Analysis

1 June 2023 | SELECTED CONFERENCE PAPERS

Predictors Based on Convolutional Neural Networks for the Movement Strategy of Trainable Agents for Building Customized Image Descriptors

Authors: A. Samarin, A. Savelev, A. Toropov, A. Dzestelova, V. Malykh, E. Mikhailova, A. Motyko

Published in: Pattern Recognition and Image Analysis | Issue 2/2023

Abstract

We present a description of various custom image descriptor modifications that are used as part of an image classification pipeline with text elements. The problem under consideration is related to the classification of images of commercial facades by the type of services provided. Some of the proposed descriptor types are presented for the first time and demonstrate state-of-the-art performance on open datasets. In our study, we used a special type of descriptor for image areas with text based on traces of the movement of agents. The traces in question are generated using parameterized movement strategies, which are presented and compared in this article.

Title: Predictors Based on Convolutional Neural Networks for the Movement Strategy of Trainable Agents for Building Customized Image Descriptors
Authors: A. Samarin, A. Savelev, A. Toropov, A. Dzestelova, V. Malykh, E. Mikhailova, A. Motyko
Publication date: 1 June 2023
Publisher: Pleiades Publishing
Published in: Pattern Recognition and Image Analysis, Issue 2/2023
Print ISSN: 1054-6618
Electronic ISSN: 1555-6212
DOI: https://doi.org/10.1134/S105466182302013X
