Mobile Networks and Applications

10 May 2024 | Research

DGNet: A Handwritten Mathematical Formula Recognition Network Based on Deformable Convolution and Global Context Attention

Authors: Cuihong Wen, Lemin Yin, Shuai Liu

Published in: Mobile Networks and Applications

Abstract

Handwritten Mathematical Expression Recognition (HMER) aims to generate the corresponding LaTeX sequence from an image of a handwritten mathematical expression. Encoder-decoder architectures have made significant progress on this task. However, architectures built on a DenseNet encoder do not adequately account for the distinctive features of handwritten mathematical expressions (HME) or for the similarity between different characters. In addition, the decoder has a small receptive field during decoding and therefore fails to capture the spatial position of the targets effectively, so global contextual information is lacking during decoding. To address these issues, this paper proposes DGNet, a neural network based on deformable convolution and global context attention. The network takes the sparse nature of handwritten mathematical formulas into account and uses deformable convolution, which lets the convolution kernel deform according to the content of its neighborhood. This enables the model to adapt better to geometric variations and other deformations in handwritten mathematical expressions. In addition, GCAttention is introduced in the feature optimization stage to aggregate global contextual features across both position and channel. In experiments, the model achieved accuracies of 58.51%, 56.32%, and 56.1% on the CROHME 2014, 2016, and 2019 datasets, respectively.
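
To make the idea of a kernel that "deforms according to the content of its neighborhood" concrete, the following is a minimal sketch of a single 3x3 deformable convolution output on a one-channel feature map. It is not the DGNet implementation: the function names `bilinear_sample` and `deformable_conv_at`, the toy 4x4 feature map, and the hand-chosen offsets are all assumptions made for illustration. In a trained network the offsets would typically be predicted from the input by an additional convolution layer rather than supplied by hand.

```python
import math


def bilinear_sample(feature, y, x):
    """Sample a 2D feature map at a fractional (y, x) position.

    Positions outside the map contribute 0 (zero padding).
    """
    height, width = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    # Blend the four integer neighbors, weighted by their distance to (y, x).
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < height and 0 <= xi < width:
                weight = (1.0 - abs(y - yi)) * (1.0 - abs(x - xi))
                value += weight * feature[yi][xi]
    return value


def deformable_conv_at(feature, kernel, offsets, row, col):
    """Compute one output value of a 3x3 deformable convolution.

    kernel  : 3x3 list of weights
    offsets : 3x3 list of (dy, dx) pairs, one per kernel tap
              (hypothetical values here; normally learned)
    """
    total = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            # Regular grid position of this tap, shifted by its offset.
            y = row + (i - 1) + dy
            x = col + (j - 1) + dx
            total += kernel[i][j] * bilinear_sample(feature, y, x)
    return total


# Toy 4x4 feature map whose value at (r, c) is 4*r + c.
feature = [[float(4 * r + c) for c in range(4)] for r in range(4)]
kernel = [[1.0] * 3 for _ in range(3)]  # simple box-sum kernel

zero_offsets = [[(0.0, 0.0)] * 3 for _ in range(3)]
half_offsets = [[(0.5, 0.5)] * 3 for _ in range(3)]

# With zero offsets the operation reduces to an ordinary 3x3 convolution.
regular = deformable_conv_at(feature, kernel, zero_offsets, 1, 1)
# With non-zero offsets each tap reads from a shifted, interpolated location.
deformed = deformable_conv_at(feature, kernel, half_offsets, 1, 1)

print(regular)   # 45.0
print(deformed)  # 67.5
```

With all offsets at zero the sampling grid is the usual 3x3 window, so the result matches a standard convolution. Shifting every tap by half a pixel moves the whole window, and giving each tap its own offset would let the window stretch or bend to follow slanted or uneven strokes.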

Title: DGNet: A Handwritten Mathematical Formula Recognition Network Based on Deformable Convolution and Global Context Attention
Authors: Cuihong Wen, Lemin Yin, Shuai Liu
Publication date: 10 May 2024
Publisher: Springer US
Published in: Mobile Networks and Applications
Print ISSN: 1383-469X
Electronic ISSN: 1572-8153
DOI: https://doi.org/10.1007/s11036-024-02315-x
