
2021 | OriginalPaper | Chapter

GSSF: A Generative Sequence Similarity Function Based on a Seq2Seq Model for Clustering Online Handwritten Mathematical Answers

Authors: Huy Quang Ung, Cuong Tuan Nguyen, Hung Tuan Nguyen, Masaki Nakagawa

Published in: Document Analysis and Recognition – ICDAR 2021

Publisher: Springer International Publishing


Abstract

Toward computer-assisted marking of descriptive math questions, this paper presents a method for clustering online handwritten mathematical expressions (OnHMEs) so that human markers can mark them efficiently and reliably. We propose a generative sequence similarity function that computes a similarity score between two OnHMEs using a sequence-to-sequence OnHME recognizer. Each OnHME is then represented by a similarity-based representation (SbR) vector, and the resulting SbR matrix is fed to the k-means algorithm to cluster the OnHMEs. Experiments are conducted on two datasets: Dset_Mix, an answer dataset with 200 OnHMEs (a mix of real and synthesized patterns) for each of 10 questions, and NIER_CBT, a real online handwritten mathematical answer dataset with at most 122 student answers for each of 15 questions. The best clustering results reach a purity of around 0.916 and 0.915 and a marking cost of around 0.556 and 0.702 on Dset_Mix and NIER_CBT, respectively. Our method currently outperforms the previous methods for clustering HMEs.
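The pipeline described in the abstract (pairwise similarity scores, one SbR vector per answer, k-means on the SbR matrix, purity as a quality measure) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give the GSSF formula, so `toy_similarity` is a hypothetical stand-in (character-set overlap on strings) and not the recognizer-based score from the paper. The answers, labels, and initial centroids are made up, and `purity` follows the usual textbook definition, which may differ in detail from the one used in the paper.

```python
from collections import Counter


def toy_similarity(a, b):
    """Hypothetical stand-in, NOT the paper's GSSF: Jaccard overlap of character sets."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)


def sbr_matrix(items, similarity):
    """Row i is the similarity-based representation of item i: its score against every item."""
    return [[similarity(a, b) for b in items] for a in items]


def kmeans(rows, init_indices, n_iter=20):
    """Plain Lloyd-style k-means with caller-chosen initial centroids, so the run is deterministic."""
    centroids = [list(rows[i]) for i in init_indices]
    labels = [0] * len(rows)
    for _ in range(n_iter):
        # Assignment step: each row goes to the nearest centroid (squared Euclidean distance).
        labels = [
            min(range(len(centroids)),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(row, centroids[c])))
            for row in rows
        ]
        # Update step: each centroid becomes the mean of its members (left unchanged if empty).
        for c in range(len(centroids)):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def purity(labels, truth):
    """Fraction of items that share the majority true class of their cluster."""
    clusters = {}
    for lab, t in zip(labels, truth):
        clusters.setdefault(lab, []).append(t)
    return sum(Counter(m).most_common(1)[0][1] for m in clusters.values()) / len(labels)


# Made-up answers written as strings; real inputs would be handwritten stroke sequences.
answers = ["x+1", "1+x", "x+1", "y-2", "y-2", "-2+y"]
truth = ["A", "A", "A", "B", "B", "B"]

labels = kmeans(sbr_matrix(answers, toy_similarity), init_indices=[0, 3])
print(labels, purity(labels, truth))
```

Because each row of the SbR matrix describes an answer only through its similarity to the other answers, any pairwise score can be swapped in for `toy_similarity` without changing the clustering or evaluation code.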


Metadata
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-86331-9_10
