Published in: International Journal on Interactive Design and Manufacturing (IJIDeM) 2-3/2021

03.08.2021 | Original Paper

Automated evaluation of foreign language speaking performance with machine learning

Authors: Ramon F. Brena, Evelyn Zuvirie, Alan Preciado, Aristh Valdiviezo, Miguel Gonzalez-Mendoza, Carlos Zozaya-Gorostiza


Abstract

In a globalized world, the need to speak foreign languages, particularly English, is pressing. One challenge in teaching foreign languages at the scale of millions of learners is that, although teaching content is widely available, speaking skills are harder to develop than vocabulary: feedback from a teacher is needed to correct pronunciation, intonation, and so on. There are currently no automated tools to evaluate the fluency or pronunciation level of language students, so this evaluation, which is needed even to place a student in the right course level, requires an interview with a language teacher. We propose a supervised machine-learning method that automatically evaluates both the fluency and the pronunciation of a language student and detects specific pronunciation mistakes, taking English as the target language. To train a classifier for the classes "low", "intermediate", and "high", we first built datasets of audio samples of English-learning students talking. Each recording was divided into short segments, and a set of features was computed for each segment. We trained several classifiers that predict the level of a given non-native English speaker. We then tested the trained classifiers on audio segments held out from the training dataset, measuring accuracy, precision, and other metrics. Results were promising: we obtained prediction accuracies of 94% for fluency and 99.9% for pronunciation, the latter being the highest accuracy reported in the literature for this task.
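The pipeline the abstract describes — segment each recording, compute a feature vector per segment, train a supervised classifier on labeled segments, and evaluate on held-out segments — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the specific features (short-time energy, zero-crossing rate, spectral centroid), the random-forest classifier, and the synthetic stand-in audio are all assumptions; real data would be learner speech labeled low/intermediate/high.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

SR = 16000   # sample rate in Hz
SEG = SR     # 1-second segments (segment length is an assumption)

def features(segment, sr=SR):
    """Classic per-segment audio features (illustrative, not the paper's set)."""
    energy = float(np.mean(segment ** 2))                       # short-time energy
    zcr = float(np.mean(np.abs(np.diff(np.sign(segment))) > 0))  # zero-crossing rate
    spectrum = np.abs(np.fft.rfft(segment))
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / sr)
    centroid = float(np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12))
    return [energy, zcr, centroid]

def segment_audio(audio):
    """Split a recording into fixed-length, non-overlapping segments."""
    return [audio[i:i + SEG] for i in range(0, len(audio) - SEG + 1, SEG)]

# Synthetic stand-in recordings: three "proficiency levels" simulated as
# signals with different spectral content, one 30-second recording per class.
rng = np.random.default_rng(0)
X, y = [], []
for label, freq in enumerate([200.0, 500.0, 1200.0]):  # 0=low, 1=intermediate, 2=high
    t = np.arange(30 * SR) / SR
    audio = np.sin(2 * np.pi * freq * t) + 0.05 * rng.standard_normal(t.size)
    for seg in segment_audio(audio):
        X.append(features(seg))
        y.append(label)
X, y = np.array(X), np.array(y)

# Hold out a quarter of the segments, train, and score — mirroring the
# accuracy evaluation on segments not included in the training dataset.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```

On this trivially separable synthetic data the held-out accuracy is near perfect; the point of the sketch is only the shape of the pipeline (segmentation, feature extraction, supervised training, held-out evaluation), not the numbers.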


Metadata
Title
Automated evaluation of foreign language speaking performance with machine learning
Authors
Ramon F. Brena
Evelyn Zuvirie
Alan Preciado
Aristh Valdiviezo
Miguel Gonzalez-Mendoza
Carlos Zozaya-Gorostiza
Publication date
03.08.2021
Publisher
Springer Paris
Published in
International Journal on Interactive Design and Manufacturing (IJIDeM) / Issue 2-3/2021
Print ISSN: 1955-2513
Electronic ISSN: 1955-2505
DOI
https://doi.org/10.1007/s12008-021-00759-z
