The impact of accents on automatic recognition of South African English speech: a preliminary investigation

Authors:
Audrey Mbogho, University of Cape Town, Rondebosch, South Africa
Michelle Katz, University of Cape Town, Rondebosch, South Africa

SAICSIT '10: Proceedings of the 2010 Annual Research Conference of the South African Institute of Computer Scientists and Information Technologists, October 2010, Pages 187–192
https://doi.org/10.1145/1899503.1899524

Published: 11 October 2010

ABSTRACT

The accent with which words are spoken can have a strong effect on the performance of a speech recognition system. In a multilingual country such as South Africa, where English is not the first language of most citizens, addressing this issue is critical when building speech-based systems. In this project we trained two sets of hidden Markov models (HMMs) for isolated-word English speech. The first set was trained on speech from native English speakers and the second on speech from non-native speakers drawn from a representative sample of major South African accent groups. We compared the recognition accuracies of the two sets of models and found that the models trained on accented English performed better. This preliminary research indicates that there is merit in committing resources to accent-specific training.
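As an illustration of the comparison described in the abstract, the following is a minimal sketch of isolated-word recognition with one HMM per word: an utterance is scored against every word model with the forward algorithm, and the highest-scoring word is chosen. It is not the authors' implementation. It assumes discrete observation symbols rather than real acoustic features, and the names `log_likelihood`, `recognize`, `accuracy`, and `TOY_MODELS`, together with all probabilities, are hypothetical values chosen only to make the example run.

```python
import math


def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(hmm, obs):
    """Forward algorithm: log P(obs | hmm) for a discrete-observation HMM."""
    start, trans, emit = hmm["start"], hmm["trans"], hmm["emit"]
    n = len(start)
    # Initialisation: start probability times emission of the first symbol.
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    # Induction: sum over predecessor states, then emit the next symbol.
    for symbol in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][symbol])
            for j in range(n)
        ]
    return log_sum_exp(alpha)


def recognize(word_models, obs):
    """Return the word whose model assigns obs the highest likelihood."""
    return max(word_models, key=lambda w: log_likelihood(word_models[w], obs))


def accuracy(word_models, test_set):
    """Fraction of (word, observation sequence) pairs recognized correctly."""
    correct = sum(recognize(word_models, obs) == word for word, obs in test_set)
    return correct / len(test_set)


# Hypothetical two-state, left-to-right models over symbols {0, 1}.
TOY_MODELS = {
    "yes": {"start": [1.0, 0.0],
            "trans": [[0.5, 0.5], [0.0, 1.0]],
            "emit": [[0.9, 0.1], [0.2, 0.8]]},
    "no": {"start": [1.0, 0.0],
           "trans": [[0.5, 0.5], [0.0, 1.0]],
           "emit": [[0.1, 0.9], [0.8, 0.2]]},
}

# Hypothetical test utterances, already quantized to symbol sequences.
TOY_TEST = [("yes", [0, 0, 1, 1]), ("no", [1, 1, 0, 0])]

if __name__ == "__main__":
    print(accuracy(TOY_MODELS, TOY_TEST))
```

Under these assumptions, comparing two sets of models amounts to calling `accuracy` once per model set on the same test utterances and comparing the two fractions.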

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Speech recognition
2. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interaction paradigms
      1. Natural language interfaces

Published in
SAICSIT '10: Proceedings of the 2010 Annual Research Conference of the South African Institute of Computer Scientists and Information Technologists
October 2010
447 pages
ISBN: 9781605589503
DOI: 10.1145/1899503
Conference Chair: Paula Kotzé, CSIR Meraka Institute, Pretoria, South Africa
Program Chairs: Alta van der Merwe, CSIR Meraka Institute, Pretoria, South Africa; Aurona Gerber, CSIR Meraka Institute, Pretoria, South Africa
Copyright © 2010 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
accents
hidden Markov models
speech recognition
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 187 of 439 submissions, 43%