Trends and Directions in Computer-Assisted Pronunciation Training

  • Chapter
Investigating English Pronunciation

Abstract

The wide range of tools and applications available today offers promising solutions for facilitating pronunciation training, one of the most challenging and often neglected areas in foreign language teaching and learning. Research has shown that technology offers numerous possibilities, making it a valuable asset for pronunciation training given the obstacles foreign language learners face (e.g. scarcity of input, time limitations, or perceptual and productive biases resulting from their L1). This chapter reviews the literature on Computer-Assisted Pronunciation Training (CAPT), stressing the different ways in which technology has been shown to enhance pronunciation training. After reviewing the potential of different technologies and some research trends addressing different types of perceptual and productive enhancement, it presents empirical findings supporting the usefulness of technology, together with a discussion of limitations. Finally, the chapter highlights the importance of self-monitoring for autonomous pronunciation practice and proposes some directions for future investigation.



  • Smith, J. and Beckmann, B. (2010). Noticing-reformulation tasks as a stimulus towards continuous autonomous phonological development. New Zealand Studies in Applied Linguistics, 16 (1): 36–51.

    Google Scholar 

  • Spaai, G. W. G. and Hermes, D. J. (1993). A visual display for the teaching of intonation. CALICO Journal, 10 (3): 19–30.

    Google Scholar 

  • Spada, N. (1997). Form-focused instruction and second language acquisition: A review of classroom and laboratory research. Language Teaching, 29: 73–87.

    Article  Google Scholar 

  • Strange, W. and Dittmann, S. (1984). Effects of discrimination training on the perception of/r-l/by Japanese adults learning English. Perception and Psychophysics, 36: 131–45.

    Article  Google Scholar 

  • Suter, R. (1976). Predictors of pronunciation accuracy in second language learning. Language Learning, 30: 271–87.

    Google Scholar 

  • Taniguchi, M. and Abberton, E. (1999). Effect of interactive visual feedback on the improvement of English intonation of Japanese EFL learners. Speech, Hearing and Language: Work in Progress (University College London, Department of Phonetics and Linguistics), 11: 76–89.

    Google Scholar 

  • Tanner, M. and Landon, M. (2009). The effects of computer-assisted pronunciation readings on ESL learners’ use of pausing, stress, intonation, and overall comprehensibility. Language Learning and Technology, 13 (3): 51–65. Retrieved from: http://llt.msu.edu/vol13num3/tannerlandon.pdf

    Google Scholar 

  • Taylor, P. (2009). Text-to-speech synthesis. Cambridge, UK: Cambridge University Press.

    Book  Google Scholar 

  • Thomson, R. I. (2011). Computer assisted pronunciation training: Targeting second language vowel perception improves pronunciation. CALICO Journal, 28 (3): 744–65.

    Article  Google Scholar 

  • University of Iowa Research Foundation. (2014). Sounds of Speech. (Version 1.6.5) [Mobile application software]. Retrieved from: https://itunes.apple.com/us/app/sounds-of-speech/id780656219?mt=8

    Google Scholar 

  • Walker, N. R., Trofimovich, P., Cedergren, H. and Gatbonton, E. (2011). Using ASR technology in language training for specific purposes: A perspective from Quebec, Canada. CALICO Journal, 28 (3): 539–52.

    Article  Google Scholar 

  • Walker, R. (2005). Using student-produced recordings with monolingual groups to provide effective individualized pronunciation practice. TESOL Quarterly, 39: 550–8.

    Article  Google Scholar 

  • Wang, X. and Munro, M. (2004). Computer-based training for learning English vowel contrasts. System, 32: 539–52.

    Article  Google Scholar 

  • Wells, J. C. (2008). Longman pronunciation dictionary. Harlow, UK: Longman.

    Google Scholar 

  • Wichern, P. U. M. and Boves, L. (1980). Visual feedback of F0 curves as an aid in learning intonation-contours. Proceedings of the Institute of Phonetics Nijmegen, 4: 53–63.

    Google Scholar 

  • Wik, P. and Hjalmarsson, A. (2009). Embodied conversational agents in computer assisted language learning. Speech Communication, 51: 1024–37.

    Article  Google Scholar 

  • Winters, S. and Pisoni, D. (2004). Some effects of feedback on the perception of pointlight and full-face visual displays of speech: A preliminary report. Research on Spoken Language Processing Progress (Report No. 26, pp. 139–64). Bloomington, Ind.: Indiana University.

    Google Scholar 

  • Witt, S. M. and Young, S. (1997). Language learning based on non-native speech recognition. In Eurospeech’97. Proceedings of the 5th European Conference on Speech Communication and Technology (pp. 633–6). Rhodes, Greece: ESCA.

    Google Scholar 

  • Yamada, R. A. and Tohkura, Y. (1992). The effects of experimental variables on the perception of American English/r/and/l/by Japanese listeners. Perception and Psychophysics, 52: 376–92.

    Article  Google Scholar 

  • Yule, G., Hoffman, P. and Damico, J. (1987). Paying attention to pronunciation: The role of self-monitoring in perception. TESOL Quarterly, 21: 765–8.

    Article  Google Scholar 

Copyright information

© 2015 Jonás Fouz-González

About this chapter

Cite this chapter

Fouz-González, J. (2015). Trends and Directions in Computer-Assisted Pronunciation Training. In: Mompean, J.A., Fouz-González, J. (eds) Investigating English Pronunciation. Palgrave Macmillan, London. https://doi.org/10.1057/9781137509437_14