
2010 | Original Paper | Book Chapter

28. Spatial Speaker: Spatial Positioning of Synthesized Speech in Java

Authors: Jaka Sodnik, Sašo Tomažič

Published in: Machine Learning and Systems Engineering

Publisher: Springer Netherlands


Abstract

In this paper, we propose the “Spatial Speaker”, an extension of the Java FreeTTS speech synthesizer that adds spatial positioning of both the speaker and the listener. The module reads arbitrary text from a file or web page to the user from a fixed or moving position in space through ordinary stereo headphones. The solution combines the following components: the FreeTTS speech synthesizer, a custom-made speech processing unit, the MIT Media Lab HRTF library, the JOAL positioning library, and a Creative X-Fi sound card. The paper gives an overview of the design of the “Spatial Speaker” and proposes three practical applications of such a system for visually impaired and blind computer users. Preliminary results of user studies confirmed the system’s usability and indicated its potential in other types of applications and auditory interfaces. The entire system is implemented as a single Java class that can be imported and used in any Java application.
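To make the idea of positioning a mono speech signal in space concrete, the following is a minimal Java sketch. It is not the authors' implementation: the abstract names the MIT Media Lab HRTF library and JOAL for positioning, whereas this sketch swaps in a much simpler stand-in that uses only an interaural level difference (constant-power panning) and an interaural time difference (Woodworth's spherical-head approximation). The class name `SpatialPanSketch`, its method names, and the head radius value are assumptions made for illustration.

```java
// Hypothetical illustration only: a simplified ILD + ITD panner,
// not the HRTF/JOAL pipeline described in the abstract.
final class SpatialPanSketch {
    static final double HEAD_RADIUS_M = 0.0875;   // assumed average head radius
    static final double SPEED_OF_SOUND_M_S = 343.0;

    // Limit azimuth to the frontal half-plane: -90 (left) .. +90 (right) degrees.
    static double clampAzimuth(double azimuthDeg) {
        return Math.max(-90.0, Math.min(90.0, azimuthDeg));
    }

    // Constant-power panning gains, returned as {left, right}.
    static double[] gains(double azimuthDeg) {
        double theta = Math.toRadians((clampAzimuth(azimuthDeg) + 90.0) / 2.0);
        return new double[] { Math.cos(theta), Math.sin(theta) };
    }

    // Interaural time difference in whole samples (Woodworth approximation).
    static int itdSamples(double azimuthDeg, float sampleRate) {
        double az = Math.abs(Math.toRadians(clampAzimuth(azimuthDeg)));
        double itdSeconds = HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (az + Math.sin(az));
        return (int) Math.round(itdSeconds * sampleRate);
    }

    // Turn mono samples into interleaved stereo (L, R, L, R, ...).
    // The ear farther from the source receives the delayed copy.
    static float[] spatialize(float[] mono, double azimuthDeg, float sampleRate) {
        double[] g = gains(azimuthDeg);
        int delay = itdSamples(azimuthDeg, sampleRate);
        boolean sourceOnRight = azimuthDeg > 0.0;
        int leftDelay = sourceOnRight ? delay : 0;
        int rightDelay = sourceOnRight ? 0 : delay;

        float[] stereo = new float[2 * (mono.length + delay)];
        for (int i = 0; i < mono.length; i++) {
            stereo[2 * (i + leftDelay)] = (float) (g[0] * mono[i]);
            stereo[2 * (i + rightDelay) + 1] = (float) (g[1] * mono[i]);
        }
        return stereo;
    }
}
```

A call such as `SpatialPanSketch.spatialize(samples, 45.0, 16000f)` would return interleaved left/right samples for a source placed to the listener's front right. A full HRTF-based system would instead filter the signal with measured impulse responses for each ear, which also conveys elevation and front/back cues that this sketch cannot reproduce.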

Metadata
Title
Spatial Speaker: Spatial Positioning of Synthesized Speech in Java
Authors
Jaka Sodnik
Sašo Tomažič
Copyright year
2010
Publisher
Springer Netherlands
DOI
https://doi.org/10.1007/978-90-481-9419-3_28