ABSTRACT
We describe the use of non-verbal features of voice for direct control of interactive applications. Traditional speech recognition interfaces are based on an indirect, conversational model: first the user gives a direction, and then the system performs a certain operation. Our goal is to achieve more direct, immediate interaction, like using a button or joystick, by using lower-level features of voice such as pitch and volume. We are developing several prototype interaction techniques based on this idea, such as "control by continuous voice," "rate-based parameter control by pitch," and "discrete parameter control by tonguing." We have implemented several prototype systems, and they suggest that voice-as-sound techniques can enhance traditional voice recognition approaches.
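To make the three named techniques concrete, the following is a minimal sketch of how they might map per-frame voice features to control signals. It is not the authors' implementation: the abstract does not specify the mappings, so the function names, the volume threshold, the gain, and the choice of the onset pitch as the reference for rate control are all assumptions made for illustration. Input is assumed to be a sequence of per-frame volume values (0 to 1) or pitch estimates in Hz, with `None` marking unvoiced frames.

```python
from typing import List, Optional, Sequence

# Assumed volume level separating voice from silence (hypothetical value).
THRESHOLD = 0.1


def continuous_control(volumes: Sequence[float],
                       threshold: float = THRESHOLD) -> List[bool]:
    """Control by continuous voice: the control is 'on' for every frame
    in which the voice is sounding, like holding down a button."""
    return [v >= threshold for v in volumes]


def rate_control_by_pitch(pitches: Sequence[Optional[float]],
                          start: float = 0.0,
                          gain: float = 0.01) -> float:
    """Rate-based parameter control by pitch (assumed mapping): while the
    voice continues, the parameter changes at a rate proportional to how
    far the pitch has moved from the pitch at voice onset."""
    value = start
    reference: Optional[float] = None
    for pitch in pitches:
        if pitch is None:
            # Silence ends the current gesture; the next onset sets a new reference.
            reference = None
            continue
        if reference is None:
            reference = pitch
        value += gain * (pitch - reference)
    return value


def count_tonguing(volumes: Sequence[float],
                   threshold: float = THRESHOLD) -> int:
    """Discrete parameter control by tonguing: count separate bursts of
    sound ("ta ta ta") as transitions from silence to voice."""
    count = 0
    was_voiced = False
    for v in volumes:
        is_voiced = v >= threshold
        if is_voiced and not was_voiced:
            count += 1
        was_voiced = is_voiced
    return count
```

In this sketch, raising the pitch above its onset value makes the parameter increase faster, lowering it reverses the direction, and falling silent stops the change. A real system would also need pitch and volume extraction from audio, which is omitted here.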
Index Terms
- Voice as sound: using non-verbal voice input for interactive control