The human factors of speech-based interfaces: a research agenda

Author:
James H. Bradford

Brock University, St. Catharines, Ontario, Canada L2S 3A1


ACM SIGCHI Bulletin, Volume 27, Issue 2, April 1995, pp. 61–67. https://doi.org/10.1145/202511.202527

Published: 01 April 1995


Abstract

This article outlines some of the characteristics of speech technology that distinguish it from traditional interaction techniques. A number of human factors issues relating to speech will be discussed in the context of a proposed research agenda for speech-based interfaces.



Published in
ACM SIGCHI Bulletin, Volume 27, Issue 2, April 1995, 94 pages
ISSN: 0736-6906
DOI: 10.1145/202511
Editor: Steven Pemberton, CWI, Amsterdam, The Netherlands
Copyright © 1995 Author
Publisher: Association for Computing Machinery, New York, NY, United States