ABSTRACT
The proliferation of computing technology to low-power domains such as hand-held devices has led to increased interest in portable interface technologies, with particular interest in speech recognition. The computational demands of robust, large-vocabulary speech recognition systems, however, are currently prohibitive for such low-power devices. This work begins an exploration of domain-specific characteristics of speech recognition that might be exploited to achieve the requisite performance within the power constraints of such devices. We focus primarily on architectural techniques to exploit the massive amounts of potential thread-level parallelism apparent in this application domain, and consider the performance/power trade-offs of such architectures. Our results show that a simple, multi-threaded, multi-pipelined processor architecture can significantly improve the performance of the time-consuming search phase of modern speech recognition algorithms, and may reduce overall energy consumption by drastically reducing dissipation of static power. We also show that the primary hurdle to achieving these performance benefits is the data request rate into the memory system, and consider some initial solutions to this problem.
Index Terms
- Architectural optimizations for low-power, real-time speech recognition