2013 | Book

Phonetic Search Methods for Large Speech Databases

Authors: Ami Moyal, Vered Aharonson, Ella Tetariy, Michal Gishri

Publisher: Springer New York

Book series: SpringerBriefs in Electrical and Computer Engineering

About this book

“Phonetic Search Methods for Large Speech Databases” focuses on Keyword Spotting (KWS) within large speech databases. The brief begins by outlining the challenges of Keyword Spotting within large speech databases using dynamic keyword vocabularies. It then highlights the market segments in need of KWS solutions and the specific requirements of each segment. The work also describes in detail the complexity of the task and the different methods used, including the advantages and disadvantages of each method and an in-depth comparison. The main focus is on the Phonetic Search method and its efficient implementation. This includes a literature review of methods for the efficient implementation of Phonetic Search Keyword Spotting, with an emphasis on the authors’ own research, which comprises a comparative analysis of the Phonetic Search method with algorithmic details. This brief is useful for researchers and developers in academia and industry in the fields of speech processing and speech recognition, specifically Keyword Spotting.

Table of Contents

Frontmatter

Chapter 1. Keyword Spotting Out of Continuous Speech

Abstract

Successful Automatic Speech Recognition (ASR) technology has been a research aspiration for the past five decades. Ideally, computers would be able to transform any type of human speech into an accurate textual transcription. Today’s ASR technology generates fairly good results using structured speech with relatively high Signal-to-Noise Ratios (SNR), but performance degrades when using spontaneous speech in real-life noisy environments (Murveit et al. 1992; Young 1996; Furui 2003; Deng and Huang 2004). Performance that is acceptable for commercial applications can be achieved using large training corpora of speech and text. However, there are still problems that need to be resolved.

Ami Moyal, Vered Aharonson, Ella Tetariy, Michal Gishri

Chapter 2. Keyword Spotting Methods

Abstract

This chapter reviews in detail the three KWS methods: Large Vocabulary Continuous Speech Recognition (LVCSR) KWS, Acoustic KWS, and Phonetic Search KWS. It then discusses and compares the methods.

Ami Moyal, Vered Aharonson, Ella Tetariy, Michal Gishri

Chapter 3. Phonetic Search

Abstract

Traditionally, information retrieval techniques create an index of words or terms found in a textual database that can later be rapidly searched by simply entering a query for a desired word or term. Naturally, straightforward application of this technique to non-textual materials is impossible. When it comes to speech, using text-based techniques requires a preprocessing stage of transforming the digital speech signal into some form of text. However, since classical speech recognition engines are not totally accurate, the indexing will necessarily include errors.

Ami Moyal, Vered Aharonson, Ella Tetariy, Michal Gishri
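
The Chapter 3 abstract notes that a recognizer's output contains errors, so a search over recognized phonemes has to tolerate mismatches. The sketch below is one generic way to do that, not the method described in the book: it assumes the speech has already been converted to a list of phoneme symbols and uses approximate substring matching with unit edit costs. The function name `phonetic_search`, the `max_cost` threshold, and the phoneme labels are illustrative assumptions.

```python
def phonetic_search(keyword, stream, max_cost):
    """Find approximate occurrences of `keyword` inside `stream`.

    Both arguments are lists of phoneme symbols. Returns a list of
    (end_index, cost) pairs, where cost is the edit distance between the
    keyword and the best-matching stream segment ending at end_index.
    Assumption: unit cost for substitution, insertion, and deletion.
    """
    m = len(keyword)
    # prev[i]: cost of aligning keyword[:i] with an empty stream segment.
    prev = list(range(m + 1))
    hits = []
    for j, phoneme in enumerate(stream):
        # curr[0] = 0 lets a match start at any stream position.
        curr = [0] * (m + 1)
        for i in range(1, m + 1):
            substitute = prev[i - 1] + (keyword[i - 1] != phoneme)
            extra_in_stream = prev[i] + 1
            missing_in_stream = curr[i - 1] + 1
            curr[i] = min(substitute, extra_in_stream, missing_in_stream)
        if curr[m] <= max_cost:
            hits.append((j, curr[m]))
        prev = curr
    return hits


# Hypothetical phoneme string and keyword, for illustration only.
stream = ["dh", "ax", "k", "ae", "t", "s", "ae", "t"]
keyword = ["k", "ae", "t"]
print(phonetic_search(keyword, stream, max_cost=0))  # [(4, 0)]
```

Raising `max_cost` admits more recognition errors but also more false alarms, which is the kind of trade-off a keyword spotting evaluation has to quantify.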

Chapter 4. Search Space Complexity Reduction

Abstract

Various studies have focused on exploring ways to search more efficiently; this chapter will present an overview of methods that deal with efficient searching, with a focus on methods that reduce the size of the search space. The basis of all these methods is to formulate and use constraints that trim down the search space by eliminating impossible paths, dimensions or locations, thus leaving a reduced grid on which to perform the search.

Ami Moyal, Vered Aharonson, Ella Tetariy, Michal Gishri
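
The Chapter 4 abstract describes constraints that eliminate impossible locations before the search runs. As a minimal sketch of that general idea, and not of any specific algorithm from the book, the code below keeps only stream windows around positions where one chosen keyword phoneme occurs exactly. The function names, the `anchor_index` and `slack` parameters, and the sample data are all assumptions made for illustration.

```python
def candidate_windows(keyword, stream, anchor_index, slack):
    """Return merged (start, end) windows of `stream` worth searching.

    A window is kept only around positions where the phoneme
    keyword[anchor_index] appears exactly in the stream. `slack` widens
    each window to allow for inserted or deleted phonemes.
    """
    anchor = keyword[anchor_index]
    m = len(keyword)
    windows = []
    for pos, phoneme in enumerate(stream):
        if phoneme != anchor:
            continue
        start = max(0, pos - anchor_index - slack)
        end = min(len(stream), pos + (m - anchor_index) + slack)
        if windows and start <= windows[-1][1]:
            # Overlapping or touching windows are merged into one.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows


def search_space_fraction(windows, stream_length):
    """Fraction of the stream that remains after pruning."""
    return sum(end - start for start, end in windows) / stream_length


# Hypothetical data, for illustration only.
stream = ["dh", "ax", "k", "ae", "t", "s", "ae", "t",
          "ih", "z", "hh", "iy", "r"]
keyword = ["k", "ae", "t"]
windows = candidate_windows(keyword, stream, anchor_index=0, slack=1)
print(windows, search_space_fraction(windows, len(stream)))
```

A constraint this strict has a cost: if the anchor phoneme itself is misrecognized, the occurrence is never searched, so the smaller search space has to be weighed against missed detections.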

Chapter 5. Evaluating Phonetic Search KWS

Abstract

A complexity reduction algorithm should be evaluated carefully to assess its performance and usability. A basic evaluation measures two aspects relative to the exhaustive search: the relative decrease in computational complexity and the relative change (increase or decrease) in recognition performance when running in the reduced-complexity mode.

Ami Moyal, Vered Aharonson, Ella Tetariy, Michal Gishri
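
The two measures named in the Chapter 5 abstract can be written as simple ratios against the exhaustive search. The sketch below assumes that complexity is summarized by a single positive number (for example, average runtime) and that recognition performance is a single score where higher is better. The function names and the sample values are hypothetical and are not results from the book.

```python
def relative_decrease(exhaustive_cost, reduced_cost):
    """Relative decrease in cost versus the exhaustive search.

    0.6 means the reduced search needs 60% less of the measured resource.
    """
    if exhaustive_cost <= 0:
        raise ValueError("exhaustive_cost must be positive")
    return (exhaustive_cost - reduced_cost) / exhaustive_cost


def relative_change(exhaustive_score, reduced_score):
    """Relative change in a performance score versus the exhaustive search.

    Negative values mean performance dropped in the reduced mode.
    """
    if exhaustive_score == 0:
        raise ValueError("exhaustive_score must be non-zero")
    return (reduced_score - exhaustive_score) / exhaustive_score


# Made-up numbers, for illustration only.
print(relative_decrease(exhaustive_cost=10.0, reduced_cost=4.0))
print(relative_change(exhaustive_score=0.80, reduced_score=0.76))
```

Reporting ratios instead of absolute values keeps the comparison independent of the hardware used for the measurement.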

Chapter 6. Evaluation Results

Abstract

In order to evaluate the performance of the suggested anchor-based algorithm in reducing the computational complexity, the size of the search space and the KWS performance were calculated for the anchor-based search and compared to an exhaustive search. The reduction in computational complexity was determined by measuring the average runtime for processing an input phoneme string and then retrieving the keywords using a standard server. The relative decrease in runtime was measured rather than absolute figures (which may change depending on the server used).

Ami Moyal, Vered Aharonson, Ella Tetariy, Michal Gishri

Chapter 7. Summary

Abstract

Speech recognition technology can be used for a wide range of applications. Keyword spotting is one of the more practical implementations of speech recognition, as it does not require any understanding of the transcribed speech, nor does it necessarily demand full transcription accuracy.

Ami Moyal, Vered Aharonson, Ella Tetariy, Michal Gishri

Backmatter

Title: Phonetic Search Methods for Large Speech Databases
Authors: Ami Moyal, Vered Aharonson, Ella Tetariy, Michal Gishri
Copyright year: 2013
Publisher: Springer New York
Electronic ISBN: 978-1-4614-6489-1
Print ISBN: 978-1-4614-6488-4
DOI: https://doi.org/10.1007/978-1-4614-6489-1