
2012 | Book

Forensic Speaker Recognition

Law Enforcement and Counter-Terrorism

Edited by: Amy Neustein, Hemant A. Patil

Publisher: Springer New York


About this book

Forensic Speaker Recognition: Law Enforcement and Counter-Terrorism is an anthology of the research findings of 35 speaker recognition experts from around the world. The volume provides a multidimensional view of the complex science involved in determining whether a suspect’s voice truly matches forensic speech samples, collected by law enforcement and counter-terrorism agencies, that are associated with the commission of a terrorist act or other crimes. While addressing such topics as the challenges of forensic case work, handling speech signal degradation, analyzing features of speaker recognition to optimize voice verification system performance, and designing voice applications that meet the practical needs of law enforcement and counter-terrorism agencies, this material all sounds a common theme: how the rigors of forensic utility are demanding new levels of excellence in all aspects of speaker recognition. The contributors are among the most eminent scientists in speech engineering and signal processing; and their work represents such diverse countries as Switzerland, Sweden, Italy, France, Japan, India and the United States.

Forensic Speaker Recognition is a useful book for forensic speech scientists, speech signal processing experts, speech system developers, criminal prosecutors and counter-terrorism intelligence officers and agents.

Table of Contents

Frontmatter

Forensic Case Work

Frontmatter
Chapter 1. Historical and Procedural Overview of Forensic Speaker Recognition as a Science
Abstract
Forensic phonetics and acoustics are now widely used in police and legal work involving acoustic samples. Among the many tasks in this area, forensic speaker recognition is considered one of the most complex. Forensic speaker recognition, sometimes called forensic speaker comparison, is the process of judging whether or not two speech samples come from the same speaker. This chapter introduces the historical background of forensic speaker recognition, including the “voiceprint” controversy, human-based visual and auditory forensic speaker recognition, and automatic forensic speaker recognition. Procedural considerations in forensic speaker recognition and factors that affect recognition performance are also presented. Finally, we summarize the progress and developments made in forensic automatic speaker recognition.
Kanae Amino, Takashi Osanai, Toshiaki Kamada, Hisanori Makinae, Takayuki Arai
Chapter 2. Automatic Speaker Recognition for Forensic Case Assessment and Interpretation
Abstract
Forensic speaker recognition (FSR) is the process of determining if a specific individual (suspected speaker) is the source of a questioned voice recording (trace). The forensic expert’s role is to testify to the worth of the voice evidence by using, if possible, a quantitative measure of this worth. It is up to the judge and/or the jury to use this information as an aid to their deliberations and decision. This chapter aims at presenting research advances in forensic automatic speaker recognition (FASR), including data-driven tools and related methodology, that provide a coherent way of quantifying and presenting recorded voice as biometric evidence, as well as the assessment of its strength (likelihood ratio) in the Bayesian interpretation framework, compatible with interpretations in other forensic disciplines. Step-by-step guidelines for the calculation of the biometric evidence and its strength under operating conditions of the casework are provided in this chapter. It also reports on the European Network of Forensic Science Institutes (ENFSI) evaluation campaign through a fake (simulated) case, organized by the Netherlands Forensic Institute (NFI), as an example, where an automatic method using the Gaussian mixture models (GMMs) and the Bayesian interpretation (BI) framework were implemented for the forensic speaker recognition task.
Andrzej Drygajlo
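To make the likelihood-ratio idea in the Chapter 2 abstract concrete, here is a minimal sketch. It assumes the comparison score has already been computed, and it models the same-speaker and different-speaker score distributions as single Gaussians. That modelling choice, the function names and all numeric parameters are illustrative assumptions, not the chapter's GMM-based procedure.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def likelihood_ratio(score, same_mean, same_std, diff_mean, diff_std):
    """LR = p(score | same speaker) / p(score | different speakers).

    The two Gaussians stand in for the within-source and between-source
    score distributions (a simplifying assumption for illustration).
    """
    numerator = gaussian_pdf(score, same_mean, same_std)
    denominator = gaussian_pdf(score, diff_mean, diff_std)
    return numerator / denominator


# Hypothetical score distributions and a hypothetical evidence score.
lr = likelihood_ratio(score=2.0, same_mean=3.0, same_std=1.0,
                      diff_mean=0.0, diff_std=1.0)
print(f"likelihood ratio: {lr:.3f}")
```

An LR above 1 supports the same-speaker hypothesis and an LR below 1 supports the different-speaker hypothesis; weighing it against prior information is left to the court.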
Chapter 3. Aural/Acoustic vs. Automatic Methods in Forensic Phonetic Case Work
Abstract
This chapter focuses on speech analysis in a forensic context. Both so-called aural/acoustic approaches and automatic methods are considered, and their application in a forensic context is described. Forensic casework introduces many challenges not found in the laboratory settings where the applied methods were originally developed. Forensic phonetic case work may involve many different types of tasks, such as voice comparison, speaker profiling, phonetic transcription and enhancement of poor-quality recordings, but in the present chapter only voice comparison is described in any detail. The purpose of forensic speech science is to produce evidence that can be used in court. This introduces other types of challenges, such as the choice of presentation format. The pros and cons of the traditional approach using verbal scales vs. a likelihood ratio approach are considered and put in the context of a current debate within forensics in general about the presentation of evidence.
Anders Eriksson
Chapter 4. Speaker Profiling: The Study of Acoustic Characteristics Based on Phonetic Features of Hindi Dialects for Forensic Speaker Identification
Abstract
Personal identification has proved to be one of the important aspects of forensic science, in which biometric features are evaluated for information related to individuals. Biometric identification techniques such as handwriting characteristics, speaker identification, fingerprints and DNA profiling are significant in forensic science for solving specific types of criminal cases. Another aspect of identifying individuals through voice is speaker profiling, in which information about speakers is extracted from the recorded speech material. One of the speaker profile characteristics is the dialectal accent feature, which can establish the speaker’s identity through his dialect. Khariboli, Bundeli, Kannauji, Haryanvi, Chattisgarhi, Marwari and Bhojpuri dialects are chosen from different parts of the Hindi-speaking belt for this study. Khariboli is considered the basic language due to its close approximation to standard Hindi; it is also the dialect that forms the basis of modern standard Hindi. The prepared texts are transliterated using the same script (Devnagri), but different vocabularies are used within the dialect group. Speakers have been selected from a uniform area of the relevant regional dialect. 15 male and 15 female speakers were selected from each dialectal region according to the selection criteria, giving a total of 210 speakers. Vowel quality and quantity of dialect speakers have been measured with the help of formant frequencies and vowel length and compared with Khariboli speakers. Intonation and tone have also been observed and compared. Bhojpuri, Chattisgarhi, Kannauji, Marwari, Khariboli, Bundeli and Haryanvi dialects are found unique for characterization in terms of vowel quality and vowel duration when compared with Khariboli. Acoustic features associated with lexical tone and sentence intonation are also found to be unique and useful for dialect identification in speaker profiling.
Our results illustrate that the vowel quality, quantity, intonation and tone of a speaker, as compared to Khariboli (standard Hindi), could be potential features for identifying dialectal accent.
Manisha Kulshreshtha, C. P. Singh, R. M. Sharma

Speech Signal Degradation: Managing Problematic Conditions Affecting Probative Speech Samples

Frontmatter
Chapter 5. Speech Under Stress and Lombard Effect: Impact and Solutions for Forensic Speaker Recognition
Abstract
In the field of voice forensics, the ability to perform effective speaker recognition from input audio streams is an important task. However, in many situations, individuals will change the manner in which they produce their speech due to the environment (i.e., Lombard Effect), their speaker state (i.e., emotion, cognitive stress), and secondary tasks (i.e., task stress at hand, physical and/or cognitive). Automatic recognition schemes for both speech and speaker ID are impacted by the variability introduced in these conditions. Extensive research in the field of speech under stress has been performed for speech recognition, primarily for low-vocabulary isolated-word recognition. However, limited formal research has been performed for speaker ID/verification, primarily due to the lack of effective corpora in the field. This chapter addresses speech under stress, including Lombard effect, for the purposes of speaker recognition. Domains where stress/variability occur (Lombard Effect, Physical Stress, Cognitive Stress) will first be considered. Next, to perform effective speaker recognition it is necessary to detect if a subject is under stress, which is a useful trait in and of itself for voice forensics and biometrics, and therefore we consider prior research on the detection of speech under stress. Next, the impact of stress on speaker recognition is considered, and finally we address ways to improve speaker recognition in these domains (TEO features, alternative sensors, classification schemes, etc.). While speech under stress has been considered, the domain of speaker recognition represents an emerging research aspect which deserves further investigation.
John H. L. Hansen, Abhijeet Sangwan, Wooil Kim
Chapter 6. Speaker Identification over Narrowband VoIP Networks
Abstract
Automatic Speaker Recognition (ASR) has been an active area of research for the past four decades, with speech collected mostly in research laboratory environments. However, due to growing applications and possible misuses of Voice over Internet Protocol (VoIP) networks, there is a need to employ robust ASR systems over VoIP networks, especially within the context of internet security and law enforcement activities. There is, however, little systematic study of the effects of several VoIP artifacts (such as speech codec, packet loss, packet reordering, network jitter and foreign cross-talk or echo) on the performance of an ASR system. This chapter investigates each of these VoIP issues individually and relates it to the performance of the ASR system. In this chapter, a narrowband 2.4 kbps mixed-excitation linear prediction (MELP) codec is used over a VoIP network.
Hemant A. Patil, Aaron E. Cohen, Keshab K. Parhi
Chapter 7. Noise Robust Speaker Identification: Using Nonlinear Modeling Techniques
Abstract
Session variability is one of the main challenges in forensic speaker identification. This variability, in terms of mismatched environments, seriously degrades identification performance. In order to address the problem of environment mismatch due to noise, different types of robust features are discussed in this chapter. In state-of-the-art features, the speech production system is modeled as a linear source-filter model. However, this modeling technique neglects some nonlinear aspects of speech production, which carry some speaker-specific information. Furthermore, the state-of-the-art features are based on either the speech production mechanism or the speech perception mechanism. To overcome such limitations of existing features, features derived using non-linear modeling techniques are proposed in the chapter. The proposed features, Teager energy operator based cepstral coefficients (TEOCC) and amplitude-frequency modulation (AM-FM) based ‘Q’ features, show significant improvement in speaker identification rate in mismatched environments. The performance of these features is evaluated for different types of noise signals in the NOISEX-92 database with clean training and noisy testing environments. The speaker identification rate achieved is 57% using TEOCC features and 97% using AM-FM based ‘Q’ features for 0 dB SNR, compared to 25.5% using MFCC features, when the signal is corrupted by car engine noise. It is shown that, with the proposed features, speaker identification accuracy can be increased in the presence of noise, without any additional pre-processing of the signal to remove noise.
Raghunath S. Holambe, Mangesh S. Deshpande
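The Teager energy operator named in the Chapter 7 abstract has a standard discrete form, psi[n] = x[n]^2 - x[n-1] * x[n+1]. The sketch below shows only that operator applied to a synthetic tone; the function name, sampling rate and tone parameters are assumptions for illustration, and the further steps that turn this into TEOCC features are not described in the abstract and are not attempted here.

```python
import math


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returned for interior samples only, so the output has len(x) - 2 values.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Hypothetical test signal: a 100 Hz tone sampled at 8 kHz.
amplitude = 0.5
omega = 2.0 * math.pi * 100.0 / 8000.0  # digital frequency in rad/sample
tone = [amplitude * math.cos(omega * n) for n in range(64)]

energy = teager_energy(tone)

# For a pure sinusoid the operator is constant: A^2 * sin^2(omega).
expected = (amplitude * math.sin(omega)) ** 2
print(energy[0], expected)
```

Because the output depends on both amplitude and frequency, the operator tracks modulation in the signal, which is the nonlinear information the abstract says linear source-filter features neglect.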
Chapter 8. Robust Speaker Recognition in Noisy Environments: Using Dynamics of Speaker-Specific Prosody
Abstract
In this chapter, we propose speaker-specific prosodic features for improving the performance of speaker recognition in noisy environments. This approach can be especially useful in the forensic analysis of speech. Degradation in speaker recognition is commonly observed due to transmission and channel impairments, microphone variability and background noise. In this work, spectral features are used to perform speaker recognition in the first stage, and dynamic aspects of speaker-specific prosody are used to improve the performance in the second stage. For this task, a speech corpus was collected at the Indian Institute of Technology, Kharagpur, using 50 speakers recorded over the mobile phone. Background noise is simulated using additive white random noise from the Noisex database. Speech enhancement techniques are used to improve the speaker recognition performance in the case of noisy speech. Gaussian mixture models (GMMs) and support vector machines (SVMs) are used for developing speaker models. The performance of the speaker recognition system is observed to be 55% and 66% using prosodic and spectral features respectively, for TIMIT speech at 15 dB SNR. A speaker recognition performance of around 73% is achieved using the combination of spectral and prosodic features for noisy speech after speech enhancement.
Shashidhar G. Koolagudi, K. Sreenivasa Rao, Ramu Reddy, Vuppala Anil Kumar, Saswat Chakrabarti
Chapter 9. Characterization of Noise Associated with Forensic Speech Samples
Abstract
For speech enhancement, different methods have been developed in the past decades. This study characterizes the various noises associated with forensic speech samples and classifies them in order to find a specific set of filtering techniques for speech recognition and speaker identification. The noisy speech samples were collected from exhibits received for case examination in the laboratory. The experiment is performed in a two-fold way: enhancing the speech for (i) speech recognition and (ii) speaker identification. For speech recognition, the original and simulated samples are subjected to various filtering techniques, namely, FFT filter, noise reduction, noise gate, notch filter, bandpass, Butterworth filter, digital equalizer and parametric equalizer. For speaker identification, noise reduction, noise gate, notch filter, bandpass and Butterworth filter are applied to the noisy speech samples. Characterization of the noise embedded in the noisy speech samples was attained by applying these filtering techniques and subsequently analyzing the results using the Computerized Speech Laboratory (CSL). For speech recognition, maximum SNR improvement was achieved by the FFT filter on samples Noisy Speech-I (Direct Recording), Noisy Speech-II (Telephonic Landline Recording) and Noisy Speech-III (Mobile Phone Recording). The corresponding improvements in SNR for original and simulated samples were 3.81, 7.57, 5.62 dB and 4.39, 6.26, 5.57 dB respectively. The FFT filter, when applied to Noisy Speech-I, Noisy Speech-II and Noisy Speech-III, gave an improvement of 75, 71 and 48% for the original noisy speech samples and 82, 78 and 52% for the simulated noisy speech samples.
For speaker identification, maximum improvement was achieved by the noise reduction filter, which, when applied to Noisy Speech-I, Noisy Speech-II and Noisy Speech-III, gave an improvement of 60, 64 and 52% for the original noisy speech samples and 64, 70 and 54% for the simulated noisy speech samples. A statistical study of the improved original and simulated noisy speech samples after filtering revealed the degree of efficiency of the different filters for speaker identification and how far they can be depended on in adverse forensic contexts. For speech recognition, the efficiency of the filters in enhancing the speech signal is found to be, in descending order: FFT filter, noise reduction, noise gate, notch filter, bandpass, Butterworth filter, digital equalizer and parametric equalizer. For speaker identification, the efficiency of the filters in enhancing the speech signal is found to be, in descending order: noise reduction, noise gate, notch filter, bandpass and Butterworth filter.
Jiju P. V., C. P. Singh, R. M. Sharma
Chapter 10. Speech Processing for Robust Speaker Recognition: Analysis and Advancements for Whispered Speech
Abstract
In the field of voice forensics, the ability to perform effective speaker recognition from input audio streams is an important task. However, in many situations, individuals may prefer to lower their risk of being heard in public settings by whispering during communications. It is in precisely these conditions that speaker recognition should remain effective. Limited formal research has been performed in this domain to date. Whisper is an alternative speech production mode used by subjects in public conversation to protect content privacy or identity. Due to the profound differences between whisper and neutral speech in terms of spectral structure, the performance of speaker identification systems trained with neutral speech degrades significantly. In this chapter, studies that address acoustic analysis of whisper will be reviewed. Next, an effective data collection procedure for both spontaneous and read whisper speech will be introduced. An algorithm for whisper speech detection, which is a crucial front-end for whisper speech processing algorithms, will be presented. Finally, a seamless neutral/whisper mismatched closed-set speaker recognition system will be introduced. In the evaluation, a traditional MFCC-GMM system is employed as the baseline speaker ID system. An analysis of both speaker and phoneme variability in speaker ID performance using neutral-trained GMMs is provided, which forms the basis for the final combined whisper-based speaker ID system that is presented. Experimental results are also provided, followed by directions for future work.
John H. L. Hansen, Chi Zhang, Xing Fan

Methods and Strategies: Analyzing Features of Speaker Recognition to Optimize Voice Verification System Performance in Legal Settings

Frontmatter
Chapter 11. Effects of the Phonological Contents and Transmission Channels on Forensic Speaker Recognition
Abstract
This chapter introduces experiments on speaker recognition where we focus on two of the factors that affect speaker recognition accuracy: phonological contents of the speech materials used for identifying speakers and the transmission channel difference in automatic speaker verification. Through the experiments, we show that nasal sounds are effective for forensic speaker recognition despite the differences in speaker sets and recording channels. Also we show that performance degradation by the channel difference, in this study air- and bone-conduction, can be improved by devising normalisation methods and acoustic parameters.
Kanae Amino, Takashi Osanai, Toshiaki Kamada, Hisanori Makinae, Takayuki Arai
Chapter 12. Aerodynamic and Acoustic Theory of Voice Production
Abstract
A theory of voice production for vowels has to deal with two related problems: the problem of biomechanical modeling of vocal fold vibrations and the problem of calculating volume-velocity airflow through the glottis, or the glottal airflow. This report is a tutorial on the second problem. We call this the aerodynamic and acoustic theory of voice production. Calculation of glottal airflow is difficult since it depends on an interaction between (1) the nonlinear time-varying glottal impedance specified in the time domain and (2) the subglottal and vocal tract input impedances specified in the frequency domain. The effect of glottal geometry on the glottal impedance and the role of glottal impedance elements like kinetic resistance, viscous resistance and glottal inductance in determining glottal airflow are discussed. Methods to calculate vocal tract or subglottal input impedance based on a transmission line analog model and a formant network model are presented. Equations to find glottal airflow with source-filter interaction are derived. A digital pole-zero modeling of input impedance is proposed for an efficient and accurate computation of glottal airflow. The role of various factors in determining the so-called residue, ripple and superposition components of glottal airflow is discussed with examples. The time domain response of a vowel is calculated using the glottal airflow with source-filter interaction. The instantaneous frequency and instantaneous bandwidth of an interactive vowel response are computed and interpreted. Further research is needed to extend the theory to the case of breathy vowels, vowel onsets, and consonant-to-vowel and vowel-to-consonant transitions, where the acoustic waves are superposed on a large, dynamically changing mean airflow. A good understanding of the theory guides one in appropriate modeling and interpretation of the voice source.
The relevant features in voice source for a specific application such as forensic speaker identification can thus be identified. The author believes that habitually formed relative dynamic variations in voice source parameters are of greater significance in forensic speaker recognition.
T. V. Ananthapadmanabha
Chapter 13. Prosodic Features for Speaker Recognition
Abstract
In this chapter the effectiveness of syllable-based prosodic features for speaker recognition is discussed. The term prosody represents a collection of characteristics such as intonation, stress and timing, primarily expressed using variations in pitch, energy and duration at various levels of speech. Prosody reflects the learned/acquired speaking habits of a person and hence contributes to speaker recognition. Because prosodic features are less affected by channel mismatch and noise, they are particularly well suited for speaker forensics, a field that demands accurate identification of suspects with as few mitigating conditions as possible. In this chapter, the author describes a method for extracting prosodic features directly from the speech signal. Applying this method, speech is segmented into syllable-like regions using vowel onset points (VOP). The locations of VOPs serve as a reference for extraction and representation of prosodic features. The effectiveness of the prosodic features for speaker recognition is demonstrated on the extended task of the NIST 2003 speaker recognition evaluation. Combining evidence from spectral features with that of the proposed prosodic features helps to improve overall speaker recognition accuracy.
Leena Mary
Chapter 14. Speaker Identification Using Intermediate Matching Kernel-Based Support Vector Machines
Abstract
Gaussian mixture model (GMM) based approaches have been commonly used for speaker recognition tasks. Methods for estimation of parameters of GMMs include the expectation-maximization method which is a non-discriminative learning based method and the large margin method which is a discriminative learning based method. Discriminative classifier based approaches to speaker recognition include support vector machine (SVM) based classifiers using dynamic kernels such as generalized linear discriminant sequence kernel, probabilistic sequence kernel, GMM supervector kernel and Bhattacharyya distance based kernel. Recently, the intermediate matching kernel (IMK) has been proposed as a dynamic kernel for recognition of objects in an image represented using a set of local feature vectors. The IMK-based SVMs give a better performance than the state-of-the-art GMM-based approaches for speaker identification tasks, because they are well suited for meeting the basic challenge of providing reliable scores of intra-speaker variation of suspects and scores of inter-speaker variation of the potential population which is crucial to law enforcement and counter terrorism agencies in evaluating the strength of the evidence at hand. Thus, the IMK-based SVMs can be used to build the speaker recognition models in the FSR (forensic speaker recognition) systems. However, it is necessary to develop techniques to determine the strength of evidence from the outputs of SVM-based models. The SVM-based models are trained using discriminative methods and their generalization ability is good. We propose to use the IMK-based SVM classifier for speaker identification from the speech signal of an utterance represented as a set of local feature vectors. The main issue in building the IMK-based SVM classifier is selection of the virtual feature vectors using which the local feature vectors from the representations of two different utterances are matched. 
We explore the use of the components of a universal background GMM as the set of virtual feature vectors. We compare the performance of the GMM-based approaches and the dynamic kernel SVM-based approaches to speaker identification. The 2002 and 2003 NIST speaker recognition corpora are used to evaluate the different approaches to speaker identification. Results of our studies show that the dynamic kernel SVM-based approaches give a significantly better performance than the GMM-based approaches. For the speaker identification task, the IMK-based SVM gives a performance that is comparable to that of SVMs using any of the other dynamic kernels. The storage requirements and the computational complexity of the IMK-based SVMs are lower than those of SVMs using any of the other dynamic kernels.
A. D. Dileep, C. Chandra Sekhar

Applications to Law Enforcement and Counter-Terrorism

Frontmatter
Chapter 15. Speaker Spotting: Automatic Telephony Surveillance for Homeland Security
Abstract
Automating telephony surveillance is an appealing and appropriate technology from the viewpoint of being able to detect/spot if a person from a specific watch-list is on line. Such an automatic solution is of considerable interest in the context of homeland security, where a potentially large number of wiretapped conversations may have to be processed in parallel, in different deployment scenarios and demographic conditions, and with typically large watch-lists, all of which make manual lawful interception unmanageable, tedious and perhaps even impossible. In this chapter, we first introduce this problem domain starting with a sketch of a glamorous fictitious example, followed by an outline of lawful interception and wiretapping; we then take a brief look at a similar watch-list based negative recognition application using the now very successful Iris biometrics and consider equivalent scenarios in the context of speaker-spotting based on voice as a biometric. Further, in the main body of this chapter, we first provide the basic framework for watch-list based speaker-spotting, namely, open-set speaker identification, subsequently refined into a ‘multi-target detection’ framework. We then examine in some detail the main theoretical analysis available within the framework of multi-target identification, leading to performance predictions of such systems with respect to the watch-list size as the critical factor. On a related note, we also briefly touch on the prioritization mode of operation, which also lends itself to interesting theoretical analysis and performance predictions.
Speaker-spotting systems face unique challenges, in a way combining the difficulties inherent in conventional speaker authentication applications as well as forensic speaker recognition applications; we consider these, while using the NIST SRE evaluation results to gain insights on the performances achievable presently and the latent performance limitations which seem to warrant a cautionary approach before widespread use of speaker recognition technology for surveillance applications becomes possible. In the later part of the chapter, we outline related topics such as speaker change detection, speaker segmentation and speaker diarization, followed by a summary of product level solutions currently available in the context of surveillance and homeland security applications, finally concluding with discussions highlighting the state-of-the-art and potential future research directions.
V. Ramasubramanian
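The Chapter 15 abstract identifies watch-list size as the critical factor in speaker-spotting performance. The sketch below illustrates why under one simplifying assumption that is not taken from the chapter: each of the N target detectors raises a false alarm independently with the same per-target probability. The function name and the numbers are hypothetical.

```python
def watchlist_false_alarm(p_fa_single, watchlist_size):
    """Probability that at least one of N independent detectors false-alarms.

    Assumes identical, independent per-target false-alarm probabilities,
    which is an idealization used only for illustration.
    """
    return 1.0 - (1.0 - p_fa_single) ** watchlist_size


# Hypothetical per-target false-alarm rate of 1%.
for size in (1, 10, 100, 1000):
    rate = watchlist_false_alarm(0.01, size)
    print(f"watch-list size {size:4d}: overall false-alarm rate {rate:.3f}")
```

Under this idealization a detector that looks acceptable for a single target becomes unreliable as the list grows, which is consistent with the cautionary tone of the abstract.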
Chapter 16. Helping the Forensic Research Institute of the French Gendarmerie to Identify a Suspect in the Presence of Voice Disguise or Voice Forgery
Abstract
In the field of forensic speaker recognition, the question of voice disguise is of specific interest. Most criminals try to disguise their voice before making a malicious call or a terrorist threat. Their aim is to change the register of their voice quality in order to falsify their identity (voice disguise) or to mimic the voice of another person (voice forgery). This chapter analyses two different kinds of disguise: the first is the transformation of the voice by non-electronic and deliberate means; the second is the conversion of the voice by electronic and deliberate means. Considering both kinds of disguise (electronic and non-electronic), our analyses of voice transformation are based on an acoustic approach, which we use to measure specific changes in speech, and on an automatic approach to detect voice disguise. Four kinds of disguise considered the most common are studied: high-pitched voice, low-pitched voice, a hand over the mouth and pinched nostrils. A constraint of audibility and intelligibility has been imposed on the speakers who recorded the database. The acoustic analysis of specific features reveals some differences according to the form of disguise, while in the automatic experiment we found that the best way to detect a voice disguise is to use the support vector machine (SVM) technique. The detection performance reaches an AUC (area under the curve) of 0.79. Voice conversion techniques are also proposed and applied in two forensic scenarios: first, the imitation of a politician from an Internet recording; and second, the application of voice disguise reversibility. Different kinds of tests, based on objective and subjective measurements, are proposed to evaluate the relevance of the results. The best conversion is obtained from a GMM-ALISP voice conversion.
Patrick Perrot, Gérard Chollet
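The Chapter 16 abstract reports detection performance as an AUC. As a reminder of what that number measures, the sketch below computes AUC as the fraction of (disguised, normal) score pairs that the detector ranks correctly, with ties counted as half. The scores are invented for illustration and have no connection to the chapter's SVM outputs or its 0.79 result.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical detector scores (higher means "more likely disguised").
disguised = [0.9, 0.7, 0.4]
normal = [0.8, 0.3, 0.2]
print(f"AUC = {auc(disguised, normal):.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a value of 0.79 indicates useful but imperfect detection.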
Chapter 17. Applying Lessons Learned from Commercial Voice Biometric Deployments to Forensic Investigations
Abstract
Commercial deployments of voice biometrics have predictably focused primarily on automating the correct acceptance of true users for telephony self-service. However, over the past few years, a trend has developed within the financial institutions to begin using voice biometric technology to look for duplicate enrollments or to investigate suspicious transaction activity. This trend opens the discussion of bringing relevant techniques and experiences from commercial voice biometric deployments into the forensic voice biometric space. This chapter explores those techniques that show promise for crossing over from commercial to forensic use.
Chuck Buffum
Chapter 18. Designing Better Speaker Verification Systems: Bridging the Gap between Creators and Implementers of Investigatory Voice Biometric Technologies
Abstract
Though automated and semi-automated speech analysis and identification technologies have massive potential within law enforcement, forensics labs and intelligence communities, adoption has been slow and sporadic. This is partly due to poor experience with previous generations of voice biometric technologies, combined with a cultural misperception, fostered by television and movies, that voice biometrics are easily “spoofable”. However, the voice biometric technology vendors have also contributed to this challenge by producing products that fail to address critical implementer challenges. There are critical problems that only voice biometrics can solve, but getting the solutions well positioned requires a deep understanding of the nature of government implementations that seems to escape the grasp of too many vendors. This chapter will explore a number of critical use cases and provide perspective on how technology creators can position their solutions to meet those needs.
Avery Glasser
Backmatter
Metadata
Title
Forensic Speaker Recognition
Edited by
Amy Neustein
Hemant A. Patil
Copyright Year
2012
Publisher
Springer New York
Electronic ISBN
978-1-4614-0263-3
Print ISBN
978-1-4614-0262-6
DOI
https://doi.org/10.1007/978-1-4614-0263-3
