ABSTRACT
This article describes our technical demonstration of an online speaker identification system for conversations. A laptop with an internal microphone is placed at the center of a meeting-room table. The system identifies the current speaker, independent of the spoken text or language, with a latency of about 1.5 seconds and an accuracy of about 85% (as evaluated against the NIST RT benchmark). A Java GUI shows an image of the current speaker along with a timeline of past speakers. Speakers are added to the system's database through a one-minute training procedure.
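The enroll-then-identify loop described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the abstract does not state which features or speaker models are used, so this example assumes one diagonal Gaussian per speaker over already-extracted feature vectors, a 10 ms frame hop, and a sliding window that matches the reported 1.5-second latency. The names `train_speaker`, `log_likelihood`, and `OnlineIdentifier` are hypothetical.

```python
import math
from collections import deque

FRAME_SECONDS = 0.01   # assumed 10 ms frame hop (not stated in the abstract)
WINDOW_SECONDS = 1.5   # decision latency reported in the abstract


def train_speaker(frames):
    """Fit one diagonal Gaussian (per-dimension mean and variance) to enrollment frames.

    Assumption: a single Gaussian stands in for whatever speaker model the
    real system trains during its one-minute enrollment.
    """
    n = len(frames)
    dims = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dims)]
    # Floor the variance so a constant dimension cannot cause division by zero.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dims)]
    return mean, var


def log_likelihood(frame, model):
    """Log-density of one feature frame under a diagonal Gaussian model."""
    mean, var = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, mean, var)
    )


class OnlineIdentifier:
    """Keeps per-frame scores for the last window and reports the best speaker."""

    def __init__(self, models, window_frames=round(WINDOW_SECONDS / FRAME_SECONDS)):
        self.models = models                      # speaker name -> (mean, var)
        self.window = deque(maxlen=window_frames)  # oldest frames drop out automatically

    def push(self, frame):
        """Add one frame and return the speaker with the highest windowed score."""
        self.window.append(
            {name: log_likelihood(frame, m) for name, m in self.models.items()}
        )
        totals = {name: sum(s[name] for s in self.window) for name in self.models}
        return max(totals, key=totals.get)
```

Summing log-likelihoods over a fixed window trades responsiveness for stability: a longer window smooths out noisy frames but delays the moment a speaker change is reported, which is one plausible reading of the latency figure quoted above.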
Index Terms
- Live speaker identification in conversations