Scale-Invariant Modification of COSH Distance for Measuring Speech Signal Distortions in Real-Time Mode

Savchenko, A. V.; Savchenko, V. V.

doi:10.3103/S0735272721060030

Published: 02 August 2021

Radioelectronics and Communications Systems, Volume 64, pages 300–309 (2021)

Abstract

This study considers a new measure of distortion of a speaker's speech sounds that is invariant to the gain of the speech signal in a communication channel. The properties of the measure are investigated in comparison with its closest analogues, and a number of its theoretical properties are proved. The new measure is shown to combine the advantages of the symmetric Itakura distance, namely the noise immunity it gives automatic speech processing, with those of the COSH distance, namely its sensitivity to speech signal distortions. An experiment was set up and conducted using proprietary software, and estimates of the dependence of the new measure on the signal-to-noise ratio are presented. In logarithmic representation, this dependence is shown to be close to linear. The results are intended for use in developing new, and upgrading existing, systems and technologies for digital signal processing and speech quality analysis in the presence of noise.
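
To make the gain-invariance issue concrete, the sketch below computes the classical COSH distance between two sampled power spectra and shows that a pure gain change alone produces a nonzero distance. The abstract does not give the authors' formula, so the second function is only a hypothetical illustration of one way to obtain scale invariance (minimizing the COSH distance over the unknown gain); it should not be read as the measure proposed in the paper. The function names and the sample spectra are assumptions made for this example.

```python
import math


def cosh_distance(p, q):
    """Classical COSH distance between two power spectra sampled on the
    same frequency grid: mean of cosh(ln(p/q)) - 1, which equals
    0.5 * mean(p/q + q/p) - 1."""
    if not p or len(p) != len(q):
        raise ValueError("spectra must be non-empty and of equal length")
    if any(x <= 0 for x in p) or any(x <= 0 for x in q):
        raise ValueError("power spectra must be strictly positive")
    return sum(0.5 * (a / b + b / a) for a, b in zip(p, q)) / len(p) - 1.0


def gain_minimised_cosh(p, q):
    """Hypothetical scale-invariant variant (NOT the paper's definition):
    the minimum of cosh_distance(p, g * q) over all gains g > 0.
    Setting the derivative in g to zero gives the closed form
    sqrt(mean(p/q) * mean(q/p)) - 1."""
    if not p or len(p) != len(q):
        raise ValueError("spectra must be non-empty and of equal length")
    n = len(p)
    mean_pq = sum(a / b for a, b in zip(p, q)) / n
    mean_qp = sum(b / a for a, b in zip(p, q)) / n
    return math.sqrt(mean_pq * mean_qp) - 1.0


# Made-up example spectra (assumed values, for illustration only).
reference = [4.0, 2.0, 1.0, 0.5]
distorted = [3.0, 2.5, 1.2, 0.4]
amplified = [2.0 * x for x in distorted]        # same shape, doubled gain
gain_only = [2.0 * x for x in reference]        # no distortion, gain only

print(cosh_distance(reference, gain_only))       # nonzero despite identical shape
print(cosh_distance(reference, distorted))
print(cosh_distance(reference, amplified))       # changes with the gain
print(gain_minimised_cosh(reference, distorted))
print(gain_minimised_cosh(reference, amplified)) # unchanged by the gain
```

For a pure gain factor g the classical distance equals cosh(ln g) − 1, so doubling the level of an otherwise undistorted signal already yields 0.25, whereas the gain-minimized variant returns zero in that case.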

Acknowledgments

This study was supported by a grant from the Russian Science Foundation (project no. 20-71-10010).

Author information

Authors and Affiliations

National Research University Higher School of Economics, Nizhny Novgorod, Russia
A. V. Savchenko
Nizhny Novgorod State Linguistic University, Nizhny Novgorod, Russia
V. V. Savchenko

Ethics declarations

The authors declare that they have no conflicts of interest.

This article does not contain any studies with human participants or animals performed by any of the authors.

The original Russian version of this paper was published in the journal “Izvestiya Vysshikh Uchebnykh Zavedenii. Radioelektronika,” ISSN 2307-6011 (Online), ISSN 0021-3470 (Print), and is available at http://radio.kpi.ua/article/view/S0021347021060030 with DOI: https://doi.org/10.20535/S0021347021060030

Additional information

Translated from Izvestiya Vysshikh Uchebnykh Zavedenii. Radioelektronika, No. 6, pp. 350–361, May, 2021. https://doi.org/10.20535/S0021347021060030

About this article

Cite this article

Savchenko, A.V., Savchenko, V.V. Scale-Invariant Modification of COSH Distance for Measuring Speech Signal Distortions in Real-Time Mode. Radioelectron. Commun. Syst. 64, 300–309 (2021). https://doi.org/10.3103/S0735272721060030

Received: 28 December 2020
Revised: 22 April 2021
Accepted: 16 June 2021
Published: 02 August 2021
Issue Date: June 2021
DOI: https://doi.org/10.3103/S0735272721060030
