A novel pipelined neural IIR adaptive filter for speech prediction

doi:10.1016/j.apacoust.2018.06.007

Applied Acoustics

Volume 141, 1 December 2018, Pages 64-70

https://doi.org/10.1016/j.apacoust.2018.06.007

Abstract

This paper presents a pipelined neural IIR (PNIIR) filter for nonlinear speech prediction. It inherits the usual two-layer pipelined architecture, consisting of a nonlinear subsection of cascaded modules and a linear combiner subsection, and the nonlinear and linear weights of each module are updated using a real-time learning algorithm. The PNIIR filter unites the good tracking performance of the neural IIR network with the low computational load of the pipelined architecture. Performance and complexity analyses are presented. An experimental study on speech prediction is also carried out to verify the efficiency of the proposed nonlinear filter.

Introduction

Prediction is of great importance for the modeling and coding of speech signals [1], and speech prediction has been widely used in practical applications [2], [3], [4]. Linear adaptive predictors can be applied to speech signals, but they do not perform well because speech is inherently nonlinear and nonstationary. Therefore, numerous researchers have turned to nonlinear adaptive prediction techniques and have shown their superiority over conventional linear adaptive prediction [5], [6], since they account for the nonlinear nature of speech signals. So far, the most widely used nonlinear filters for speech prediction fall mainly into two categories: neural networks (NNs) and polynomial filters (PFs).

NNs have the prime advantage of an adaptive learning ability based on optimization and statistical theories [7]. Consequently, the various NN-based adaptive nonlinear filters proposed in the literature, such as radial basis function (RBF) networks [8], recurrent neural networks (RNNs) [9], and functional link artificial NNs [10], are powerful and effective tools for nonlinear prediction. Owing to their feedback mechanism, RNNs can accurately model nonlinear dynamic systems [11] with smaller structures than nonrecursive neural networks. This characteristic has enabled them to be applied successfully to the nonlinear prediction of speech signals. However, RNNs suffer from a heavy computational burden. To address this drawback, a computationally efficient nonlinear predictor called the pipelined recurrent neural network (PRNN) was presented in [12]. In the past few years, several new systems based on neural networks [13], [14], [15] have been investigated to further enhance the nonlinear prediction ability for speech signals.

Polynomial filters also play an important role in nonlinear prediction. The Volterra filter is one of the most widely used polynomial filters because of its good nonlinear processing ability and simple implementation architecture. Moreover, the truncated Volterra filter has been successfully applied to the nonlinear prediction of speech signals [16], and its output remains linear with respect to the higher-order kernels (or impulse responses). However, the number of Volterra coefficients required to model a nonlinear system grows with the memory length and the order, which results in a heavy computational burden. To avoid this problem, researchers have pursued two directions: low-complexity adaptive algorithms [17], [18] and low-complexity filter structures [19], [20]. In particular, a nonlinear adaptive joint process filter based on a pipelined feedforward second-order Volterra architecture (JPPSOV) [21] was proposed to reduce the computational load of the Volterra adaptive filter. In recent years, several new methods (such as the Volterra-Wiener series [22], the modified normalized least mean M-estimate [23], the pipelined set-membership approach [24], and the bilinear model [25]) have been introduced into Volterra filters, and a hierarchical pipelined adaptive Volterra filter [26] based on an alternative update mechanism was designed by Pang and Zhang. Volterra filters of this kind are powerful in nonlinear speech prediction. At the same time, their computational complexity still deserves further study because of the nested pattern of the pipelined structure.

In this paper, to address the computational complexity of two typical pipelined adaptive filters (PRNN and JPPSOV), a novel pipelined nonlinear filter based on the neural infinite impulse response (IIR) filter, termed PNIIR, is presented to model the nonlinear prediction system for speech signals. The proposed filter achieves relatively low computational complexity through the modularity of the pipelined realization. Furthermore, unlike PRNN, JPPSOV and the neural IIR filter, the PNIIR filter enhances its nonlinear prediction capability by combining the outputs of the linear combiner section and the nonlinear standard IIR section.

The remainder of this paper is organized as follows. Section 2 reviews the basics of the nonlinear neural IIR adaptive filter. Section 3 introduces the novel modular network for the pipelined neural IIR filter and derives its adaptive algorithm. The stability analysis and convergence conditions are presented in Section 4. The computational complexity is analyzed in Section 5. Section 6 presents an experimental study on nonlinear speech prediction that demonstrates the effectiveness of the proposed filter by comparing it with PRNN and JPPSOV. Finally, conclusions are drawn in Section 7.

Section snippets

Basic nonlinear neural IIR adaptive filter

This section introduces the single-input, single-output nonlinear neural IIR filter [27] shown in Fig. 1, which is used to model the nonlinear prediction systems proposed in this paper. In this structure, the external filter input x(n), together with its delayed versions x(n−1), x(n−2), …, x(n−p+1) and the delayed versions of the output signal y(n−1), y(n−2), …, y(n−N+p), forms the input to the neural network. The neural
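To make the input structure described above concrete, the following is a minimal sketch of a single-neuron neural IIR filter. It only illustrates how the p external samples and the N−p fed-back outputs are stacked into one length-N input vector. The tanh activation, the single-neuron topology, the zero initial conditions, and the names `neural_iir_step`, `run_filter` and `weights` are assumptions made for illustration; the excerpt does not specify the network in Fig. 1.

```python
import numpy as np

def neural_iir_step(x_hist, y_hist, weights, bias=0.0):
    """One output sample of an illustrative single-neuron neural IIR filter.

    x_hist:  [x(n), x(n-1), ..., x(n-p+1)]   (p external inputs)
    y_hist:  [y(n-1), ..., y(n-N+p)]          (N-p fed-back outputs)
    weights: length-N weight vector (hypothetical name)
    """
    u = np.concatenate([x_hist, y_hist])   # length-N input vector
    return np.tanh(weights @ u + bias)     # tanh activation is an assumption

def run_filter(x, p, N, weights):
    """Run the filter over a signal x, assuming zero initial conditions."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        # Samples before the start of the signal are taken as zero.
        x_hist = [x[n - k] if n - k >= 0 else 0.0 for k in range(p)]
        y_hist = [y[n - k] if n - k >= 0 else 0.0 for k in range(1, N - p + 1)]
        y[n] = neural_iir_step(np.array(x_hist), np.array(y_hist), weights)
    return y
```

With p = 3 and N = 5, for example, each output depends on three input samples and the two most recent outputs, so the feedback gives the filter its infinite impulse response.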

The architecture of PNIIR filter

From the viewpoint of biological modularity, a single neural IIR filter can process signals iteratively, and several neural IIR filters can be cascaded to construct a larger filter. Following this idea, this section describes a novel modular cascade architecture called the pipelined neural infinite impulse response (PNIIR) filter, which consists of M neural IIR filters. The PNIIR filter is divided into two subsections: a nonlinear pipelined subsection consisting of standard IIR

Performance analysis

In Section 3, the gradients of J(n) with respect to the weight vectors H_i(n) and W(n) were calculated. However, the weight update Eqs. (17), (18) do not ensure stability unless a strong condition is imposed on the step sizes $η_{h}$ and $η_{w}$. A detailed stability analysis and the convergence conditions are therefore given in this section.
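Eqs. (17), (18) are not reproduced in this excerpt. As a hedged illustration only, and assuming the updates follow the usual steepest-descent form on the cost J(n), they would read:

$$H_i(n+1) = H_i(n) - \eta_h \frac{\partial J(n)}{\partial H_i(n)}, \qquad W(n+1) = W(n) - \eta_w \frac{\partial J(n)}{\partial W(n)}$$

In updates of this form, a step size that is too large makes each correction overshoot the minimum of J(n), which is why bounds on $η_{h}$ and $η_{w}$ are required for convergence.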

Because the weight update equations in the nonlinear section and the linear section have the same form, the recursive equation can be expressed as follows according to the

Complexity analysis

The computational complexities of the PRNN filter, the JPPSOV filter and the proposed PNIIR filter are analyzed in this section.

For a PRNN with a total of MN neurons, the computational requirement for processing a single sample is O(MN⁴ + 3(q + 1)) arithmetic operations, where N is the number of neurons per module and q is the number of taps in the tapped-delay-line (TDL) filter. The adaptive JPPSOV filter using the NLMS algorithm requires (M + 3)L + 3M + 2 multiplications and (M + 2)L + 3M − 1 additions,
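The operation counts quoted above can be evaluated for concrete parameter values with a short sketch such as the one below. The function names are hypothetical, L is treated simply as the integer parameter appearing in the JPPSOV formulas (this excerpt does not define it), and the PRNN expression is an order-of-growth term rather than an exact count, so the two results are not directly comparable.

```python
def jppsov_nlms_ops(M, L):
    """Per-sample (multiplications, additions) for JPPSOV with NLMS,
    using the formulas quoted in the text."""
    mults = (M + 3) * L + 3 * M + 2
    adds = (M + 2) * L + 3 * M - 1
    return mults, adds

def prnn_order_term(M, N, q):
    """Order-of-growth term M*N^4 + 3(q + 1) for PRNN.

    This is the argument of the O(.) expression, not an exact count.
    """
    return M * N**4 + 3 * (q + 1)

# Example with arbitrary illustrative values (not taken from the paper).
print(jppsov_nlms_ops(M=5, L=10))       # (97, 84)
print(prnn_order_term(M=5, N=2, q=10))  # 113
```

The N⁴ factor shows why the number of neurons per module dominates the PRNN cost, whereas the JPPSOV counts grow only linearly in M and L.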

Simulations

In this section, the performance of the proposed filter for the nonlinear adaptive prediction of speech signals is evaluated by computer simulation and compared with that of PRNN and JPPSOV in terms of prediction error and prediction gain.
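The excerpt does not state how the prediction gain is computed. A common definition in the speech prediction literature is the ratio of the signal variance to the prediction-error variance in decibels, R_p = 10 log10(σ_s² / σ_e²). The sketch below assumes that definition; the function name `prediction_gain_db` is hypothetical.

```python
import numpy as np

def prediction_gain_db(signal, error):
    """Prediction gain in dB, assuming R_p = 10*log10(var(signal) / var(error)).

    signal: the original speech samples
    error:  the prediction error (signal minus predicted signal)
    """
    signal = np.asarray(signal, dtype=float)
    error = np.asarray(error, dtype=float)
    return 10.0 * np.log10(np.var(signal) / np.var(error))
```

Under this definition a larger value means a better predictor: an error with one hundredth of the signal variance corresponds to a gain of 20 dB.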

Conclusion

In this paper, a novel pipelined architecture based on the neural IIR filter (PNIIR) is proposed for nonlinear speech prediction. Owing to the advantages of pipelined parallel operation and of the neural IIR network, the nonlinear prediction capability of the PNIIR filter is enhanced and its computational complexity is reduced. The simulation experiments have verified that the proposed PNIIR filter is effective for nonlinear speech prediction and that its performance is slightly superior

Acknowledgement

This work was partially supported by the National Science Foundation of P. R. China (Grant: 61671392).

References (30)

Chris Bibby et al.
Prediction study of factors affecting speech privacy between rooms and the effect of ventilation openings
Appl Acoust
(2013)
J. Keränen et al.
Prediction of the spatial decay of speech in open-plan offices
Appl Acoust
(2013)
Laurent Galbrun et al.
Accuracy of speech transmission index predictions based on the reverberation time and signal-to-noise ratio
Appl Acoust
(2014)
J. Zhang et al.
Pipelined robust M-estimate adaptive second-order Volterra filter against impulsive noise
Digit Signal Process
(2014)
S. Zhang et al.
Pipelined set-membership approach to adaptive Volterra filtering
Signal Process
(2016)
J. Zhang et al.
A novel adaptive bilinear filter based on pipelined architecture
Digit Signal Process
(2010)
Y. Pang et al.
A hierarchical alternative updated adaptive Volterra filter with pipelined architecture
Digit Signal Process
(2016)
Rabiner Lawrence et al.
Fundamentals of speech recognition
(1993)
B.A. Kiselman et al.
Comparative analysis of linear and nonlinear speech signals predictors
IEEE Trans Speech Audio Process
(2005)
Faúndez-Zanuy, Marcos, et al., “A Comparative Study Between Linear and Nonlinear Speech Prediction,” Biological &...
S. Haykin
Neural networks: a comprehensive foundation
(1994)
Birgmeier, Martin, “Nonlinear prediction of speech signals using radial basis function networks,” European Signal...
Wang, An Hong, et al., “A nonlinear prediction speech coding ADPCM algorithm based on RNN,” International Conference on...
H. Zhao et al.
Pipelined Chebyshev functional link artificial recurrent neural network for nonlinear adaptive filter
IEEE Trans Syst Man Cybern B Cybern
(2010)
D.P. Mandic et al.
Recurrent neural networks for prediction: learning algorithms, architectures and stability
(2001)

