
2001 | Book

Advances in Network and Acoustic Echo Cancellation

Authors: Jacob Benesty, Tomas Gänsler, Dennis R. Morgan, M. Mohan Sondhi, Steven L. Gay

Publisher: Springer Berlin Heidelberg

Book series: Digital Signal Processing


About this book

For many decades, hybrid devices have been used to connect 2-wire local circuits and 4-wire long distance circuits in telephone lines. This leads to a well-known problem, whereby echoes are generated. The delay introduced by telecommunication satellites exacerbated this problem and the need for new methods of echo control soon became obvious. The best solution to date for solving this problem was invented in the 1960s at Bell Labs by Kelly, Logan, and Sondhi, and consists of identifying the echo path generated by the hybrid by means of an adaptive filter, a technique that became known as an echo canceler. The echo canceler allowed full-duplex communication which was not possible with older echo suppression techniques. Later, with the development of hands-free teleconferencing systems, another echo problem appeared; but this time the echo was due to the coupling between the loudspeaker and microphone. It is not surprising that the same solution was proposed to solve this problem, and most of today's teleconferencing systems have an acoustic echo canceler. More recently, attention has been given to the very interesting problem of multichannel acoustic echo cancellation, which leads to more exciting applications that take advantage of our binaural auditory system.

Table of contents

Frontmatter
1. An Introduction to the Problem of Echo in Speech Communication
Abstract
With rare exceptions, conversations take place in the presence of echoes. We hear echoes of our speech waves as they are reflected from the floor, walls, and other neighboring objects. If a reflected wave arrives a very short time after the direct sound, it is perceived not as an echo but as a spectral distortion, or reverberation. Most people prefer some amount of reverberation to a completely anechoic environment, and the desirable amount of reverberation depends on the application. (For example, much more reverberation is desirable in a concert hall than in an office.) The situation is very different, however, when the leading edge of the reflected wave arrives a few tens of milliseconds after the direct sound. In such a case, it is heard as a distinct echo. Such echoes are invariably annoying, and under extreme conditions can completely disrupt a conversation. It is such distinct echoes that we will be concerned with in this book.
Jacob Benesty, Tomas Gänsler, Dennis R. Morgan, M. Mohan Sondhi, Steven L. Gay
2. A Family of Robust PNLMS-Like Algorithms for Network Echo Cancellation
Abstract
In this chapter, we present a family of fast-converging algorithms that are extensions of the proportionate normalized least mean square (PNLMS) algorithm introduced by Duttweiler [38], [54]. This new family of algorithms is based on the affine projection algorithm/normalized least mean square (APA/NLMS) algorithm family [106], [100], [55], [125], [66]. What differentiates the new algorithms from the NLMS and APA algorithms is that they inherit the proportional step-size idea from the PNLMS algorithm, i.e., individually assigned step sizes for each filter coefficient, where the step sizes are calculated from the previous estimate of the echo path. Because of these individual step sizes, the algorithms achieve a higher convergence rate by using the fact that the active part of a network echo path is usually much smaller (4–8 ms) than the possible echo path range (64–128 ms) that has to be covered. A natural extension of the basic PNLMS algorithm is a proportionate affine projection algorithm (PAPA) [51]. This algorithm (family) combines the fast-converging APA with the proportional step-size technique of PNLMS.
Jacob Benesty, Tomas Gänsler, Dennis R. Morgan, M. Mohan Sondhi, Steven L. Gay
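The proportional step-size idea described in the Chapter 2 abstract can be illustrated with a minimal sketch. It follows one commonly cited form of the PNLMS update and is not necessarily the exact formulation used in the book; the function names and the parameters `mu`, `rho`, `delta_p`, and `delta` are assumptions chosen for this illustration.

```python
import random


def fir_filter(h, x):
    """Echo signal produced by passing x through the FIR echo path h."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def pnlms_identify(x, d, L, mu=0.5, rho=0.01, delta_p=0.01, delta=1e-6):
    """Estimate a length-L echo path from far-end signal x and echo d.

    All parameter names are illustrative assumptions:
    mu      overall step size
    rho     floor that keeps small coefficients adapting
    delta_p start-up constant used while all coefficients are still zero
    delta   regularization of the normalization term
    """
    h = [0.0] * L          # current echo-path estimate
    buf = [0.0] * L        # most recent input samples, newest first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        e = dn - sum(hi * xi for hi, xi in zip(h, buf))  # a priori error
        # Per-coefficient gains, proportional to the previous estimate.
        floor = rho * max(delta_p, max(abs(hi) for hi in h))
        gamma = [max(floor, abs(hi)) for hi in h]
        mean_gamma = sum(gamma) / L
        g = [gi / mean_gamma for gi in gamma]
        # NLMS-style normalization, weighted by the same gains.
        norm = sum(gi * xi * xi for gi, xi in zip(g, buf)) + delta
        h = [hi + mu * gi * xi * e / norm for hi, gi, xi in zip(h, g, buf)]
        errors.append(e)
    return h, errors


def demo():
    """Identify a sparse 32-tap path with two active coefficients."""
    random.seed(0)
    true_h = [0.0] * 32
    true_h[5], true_h[6] = 1.0, -0.5
    x = [random.gauss(0.0, 1.0) for _ in range(3000)]
    h_est, _ = pnlms_identify(x, fir_filter(true_h, x), L=32)
    return true_h, h_est
```

Calling `demo()` returns the true and estimated coefficient vectors for a noise-free, sparse echo path, which is the setting in which the individual step sizes are expected to help most.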
3. A Robust Fast Recursive Least-Squares Adaptive Algorithm
Abstract
Very often in the context of system identification, the error signal (e), which is by definition the difference between the system and model filter outputs, is assumed to be zero-mean, white, and Gaussian. In this case, the least-squares estimator is equivalent to the maximum likelihood estimator and, hence, it is asymptotically efficient. While this supposition is very convenient and extremely useful in practice, adaptive algorithms optimized on this basis may be very sensitive to minor deviations from the assumptions.
Jacob Benesty, Tomas Gänsler, Dennis R. Morgan, M. Mohan Sondhi, Steven L. Gay
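The equivalence between least squares and maximum likelihood stated in the Chapter 3 abstract can be made explicit with a short sketch. The symbols N (number of samples), \sigma^2 (error variance), and \theta (model filter coefficients) are notation introduced here for illustration and are not taken from the book.

```latex
% Assumption: e(n;\theta), n = 1,\ldots,N, is i.i.d. zero-mean Gaussian
% with variance \sigma^2; \theta collects the model filter coefficients.
\begin{align}
\ln p(e \mid \theta)
  &= -\frac{N}{2}\ln\!\left(2\pi\sigma^{2}\right)
     - \frac{1}{2\sigma^{2}} \sum_{n=1}^{N} e^{2}(n;\theta), \\
\hat{\theta}_{\mathrm{ML}}
  &= \arg\max_{\theta}\, \ln p(e \mid \theta)
   = \arg\min_{\theta} \sum_{n=1}^{N} e^{2}(n;\theta)
   = \hat{\theta}_{\mathrm{LS}}.
\end{align}
```

Only the sum of squared errors depends on \theta, so maximizing the Gaussian log-likelihood is the same as minimizing the least-squares criterion. If the error distribution has heavier tails than the Gaussian, this identity no longer holds, which is the sensitivity the abstract refers to.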
4. Dynamic Resource Allocation for Network Echo Cancellation
Abstract
Current adaptation algorithms for network echo cancelers are designed without regard to the fact that, invariably, a single canceler chip handles many conversations simultaneously. This implies that for N_c channels, the processor must handle N_c times the peak computational load of a single channel. If the number of channels is large, however, it should be possible to reduce the demands on the processor to something close to N_c times the average load. Some additional computational capacity would, of course, be necessary to take care of statistical fluctuation in the requirements, but the required safety margin becomes smaller as N_c becomes larger. (With the speed and memory now available on a chip, the number of channels can be several hundred, so the safety margin might not have to be large.) Once the problem is looked upon as that of dealing with a large number of channels, it is also possible to take advantage of other knowledge about speech patterns and characteristics of long distance circuits to further reduce the computational load. In this chapter, we show how the computational requirement can, in principle, be reduced by a very large factor, perhaps as large as thirty.
Jacob Benesty, Tomas Gänsler, Dennis R. Morgan, M. Mohan Sondhi, Steven L. Gay
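The claim in the Chapter 4 abstract that the relative safety margin shrinks as the number of channels grows can be checked with a small calculation. The sketch below assumes a deliberately simple model that is not taken from the book: each channel independently needs the processor with probability `p_active`, and the chip is sized so that the chance of more simultaneous demands than it can serve stays below `overflow_prob`. All names are hypothetical.

```python
from math import comb


def capacity_needed(n_channels, p_active, overflow_prob=1e-6):
    """Smallest k with P(more than k channels active) <= overflow_prob.

    Assumes independent channels, each active with probability p_active
    (a binomial model chosen for illustration only).
    """
    cdf = 0.0
    for k in range(n_channels + 1):
        cdf += (comb(n_channels, k)
                * p_active ** k
                * (1.0 - p_active) ** (n_channels - k))
        if 1.0 - cdf <= overflow_prob:
            return k
    return n_channels


def margin_over_average(n_channels, p_active, overflow_prob=1e-6):
    """Required capacity divided by the average load n_channels * p_active."""
    k = capacity_needed(n_channels, p_active, overflow_prob)
    return k / (n_channels * p_active)


if __name__ == "__main__":
    for n in (10, 100, 400):
        print(n, capacity_needed(n, 0.25), round(margin_over_average(n, 0.25), 2))
```

Under this model the ratio of required capacity to average load falls toward 1 as the channel count increases, while sizing for the peak would always require capacity for all channels at once.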
5. Multichannel Acoustic Echo Cancellation
Abstract
One may ask a legitimate question: why do we need multichannel sound for telecommunication? Let’s take the following example. When we are in a room with several people talking, laughing, or just communicating with each other, thanks to our binaural auditory system, we can concentrate on one particular talker (if several persons are talking at the same time), localize or identify a person who is talking, and somehow we are able to process a noisy or a reverberant speech signal in order to make it intelligible. On the other hand, with only one ear or, equivalently, if we record what happens in the room with one microphone and listen to this monophonic signal, it will likely make all of the above mentioned tasks more difficult. So, multichannel sound teleconferencing systems provide a realistic presence that mono-channel systems cannot offer.
Jacob Benesty, Tomas Gänsler, Dennis R. Morgan, M. Mohan Sondhi, Steven L. Gay
6. A Fast Normalized Cross-Correlation DTD Combined with a Robust Multichannel Fast Recursive Least-Squares Algorithm
Abstract
Ideally, acoustic echo cancelers (AECs) remove undesired echoes that result from acoustic coupling between the loudspeaker and the microphone used in full-duplex hands-free telecommunication systems. Figure 6.1 shows a diagram of a single-channel AEC. The far-end speech signal x(n) goes through the echo path represented by a filter h(n) to produce the echo, y_e(n), which is picked up by the microphone together with the near-end talker signal v(n) and ambient noise w(n). The composite microphone signal is denoted y(n).
Jacob Benesty, Tomas Gänsler, Dennis R. Morgan, M. Mohan Sondhi, Steven L. Gay
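The signal model described in the Chapter 6 abstract can be written compactly as follows. The vector notation and the filter length L are assumptions made here for illustration (a finite impulse response echo path); the abstract itself only names the signals.

```latex
% Assumption: the echo path is modeled as a length-L FIR filter,
% \mathbf{h} = [h_0, h_1, \ldots, h_{L-1}]^{T}.
\begin{align}
\mathbf{x}(n) &= [\,x(n),\; x(n-1),\; \ldots,\; x(n-L+1)\,]^{T}, \\
y_e(n) &= \mathbf{h}^{T}\mathbf{x}(n), \\
y(n) &= y_e(n) + v(n) + w(n).
\end{align}
```

In this form, the task of the canceler is to estimate \mathbf{h} from x(n) and y(n), while v(n) (double-talk) and w(n) act as disturbances to that estimate.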
7. Some Practical Aspects of Stereo Teleconferencing System Implementation
Abstract
When people with normal hearing converse in a room where many people are speaking simultaneously, their binaural hearing enables them to focus on particular talkers according to the directions from which those talkers’ voices are coming. They can do this even when the signal-to-background-noise ratio is very low, the noise in this case being the voices of those ignored. This phenomenon of human audio perception is aptly called the cocktail party effect. In monophonic teleconferencing systems, this aid to audio communication is lost across the connection. A listener on one side hears all of the talkers of the far side coming from the same direction, namely the direction of a single local loudspeaker. So, when people on the far side talk simultaneously, it is impossible for the local listener to spatially separate their voices as he or she normally would. A stereo connection would solve this problem because with stereo the local listener (when located in the “sweet spot,” the area where the stereo effect is most clearly perceived) hears a reconstruction of the left-right positioning of the sound from the far side. Until very recently though, teleconferencing systems have been limited to monophonic connections because stereo acoustic echo cancellation was problematic [121]. However, with the advent of new techniques (see Chap. 5) such systems are now quite realizable.
Jacob Benesty, Tomas Gänsler, Dennis R. Morgan, M. Mohan Sondhi, Steven L. Gay
8. General Derivation of Frequency-Domain Adaptive Filtering
Abstract
Adaptive filters [60] play an important role in echo cancellation because we need to identify and track unknown and time-varying channels [24]. There are roughly two classes of adaptive algorithms. One class includes filters that are updated in the time domain, sample-by-sample in general, like the classical least mean square (LMS) [134] and recursive least-squares (RLS) [4], [66] algorithms. The other class contains filters that are updated in the frequency domain, block-by-block in general, using the fast Fourier transform (FFT) as an intermediary step. As a result of this block processing, the arithmetic complexity of the algorithms in the latter category is significantly reduced compared to time-domain adaptive algorithms. Use of the FFT is appropriate to the Toeplitz structure, which results from the time-shift properties of the filter input signal. Consequently, deriving a frequency-domain (FD) adaptive algorithm is just a matter of rewriting the time-domain error criterion in a way that Toeplitz and circulant matrices are explicitly shown.
Jacob Benesty, Tomas Gänsler, Dennis R. Morgan, M. Mohan Sondhi, Steven L. Gay
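The link between Toeplitz structure, circulant matrices, and the Fourier transform mentioned in the Chapter 8 abstract can be illustrated with a small overlap-save sketch. It is not the book's derivation: it only shows that block filtering (a Toeplitz operation) can be embedded in a circular convolution of twice the block length, which a discrete Fourier transform turns into element-wise products. A naive O(N^2) DFT stands in for the FFT to keep the example dependency-free, and all function names are hypothetical.

```python
import cmath


def dft(v, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(v)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(v[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                for m in range(n))
        out.append(s / n if inverse else s)
    return out


def circular_convolve(c, x):
    """Circulant matrix-vector product computed in the frequency domain."""
    spectrum = [a * b for a, b in zip(dft(c), dft(x))]
    return [z.real for z in dft(spectrum, inverse=True)]


def overlap_save_block(h, prev_block, new_block):
    """Filter new_block with h, using prev_block as history.

    Assumes len(h) == len(prev_block) == len(new_block) == L.
    The first L outputs of the length-2L circular convolution are
    corrupted by wrap-around and are discarded.
    """
    L = len(h)
    padded_h = list(h) + [0.0] * L
    segment = list(prev_block) + list(new_block)
    return circular_convolve(padded_h, segment)[L:]


def direct_block(h, prev_block, new_block):
    """Reference: the same block computed sample by sample in the time domain."""
    L = len(h)
    segment = list(prev_block) + list(new_block)
    return [sum(h[k] * segment[L + i - k] for k in range(L)) for i in range(L)]
```

Both routes give the same block of output samples up to rounding error; with a real FFT, the frequency-domain route is the one whose cost grows more slowly with the block length.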
9. Multichannel Acoustic Echo and Double-Talk Handling: A Frequency-Domain Approach
Abstract
A multichannel frequency-domain adaptive algorithm was presented in Chap. 8 (see also [15]). The multichannel frequency-domain algorithm has been shown to work very well in the two-channel acoustic echo cancellation application [40]. It has a fairly low computational complexity compared to the fast recursive least-squares algorithm (FRLS) [40]. Furthermore, it is an inherently stable algorithm, well suited for a fixed-point implementation. Our objective in this chapter is to provide a complete solution, based on the multichannel frequency-domain adaptive algorithm, which handles both echo cancellation and double-talk.
Jacob Benesty, Tomas Gänsler, Dennis R. Morgan, M. Mohan Sondhi, Steven L. Gay
10. Linear Interpolation Applied to Adaptive Filtering
Abstract
While linear prediction has been successfully applied to many topics in signal processing, linear interpolation has received little attention. This chapter gives some results on linear interpolation and shows that many well-known variables or equations can be formulated in terms of linear interpolation. Also, the so-called principle of orthogonality is generalized. From this theory, we then give a generalized least mean square algorithm and a generalized affine projection algorithm.
Jacob Benesty, Tomas Gänsler, Dennis R. Morgan, M. Mohan Sondhi, Steven L. Gay
Backmatter
Metadata
Title
Advances in Network and Acoustic Echo Cancellation
Authors
Jacob Benesty
Tomas Gänsler
Dennis R. Morgan
M. Mohan Sondhi
Steven L. Gay
Copyright year
2001
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-662-04437-7
Print ISBN
978-3-642-07507-0
DOI
https://doi.org/10.1007/978-3-662-04437-7