ABSTRACT
Audio rendering of impact sounds, such as those caused by falling objects or explosion debris, adds realism to interactive 3D audiovisual applications and can be convincingly achieved using modal sound synthesis. Unfortunately, mode-based computations can become prohibitively expensive when many objects, each with many modes, are impacted simultaneously. We introduce a fast sound synthesis approach, based on short-time Fourier transforms, that exploits the inherent sparsity of modal sounds in the frequency domain. For our test scenes, this "fast mode summation" gives speedups of 5 to 8 times compared to a time-domain solution, at the cost of a slight degradation in quality. We discuss different reconstruction windows, which affect the quality of impact sound "attacks". Our Fourier-domain processing method allows us to introduce a scalable, real-time audio processing pipeline for both recorded and modal sounds, with auditory masking and sound source clustering. To avoid abrupt computation peaks, such as during the simultaneous impacts of an explosion, we use crossmodal perception results on audiovisual synchrony to schedule the processing over time. We also conducted a pilot perceptual user evaluation of our method. Our implementation results show that we can treat complex audiovisual scenes in real time with high quality.
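The sparsity claim in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the common textbook model in which an impact sound is a sum of exponentially damped sinusoids, uses three made-up modes (frequency, damping, amplitude), and replaces a real FFT with a naive DFT over a single Hann-windowed frame so that it needs only the Python standard library. All function names and parameter values are hypothetical.

```python
import cmath
import math


def modal_impact(modes, sample_rate, n_samples):
    """Time-domain modal synthesis: sum of exponentially damped sinusoids.

    modes is a list of (frequency_hz, damping_per_second, amplitude) tuples.
    The cost grows with (number of modes) x (number of samples).
    """
    out = [0.0] * n_samples
    for freq, damping, amp in modes:
        omega = 2.0 * math.pi * freq
        for n in range(n_samples):
            t = n / sample_rate
            out[n] += amp * math.exp(-damping * t) * math.sin(omega * t)
    return out


def frame_bin_energies(frame):
    """Energy per frequency bin (0..N/2) of one Hann-windowed frame.

    A naive O(N^2) DFT is used here for self-containment; a real system
    would use an FFT.
    """
    n = len(frame)
    windowed = [
        x * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / n))
        for i, x in enumerate(frame)
    ]
    energies = []
    for k in range(n // 2 + 1):
        acc = 0j
        for i, x in enumerate(windowed):
            acc += x * cmath.exp(-2j * math.pi * k * i / n)
        energies.append(abs(acc) ** 2)
    return energies


def bins_for_energy_fraction(energies, fraction):
    """Smallest number of bins (largest first) holding `fraction` of the energy."""
    total = sum(energies)
    running = 0.0
    for count, energy in enumerate(sorted(energies, reverse=True), start=1):
        running += energy
        if running >= fraction * total:
            return count
    return len(energies)


# Hypothetical values chosen only for illustration.
SAMPLE_RATE = 8000
FRAME_SIZE = 256
MODES = [(440.0, 6.0, 1.0), (1130.0, 9.0, 0.6), (2750.0, 14.0, 0.3)]

signal = modal_impact(MODES, SAMPLE_RATE, FRAME_SIZE)
energies = frame_bin_energies(signal)
needed = bins_for_energy_fraction(energies, 0.99)
print(f"{needed} of {len(energies)} bins hold 99% of the frame energy")
```

Because each mode concentrates its energy in a handful of bins around its frequency, a frequency-domain synthesizer only has to write those few bins per mode and frame, instead of evaluating every mode at every sample as `modal_impact` does.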
Index Terms
- Fast modal sounds with scalable frequency-domain synthesis