2015 | OriginalPaper | Book chapter

2. Audio Acquisition, Representation and Storage

Written by: Francesco Camastra, Alessandro Vinciarelli

Published in: Machine Learning for Audio, Image and Video Analysis

Publisher: Springer London

Abstract

What the reader should know to understand this chapter:

  • Basic notions of physics.
  • Basic notions of calculus (trigonometry, logarithms, exponentials, etc.).

Footnotes
1
Since the implementation of a low-pass filter that actually stops all frequencies above a certain threshold is not possible, it is more correct to say that the effects of the aliasing problem are reduced to a level that does not disturb human perception. See [15] for a more extensive description of this issue.
 
2
There is no noticeable difference between the performance of the two companders. The A-law compander is used in Europe and other countries affiliated to the ITU (with \(A=87.56\)), while the \(\mu \)-law compander is mostly used in the USA (with \(\mu =255\)).
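
As a rough illustration of why the two companders behave so similarly, the sketch below implements the textbook \(\mu \)-law and A-law compression curves for inputs normalized to \([-1, 1]\), using the parameter values quoted in this footnote. The function names are hypothetical, and the formulas are the standard continuous companding curves, which are assumed here rather than taken from the chapter.

```python
import math

MU = 255.0   # mu-law parameter quoted in the footnote
A = 87.56    # A-law parameter quoted in the footnote


def mu_law_compress(x, mu=MU):
    """Standard mu-law curve for x in [-1, 1] (assumed textbook form)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(((1.0 + mu) ** abs(y) - 1.0) / mu, y)


def a_law_compress(x, a=A):
    """Standard A-law curve for x in [-1, 1] (assumed textbook form)."""
    ax = abs(x)
    if ax < 1.0 / a:
        # Linear segment near zero.
        y = a * ax / (1.0 + math.log(a))
    else:
        # Logarithmic segment for larger amplitudes.
        y = (1.0 + math.log(a * ax)) / (1.0 + math.log(a))
    return math.copysign(y, x)


# Largest gap between the two curves on a grid of amplitudes in [0, 1].
max_gap = max(
    abs(mu_law_compress(i / 1000.0) - a_law_compress(i / 1000.0))
    for i in range(1001)
)
```

Printing `max_gap` gives a quick sense of how close the two curves are over the whole amplitude range; both curves are odd functions and map 1 to 1.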
 
3
The results can be found on www.apple.com/quicktime/technologies/aac/.
 
4
The advantages of this property are particularly evident in the frequency domain. In fact, the Fourier transform of a convolution between two signals is the product of the Fourier transforms of the individual signals, and this significantly simplifies the analysis of the effect of a system in the frequency domain.
 
References
1. Methods for the subjective assessment of small impairments in audio systems including multichannel sound systems. Technical report, International Telecommunication Union, 1997.
2. L.L. Beranek. Concert hall acoustics. The Journal of the Acoustical Society of America, 92(1):1–39, 1992.
3. D.T. Blackstock. Fundamentals of Physical Acoustics. John Wiley and Sons, 2000.
4. J. Bormans, J. Gelissen, and A. Perkis. MPEG-21: The 21st century multimedia framework. IEEE Signal Processing Magazine, 20(2):53–62, 2003.
5. M. Bosi and R.E. Goldberg. Introduction to Digital Audio Coding and Standards. Kluwer, 2003.
6. J.C. Brown. Determination of the meter of musical scores by autocorrelation. The Journal of the Acoustical Society of America, 94(4):1953–1957, 1993.
7. I. Burnett, R. van de Walle, K. Hill, J. Bormans, and F. Pereira. MPEG-21: Goals and achievements. IEEE MultiMedia, 10(4):60–70, 2003.
8. M.J. Carey, E.S. Parris, and H. Lloyd-Thomas. A comparison of features for speech-music discrimination. In Proceedings of the IEEE Conference on Acoustics, Speech and Signal Processing, pages 149–152, 1999.
9. J.C. Catford. Theoretical Acoustics. Oxford University Press, 2002.
10. P. Cummiskey. Adaptive quantization in differential PCM coding of speech. Bell System Technical Journal, 7:1105, 1973.
11. T.F.W. Embleton. Tutorial on sound propagation outdoors. The Journal of the Acoustical Society of America, 100(1):31–48, 1996.
12. H. Fletcher. Auditory patterns. Reviews of Modern Physics, pages 47–65, 1940.
13. A. Ghias, J. Logan, D. Chamberlin, and B.C. Smith. Query by humming: musical information retrieval in audio database. In Proceedings of the ACM Conference on Multimedia, pages 231–236, 1995.
14. A. Hanjalic and L.-Q. Xu. Affective video content representation and modeling. IEEE Transactions on Multimedia, 7(1):143–154, 2005.
15. X. Huang, A. Acero, and H.-W. Hon. Spoken Language Processing: A Guide to Theory, Algorithm and System Development. Prentice-Hall, 2001.
16. L.E. Kinsler, A.R. Frey, A.B. Coppens, and J.V. Sanders. Fundamentals of Acoustics. John Wiley and Sons, New York, 2000.
17. P. Ladefoged. Vowels and Consonants. Blackwell Publishing, 2001.
18. C.M. Lee and S.S. Narayanan. Toward detecting emotions in spoken dialogs. IEEE Transactions on Multimedia, 13(2):293–303, 2005.
19. L. Lu, H. Jiang, and H.J. Zhang. A robust audio classification and segmentation method. In Proceedings of the ACM Conference on Multimedia, pages 203–211, 2001.
20. Y.-F. Ma, X.-S. Hua, L. Lu, and H.-J. Zhang. A generic framework for user attention model and its application in video summarization. IEEE Transactions on Multimedia, 7(5):907–919, 2005.
21. J. Makhoul. Linear prediction: A tutorial review. Proceedings of the IEEE, 63(4):561–580, 1975.
22. B.S. Manjunath, P. Salembier, and T. Sikora, editors. Introduction to MPEG-7. John Wiley and Sons, Chichester, UK, 2002.
23. S.K. Mitra. Digital Signal Processing - A Computer Based Approach. McGraw-Hill, 1998.
24. B.C.J. Moore. An Introduction to the Psychology of Hearing. Academic Press, 1997.
25. P.M. Morse and K. Ingard. Theoretical Acoustics. McGraw-Hill, 1968.
26. P. Noll. Wideband speech and audio coding. IEEE Communications Magazine, (11):34–44, November 1993.
27. P. Noll. MPEG digital audio coding. IEEE Signal Processing Magazine, 14(5):59–81, 1997.
28. B.M. Oliver, J. Pierce, and C.E. Shannon. The philosophy of PCM. Proceedings of the IRE, 36:1324–1331, 1948.
29. A.V. Oppenheim and R.W. Schafer. Discrete-Time Signal Processing. Prentice-Hall, 1989.
30. T. Painter and A. Spanias. Perceptual coding of digital audio. Proceedings of the IEEE, 88(4):451–513, 2000.
31. J.O. Pickles. An Introduction to the Physiology of Hearing. Academic Press, 1988.
32. L. Rabiner. On the use of autocorrelation analysis for pitch detection. IEEE Transactions on Acoustics, Speech and Signal Processing, 25(1):24–33, 1977.
33. L.R. Rabiner and R.W. Schafer, editors. Digital Processing of Speech Signals. Prentice-Hall, 1978.
34. L.R. Rabiner and M.R. Sambur. An algorithm for determining the endpoints of isolated utterances. Bell System Technical Journal, 54(2):297–315, 1975.
35. E. Scheirer and M. Slaney. Construction and evaluation of a robust multifeature speech/music discriminator. In Proceedings of the IEEE Conference on Acoustics, Speech and Signal Processing, pages 1331–1334, 1997.
36. A. Spanias. Speech coding: A tutorial review. Proceedings of the IEEE, 82(10):1541–1582, 1994.
37. A.S. Spanias. Speech coding: A tutorial review. Proceedings of the IEEE, 82(10):1541–1582, 1994.
38. S. Sukittanon and L.E. Atlas. Modulation frequency features for audio fingerprinting. In Proceedings of the IEEE Conference on Acoustics, Speech and Signal Processing, pages 1773–1776, 2002.
39. E. Wold, T. Blum, D. Keislar, and J. Wheaten. Content-based classification, search, and retrieval of audio. IEEE MultiMedia, 3(3):27–36, 1996.
Metadata
Title
Audio Acquisition, Representation and Storage
Written by
Francesco Camastra
Alessandro Vinciarelli
Copyright year
2015
Publisher
Springer London
DOI
https://doi.org/10.1007/978-1-4471-6735-8_2