Published in: Journal on Multimodal User Interfaces 3/2019

8 November 2018 | Original Paper

Musical Vision: an interactive bio-inspired sonification tool to convert images into music

Authors: Antonio Polo, Xavier Sevillano


Abstract

Musical Vision is a highly flexible, interactive and bio-inspired sonification tool that translates color images into harmonic polyphonic music by mimicking the human visual system in terms of its field of vision and photosensitive sensors. Putting the user at the center of the sonification process, Musical Vision allows the interactive design of fully configurable mappings between the color space and the MIDI instrument and audio pitch spaces, so that the music rendering results can be tailored to the needs of the application. Moreover, Musical Vision incorporates a harmonizer capable of introducing the modifications necessary to create melodies using harmonic chords. Above all else, Musical Vision is an extremely flexible system that the user can interactively configure to convert an image into a musical piece lasting anywhere from a few seconds to several minutes. Thus, it can be used, for instance, for trans-artistic purposes such as the conversion of a painting into music, for augmenting vision with music, or for learning musical skills such as sol-fa. To evaluate the proposed sonification tool, we conducted a pilot user study in which twelve volunteers were asked to interpret images containing geometric patterns from the music rendered by Musical Vision. Results show that even users with no musical education were able to achieve nearly 70% accuracy in multiple-choice tests after less than 25 min of training. Moreover, users with some musical education were capable of accurately “drawing by ear” the images from no stimuli other than the sonifications.
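
To make the idea of a configurable color-to-pitch mapping followed by harmonization more concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that each 8-bit RGB channel of a pixel is mapped linearly onto a MIDI note range and that "harmonizing" means moving each note to the nearest tone of a C major triad. The function names, the note range (MIDI notes 48 to 84) and the choice of chord are all hypothetical.

```python
# Hypothetical illustration only: the mapping and every name below are
# assumptions, not the configuration used by Musical Vision.

C_MAJOR_TRIAD = (0, 4, 7)  # pitch classes C, E, G


def intensity_to_pitch(value, low=48, high=84):
    """Linearly map an 8-bit channel value (0-255) to a MIDI note number."""
    if not 0 <= value <= 255:
        raise ValueError("channel value must be in 0..255")
    return low + round(value * (high - low) / 255)


def snap_to_chord(note, chord=C_MAJOR_TRIAD):
    """Move a MIDI note to the nearest note whose pitch class is in the chord."""
    candidates = [
        n for n in range(note - 6, note + 7)
        if n % 12 in chord and 0 <= n <= 127
    ]
    # Ties resolve to the lower note because min() keeps the first minimum.
    return min(candidates, key=lambda n: abs(n - note))


def pixel_to_notes(rgb, harmonize=True):
    """Return one MIDI note per RGB channel for a single pixel."""
    notes = [intensity_to_pitch(c) for c in rgb]
    return [snap_to_chord(n) for n in notes] if harmonize else notes
```

With these assumed defaults, `pixel_to_notes((0, 128, 255))` returns `[48, 67, 84]`, while `pixel_to_notes((0, 128, 255), harmonize=False)` returns `[48, 66, 84]`: only the middle note is moved, since 48 and 84 already belong to the chord.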


Footnotes
1
See “Sensory substitution for the blind: a walk in the garden wearing The vOICe” at http://www.youtube.com/watch?v=8xRgfaUJkdM (last accessed October 2018).
 
2
In general terms, people are more used to describing colors as a mixture of primary colors, be it subtractive or additive (for instance, yellow equals red plus green in the additive color system). Getting users to internalize the description of colors in terms of their HSV components would instead require some training.
 
3
See Fig. 3 in Sect. 3.2.2 for details on how these correspondences are made depending on how the user configures the music rendering process.
 
4
The decision of which pixels will sound simultaneously is intimately related to the image scanning process, which is described in Sect. 3.2.2.
 
6
For the Image processing module, the following parameters are presented: visual field number, color information reduction strength, and polyphonic rules (polyphony distribution and simultaneous notes). For the Music rendering module, the following configuration parameters are provided: the image scanning pattern (referred to as Scan type) and the degree of harmonization applied to each visual field region (in %). Moreover, information about the Vector spaces module is provided in terms of the predefined chord mapping employed, together with data on the Differentiation tools applied and the tempo.
 
7
The designed test is available online at http://goo.gl/lWdgrz.
 
Metadata
Title
Musical Vision: an interactive bio-inspired sonification tool to convert images into music
Authors
Antonio Polo
Xavier Sevillano
Publication date
8 November 2018
Publisher
Springer International Publishing
Published in
Journal on Multimodal User Interfaces / Issue 3/2019
Print ISSN: 1783-7677
Electronic ISSN: 1783-8738
DOI
https://doi.org/10.1007/s12193-018-0280-4
