Published in: Journal on Multimodal User Interfaces 3/2019

08-11-2018 | Original Paper

Musical Vision: an interactive bio-inspired sonification tool to convert images into music


Abstract

Musical Vision is a highly flexible, interactive and bio-inspired sonification tool that translates color images into harmonic polyphonic music by mimicking the human visual system in terms of its field of vision and photosensitive sensors. Putting the user at the center of the sonification process, Musical Vision allows the interactive design of fully configurable mappings between the color space and the MIDI instrument and pitch spaces, so that the music rendering can be tailored to the needs of the application. Moreover, Musical Vision incorporates a harmonizer capable of introducing the modifications necessary to create melodies built on harmonic chords. Above all, Musical Vision is an extremely flexible system that the user can interactively configure to convert an image into a musical piece lasting from a few seconds to several minutes. It can therefore be used, for instance, for trans-artistic purposes such as converting a painting into music, for augmenting vision with music, or for learning musical skills such as sol-fa. To evaluate the proposed sonification tool, we conducted a pilot user study in which twelve volunteers were asked to interpret images containing geometric patterns from music rendered by Musical Vision. Results show that even users with no musical education were able to achieve nearly 70% accuracy in multiple-choice tests after less than 25 min of training. Moreover, users with some musical education were able to accurately “draw by ear” the images from no stimulus other than the sonifications.
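To make the color-to-music pipeline described in the abstract concrete, the sketch below sonifies an image column by column, mapping hue to a pitch on a pentatonic scale and brightness (HSV value) to MIDI velocity, and writes the result with the third-party mido library. The scan order, scale, note ranges and output format are illustrative assumptions made here; Musical Vision's own mappings are user-configurable and are not reproduced in this sketch.

# Illustrative sketch only: each image column becomes one chord.
from mido import Message, MidiFile, MidiTrack

PENTATONIC = [0, 2, 4, 7, 9]  # scale degrees in semitones above the base note

def hsv_to_note(h, s, v, base=60, octaves=2):
    """Map hue (0-255) to a pentatonic pitch and value (0-255) to velocity."""
    steps = len(PENTATONIC) * octaves
    idx = int(h / 256 * steps)
    pitch = base + 12 * (idx // len(PENTATONIC)) + PENTATONIC[idx % len(PENTATONIC)]
    velocity = max(1, round(v / 255 * 127))
    return pitch, velocity

def sonify(image_hsv, ticks_per_column=240, path="sketch.mid"):
    """image_hsv: list of pixel rows, each a list of (h, s, v) tuples in 0-255."""
    mid, track = MidiFile(), MidiTrack()
    mid.tracks.append(track)
    for x in range(len(image_hsv[0])):                        # left-to-right scan
        column = [row[x] for row in image_hsv]
        notes = sorted({hsv_to_note(*px) for px in column})   # pixels in a column sound together
        for pitch, vel in notes:
            track.append(Message("note_on", note=pitch, velocity=vel, time=0))
        for i, (pitch, _) in enumerate(notes):
            track.append(Message("note_off", note=pitch, velocity=0,
                                 time=ticks_per_column if i == 0 else 0))
    mid.save(path)

Calling sonify on a 2-D grid of HSV pixels produces a short MIDI file in which simultaneous notes encode the vertical content of each column, one of several scan strategies a tool of this kind could offer.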

Footnotes
1
See Sensory substitution for the blind: a walk in the garden wearing The vOICe at http://www.youtube.com/watch?v=8xRgfaUJkdM (last accessed October 2018).
 
2
In general terms, people are more used to the idea of describing colors as a mixture of primary colors, be it subtractive or additive (for instance, in the additive color system yellow equals red plus green). In contrast, getting users to internalize the description of colors in terms of their HSV components would require some training.
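As an illustration (not part of the original footnote), Python's standard colorsys module describes the same color in both systems, assuming normalized RGB inputs:

import colorsys

# Yellow expressed as an additive mixture of red and green ...
r, g, b = 1.0, 1.0, 0.0
# ... and the same color expressed by its HSV components.
h, s, v = colorsys.rgb_to_hsv(r, g, b)
print(f"hue = {h * 360:.0f} deg, saturation = {s:.1f}, value = {v:.1f}")
# prints: hue = 60 deg, saturation = 1.0, value = 1.0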
 
3
See Fig. 3 in Sect. 3.2.2 for details on how these correspondences are made depending on how the music rendering process is configured by the user.
 
4
The decision of which pixels will sound simultaneously is intimately related to the image scanning process, which is described in Sect. 3.2.2.
 
6
Regarding the Image processing module, the following parameters are shown: number of visual fields, color information reduction strength, and polyphony rules (polyphony distribution and number of simultaneous notes). For the Music rendering module, the configuration parameters are the image scanning pattern (referred to as Scan type) and the degree of harmonization applied to each visual field region (in %). In addition, the Vector spaces module is described by the predefined chord mapping employed, together with the Differentiation tools applied and the tempo.
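For readability, the parameters listed above can be pictured as a single configuration record. The field names and values below are hypothetical and do not reflect the tool's actual interface:

# Hypothetical configuration record collecting the parameters listed above.
config = {
    "image_processing": {
        "visual_fields": 3,                   # number of visual-field regions
        "color_reduction_strength": 0.5,      # 0 = none, 1 = maximum reduction
        "polyphony": {"distribution": "uniform", "simultaneous_notes": 4},
    },
    "music_rendering": {
        "scan_type": "left_to_right",               # image scanning pattern
        "harmonization_per_field": [100, 50, 0],    # % harmonization per region
    },
    "vector_spaces": {"chord_mapping": "predefined_major"},
    "differentiation_tools": ["stereo_panning"],
    "tempo_bpm": 90,
}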
 
7
The test designed for the study is available online at http://goo.gl/lWdgrz.
 
Metadata
Title
Musical Vision: an interactive bio-inspired sonification tool to convert images into music
Publication date
08-11-2018
Published in
Journal on Multimodal User Interfaces / Issue 3/2019
Print ISSN: 1783-7677
Electronic ISSN: 1783-8738
DOI
https://doi.org/10.1007/s12193-018-0280-4
