Published in: International Journal of Multimedia Information Retrieval 1/2019

23-01-2019 | Regular Paper

Mining exoticism from visual content with fusion-based deep neural networks

Authors: Andrea Ceroni, Chenyang Ma, Ralph Ewerth

Abstract

Exoticism is the charm of the unfamiliar or the remote. It has received significant interest in many of the arts. Yet although visual concept classification in images and videos has been researched for years in semantic multimedia retrieval, the visual concept of exoticism has not been investigated from a computational perspective. In this paper, we present the first approach to automatically classify images as exotic or non-exotic. We have gathered two large datasets that cover exoticism in a general as well as a concept-specific way. The datasets have been annotated through crowdsourcing. To limit the effect of cultural differences on the annotation, only North American crowdworkers were employed for this task. Two deep learning architectures for learning the concept of exoticism are evaluated. Besides deep learning features, we also investigate the usefulness of hand-crafted features, which are combined with deep features in our proposed fusion-based approach. Different machine learning models are compared with the fusion-based approach, which performs best, reaching accuracies of over 83% and 91% on the two datasets. Comprehensive experimental results provide insights into which features contribute most to recognizing exoticism. The estimation of image exoticism could be applied in fields such as advertising and travel suggestions, as well as to increase the serendipity and diversity of recommendations and search results.
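The abstract does not say how the deep and hand-crafted features are combined. As an illustration only, the sketch below shows one common option, early fusion by concatenation, in which each feature block is standardized and the two blocks are then joined per image before a classifier is trained. It is not the authors' architecture: the function names `standardize` and `fuse`, the feature dimensions, and the toy values are all assumptions made for the example.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column of a list of feature vectors."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # Guard against constant columns, whose standard deviation is 0.
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def fuse(deep, handcrafted):
    """Early fusion: standardize each block, then concatenate per image."""
    if len(deep) != len(handcrafted):
        raise ValueError("both feature blocks must describe the same images")
    return [d + h for d, h in zip(standardize(deep), standardize(handcrafted))]


# Toy data (hypothetical): 4 images, each with 3 "deep" features
# and 2 "hand-crafted" features on a very different scale.
deep = [[0.1, 0.9, 0.3], [0.2, 0.8, 0.1], [0.9, 0.1, 0.7], [0.8, 0.2, 0.9]]
hand = [[120.0, 0.5], [110.0, 0.4], [30.0, 0.9], [40.0, 0.8]]

fused = fuse(deep, hand)  # one 5-dimensional vector per image
```

Standardizing each block first keeps features with large numeric ranges from dominating the fused vector. The fused vectors could then be passed to any binary classifier for the exotic versus non-exotic decision.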

Metadata
Title
Mining exoticism from visual content with fusion-based deep neural networks
Authors
Andrea Ceroni
Chenyang Ma
Ralph Ewerth
Publication date
23-01-2019
Publisher
Springer London
Published in
International Journal of Multimedia Information Retrieval / Issue 1/2019
Print ISSN: 2192-6611
Electronic ISSN: 2192-662X
DOI
https://doi.org/10.1007/s13735-018-00165-4