2016 | Original Paper | Book Chapter

Predicting Image Aesthetics with Deep Learning

Authors: Simone Bianco, Luigi Celona, Paolo Napoletano, Raimondo Schettini

Published in: Advanced Concepts for Intelligent Vision Systems

Publisher: Springer International Publishing

Abstract

In this paper we investigate the use of a deep Convolutional Neural Network (CNN) to predict image aesthetics. To this end, we fine-tune a canonical CNN architecture, originally trained to classify objects and scenes, by casting image aesthetic prediction as a regression problem. We also investigate whether image aesthetics is a global or local attribute, and the role played by bottom-up and top-down salient regions in the prediction of global image aesthetics. Experimental results on the canonical Aesthetic Visual Analysis (AVA) dataset show the robustness of the proposed solution, which outperforms the best state-of-the-art solution by almost 17% in terms of Mean Residual Sum of Squares Error (MRSSE).
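As a rough illustration of the evaluation metric named in the abstract, the sketch below assumes that MRSSE is the mean of the squared residuals between predicted and ground-truth aesthetic scores; the abstract itself does not spell out the formula. The function names `mrsse` and `relative_improvement` are hypothetical helpers introduced here, not part of the paper, and the scores in the usage note are made-up values.

```python
def mrsse(predicted, target):
    """Mean of squared residuals between predicted and ground-truth scores.

    Assumption: MRSSE = (1/N) * sum((predicted_i - target_i) ** 2).
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if not predicted:
        raise ValueError("at least one score is required")
    squared_residuals = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return sum(squared_residuals) / len(squared_residuals)


def relative_improvement(baseline_error, new_error):
    """Fractional error reduction of new_error relative to baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error
```

For example, `mrsse([5.0, 6.0], [5.0, 4.0])` averages the squared residuals 0 and 4, and `relative_improvement(2.0, 1.0)` reports a reduction of one half. A claim such as "outperforms by almost 17%" would correspond to a relative improvement of just under 0.17 under this reading.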

Metadata
Title
Predicting Image Aesthetics with Deep Learning
Authors
Simone Bianco
Luigi Celona
Paolo Napoletano
Raimondo Schettini
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-48680-2_11
