DOI: 10.1145/2813524.2813530

Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction

Published: 30 October 2015

ABSTRACT

Visual media are a powerful means of expressing emotions and sentiments. The constant generation of new content in social networks highlights the need for automated visual sentiment analysis tools. While Convolutional Neural Networks (CNNs) have established a new state of the art in several vision problems, their application to sentiment analysis is mostly unexplored, and there are few studies on how to design CNNs for this purpose. In this work, we study the suitability of fine-tuning a CNN for visual sentiment prediction and explore performance-boosting techniques within this deep learning setting. Finally, we provide a deep-dive analysis into a benchmark, state-of-the-art network architecture to gain insight into how to design CNNs for the task of visual sentiment prediction.
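The fine-tuning setting studied in the paper can be illustrated with a minimal sketch: the early layers of a pretrained CNN are kept frozen as a feature extractor, and only a new two-class sentiment head (positive vs. negative) is trained on the target data. The NumPy code below is a hypothetical stand-in, not the authors' implementation: random frozen weights play the role of pretrained convolutional features, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained, frozen feature extractor
# (in the paper this would be the convolutional layers of a CNN).
W_frozen = rng.normal(size=(256, 64)) / np.sqrt(256)

def features(x):
    # Frozen forward pass: linear projection + ReLU.
    return np.maximum(x @ W_frozen, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Synthetic "images" (flattened) with binary sentiment labels.
X = rng.normal(size=(100, 256))
y = rng.integers(0, 2, size=100)

# New sentiment head: the only layer that gets trained ("fine-tuned").
W_head = np.zeros((64, 2))

def loss(W):
    p = softmax(features(X) @ W)
    return -np.log(p[np.arange(len(y)), y]).mean()

lr = 0.1
initial = loss(W_head)
for _ in range(200):
    f = features(X)
    p = softmax(f @ W_head)
    p[np.arange(len(y)), y] -= 1.0          # softmax - one-hot
    W_head -= lr * f.T @ p / len(y)          # gradient step on the head only
final = loss(W_head)
```

With the backbone frozen, only `W_head` receives gradient updates, so the cross-entropy loss falls from its uninformed starting value of ln 2. In practice (e.g., with the Caffe framework the paper's references mention), the same effect is obtained by lowering or zeroing the learning rate of the pretrained layers while training the replaced classification layer at full rate.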


    • Published in

      ASM '15: Proceedings of the 1st International Workshop on Affect & Sentiment in Multimedia
      October 2015
      70 pages
      ISBN:9781450337502
      DOI:10.1145/2813524

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article

      Acceptance Rates

ASM '15 paper acceptance rate: 9 of 16 submissions, 56%. Overall acceptance rate: 9 of 16 submissions, 56%.

