Published in: International Journal of Computer Vision 5/2018

24.11.2017

From Facial Expression Recognition to Interpersonal Relation Prediction

Authors: Zhanpeng Zhang, Ping Luo, Chen Change Loy, Xiaoou Tang



Abstract

Interpersonal relation defines the association, e.g., warmth, friendliness, and dominance, between two or more people. We investigate whether such fine-grained and high-level relation traits can be characterized and quantified from face images in the wild. We address this challenging problem by first studying a deep network architecture for robust recognition of facial expressions. Unlike existing models that typically learn from facial expression labels alone, we devise an effective multitask network that is capable of learning from rich auxiliary attributes such as gender, age, and head pose, beyond just facial expression data. While conventional supervised training requires datasets with complete labels (e.g., all samples must be labeled with gender, age, and expression), we show that this requirement can be relaxed via a novel attribute propagation method. The approach further allows us to leverage the inherent correspondences between heterogeneous attribute sources despite the disparate distributions of different datasets. With this network we demonstrate state-of-the-art results on existing facial expression recognition benchmarks. To predict interpersonal relation, we use the expression recognition network as the branches of a Siamese model. Extensive experiments show that our model is capable of mining the mutual context of faces for accurate fine-grained interpersonal prediction.
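The two-stage design in the abstract, a shared feature trunk with per-attribute task heads, reused as both branches of a Siamese model whose relation head scores a pair of faces, can be sketched as follows. This is a minimal forward-pass illustration in NumPy, not the paper's actual network: the layer shapes, attribute set, and number of relation traits are hypothetical placeholders, and a dense layer stands in for the convolutional trunk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 128-d face descriptor, 64-d shared feature.
D_IN, D_FEAT = 128, 64
W_shared = rng.standard_normal((D_IN, D_FEAT)) * 0.01

def shared_features(x):
    # Stand-in for the shared convolutional trunk: one dense layer + ReLU.
    return np.maximum(0.0, x @ W_shared)

# One linear head per task: expression plus auxiliary attributes
# (gender, head pose); class counts here are illustrative.
heads = {
    "expression": rng.standard_normal((D_FEAT, 7)) * 0.01,
    "gender":     rng.standard_normal((D_FEAT, 2)) * 0.01,
    "pose":       rng.standard_normal((D_FEAT, 3)) * 0.01,
}

def multitask_forward(x):
    # All heads share one feature vector, so auxiliary-attribute
    # supervision shapes the representation used for expression.
    f = shared_features(x)
    return {task: f @ W for task, W in heads.items()}

# Siamese relation model: the same (weight-tied) expression network
# embeds both faces; a relation head scores the concatenated pair.
W_rel = rng.standard_normal((2 * D_FEAT, 8)) * 0.01  # e.g. 8 relation traits

def siamese_relation(x_a, x_b):
    pair = np.concatenate([shared_features(x_a), shared_features(x_b)])
    return pair @ W_rel  # one score per relation trait

face_a = rng.standard_normal(D_IN)
face_b = rng.standard_normal(D_IN)
attrs = multitask_forward(face_a)       # per-task logits for one face
relation = siamese_relation(face_a, face_b)  # trait scores for the pair
```

Weight tying between the two branches is the defining property of the Siamese setup: both faces pass through the same expression network, so the relation head only has to reason about the combination of the two embeddings.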


Footnotes
1
Although we did not study the integration of face and body cues, body posture and hand gesture information, when available, can naturally be used as additional input channels for our deep models.
 
Zurück zum Zitat Yang, S., Luo, P., Loy, C. C., & Tang, X. (2015). From facial parts responses to face detection: A deep learning approach. In IEEE international conference on computer vision. Yang, S., Luo, P., Loy, C. C., & Tang, X. (2015). From facial parts responses to face detection: A deep learning approach. In IEEE international conference on computer vision.
Zurück zum Zitat Yang, S., Luo, P., Loy, C. C., & Tang, X. (2016). Wider face: A face detection benchmark. In IEEE conference on computer vision and pattern recognition. Yang, S., Luo, P., Loy, C. C., & Tang, X. (2016). Wider face: A face detection benchmark. In IEEE conference on computer vision and pattern recognition.
Zurück zum Zitat Yao, A., Shao, J., Ma, N., & Chen, Y. (2015). Capturing au-aware facial features and their latent relations for emotion recognition in the wild. In ACM international conference on multimodal interaction (pp. 451–458). Yao, A., Shao, J., Ma, N., & Chen, Y. (2015). Capturing au-aware facial features and their latent relations for emotion recognition in the wild. In ACM international conference on multimodal interaction (pp. 451–458).
Zurück zum Zitat Yu, H. F., Jain, P., Kar, P., & Dhillon, I. (2014). Large-scale multi-label learning with missing labels. In International conference on machine learning (pp. 593–601). Yu, H. F., Jain, P., Kar, P., & Dhillon, I. (2014). Large-scale multi-label learning with missing labels. In International conference on machine learning (pp. 593–601).
Zurück zum Zitat Yu, Z., & Zhang, C. (2015). Image based static facial expression recognition with multiple deep network learning. In ACM international conference on multimodal interaction (pp. 435–442). Yu, Z., & Zhang, C. (2015). Image based static facial expression recognition with multiple deep network learning. In ACM international conference on multimodal interaction (pp. 435–442).
Zurück zum Zitat Zafeiriou, S., Papaioannou, A., Kotsia, I., Nicolaou, M. A., & Zhao, G. (2016). Facial affect in-the-wild: A survey and a new database. In IEEE conference on computer vision and pattern recognition workshop. Zafeiriou, S., Papaioannou, A., Kotsia, I., Nicolaou, M. A., & Zhao, G. (2016). Facial affect in-the-wild: A survey and a new database. In IEEE conference on computer vision and pattern recognition workshop.
Zurück zum Zitat Zelnik-Manor, L., & Perona, P. (2004). Self-tuning spectral clustering. In NIPS (pp. 1601–1608). Zelnik-Manor, L., & Perona, P. (2004). Self-tuning spectral clustering. In NIPS (pp. 1601–1608).
Zurück zum Zitat Zhang, N., Paluri, M., Ranzato, M., Darrell, T., & Bourdev, L. (2014). Panda: Pose aligned networks for deep attribute modeling. In IEEE conference on computer vision and pattern recognition. Zhang, N., Paluri, M., Ranzato, M., Darrell, T., & Bourdev, L. (2014). Panda: Pose aligned networks for deep attribute modeling. In IEEE conference on computer vision and pattern recognition.
Zurück zum Zitat Zhang, Z., Luo, P., Loy, C. C., & Tang, X. (2015a). Learning deep representation for face alignment with auxiliary attributes. In IEEE transactions on pattern analysis and machine intelligence. Zhang, Z., Luo, P., Loy, C. C., & Tang, X. (2015a). Learning deep representation for face alignment with auxiliary attributes. In IEEE transactions on pattern analysis and machine intelligence.
Zurück zum Zitat Zhang, Z., Luo, P., Loy, C. C., & Tang, X. (2015b). Learning social relation traits from face images. In IEEE international conference on computer vision. Zhang, Z., Luo, P., Loy, C. C., & Tang, X. (2015b). Learning social relation traits from face images. In IEEE international conference on computer vision.
Zurück zum Zitat Zhang, Z., Luo, P., Loy, C. C., & Tang, X. (2016). Joint face representation adaptation and clustering in videos. In European conference on computer vision. Zhang, Z., Luo, P., Loy, C. C., & Tang, X. (2016). Joint face representation adaptation and clustering in videos. In European conference on computer vision.
Zurück zum Zitat Zhao, G., Huang, X., Taini, M., Li, S. Z., & Pietikäinen, M. (2011). Facial expression recognition from near-infrared videos. Image and Vision Computing, 29(9), 607–619.CrossRef Zhao, G., Huang, X., Taini, M., Li, S. Z., & Pietikäinen, M. (2011). Facial expression recognition from near-infrared videos. Image and Vision Computing, 29(9), 607–619.CrossRef
Zurück zum Zitat Zhao, G., & Pietikainen, M. (2007). Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6), 915–928.CrossRef Zhao, G., & Pietikainen, M. (2007). Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6), 915–928.CrossRef
Zurück zum Zitat Zhao, X., Liang, X., Liu, L., Li, T., Vasconcelos, N., & Yan, S. (2016). Peak-piloted deep network for facial expression recognition. In European conference on computer vision. Zhao, X., Liang, X., Liu, L., Li, T., Vasconcelos, N., & Yan, S. (2016). Peak-piloted deep network for facial expression recognition. In European conference on computer vision.
Zurück zum Zitat Zhong, L., Liu, Q., Yang, P., Liu, B., Huang, J., & Metaxas, D. N. (2012). Learning active facial patches for expression analysis. In IEEE conference on computer vision and pattern recognition (pp. 2562–2569). Zhong, L., Liu, Q., Yang, P., Liu, B., Huang, J., & Metaxas, D. N. (2012). Learning active facial patches for expression analysis. In IEEE conference on computer vision and pattern recognition (pp. 2562–2569).
Zurück zum Zitat Zhu, X., Lei, Z., Liu, X., Shi, H., & Li, S. Z. (2016). Face alignment across large poses: A 3d solution. In IEEE conference on computer vision and pattern recognition. Zhu, X., Lei, Z., Liu, X., Shi, H., & Li, S. Z. (2016). Face alignment across large poses: A 3d solution. In IEEE conference on computer vision and pattern recognition.
Metadata
Title: From Facial Expression Recognition to Interpersonal Relation Prediction
Authors: Zhanpeng Zhang, Ping Luo, Chen Change Loy, Xiaoou Tang
Publication date: 24 November 2017
Publisher: Springer US
Published in: International Journal of Computer Vision, Issue 5/2018
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-017-1055-1