research-article

A Novel Multi-task Tensor Correlation Neural Network for Facial Attribute Prediction

Authors:
Mingxing Duan, Hunan University, China (ORCID: 0000-0002-1049-6244)
Kenli Li, Hunan University, China
Keqin Li, State University of New York, USA
Qi Tian, Huawei, China

ACM Transactions on Intelligent Systems and Technology, Volume 12, Issue 1, Article No. 3, pp. 1–22. https://doi.org/10.1145/3418285

Published: 13 November 2020


Abstract

Multi-task learning plays an important role in face multi-attribute prediction. Most current work mines the information shared between attributes by sharing all convolutional layers. However, the low-level and high-level features of multiple face attributes should not be treated equally, because high-level features are biased toward the specific content of each category. In this article, a novel multi-attribute tensor correlation neural network (MTCN) is proposed to predict face attributes. MTCN shares features across all attributes at the low-level layers and then separates the features of each attribute at the high-level layers. To better mine the correlations among high-level attribute features, each sub-network draws useful information from the other sub-networks to enhance its own information. A tensor canonical correlation analysis method is then used to seek the correlations among the highest-level attribute features, which further enhances the original information of each attribute. After that, these features are mapped into a highly correlated space through the correlation matrix. Finally, extensive experiments on the CelebA and LFWA datasets verify the performance of MTCN, which achieves the best performance among the latest multi-attribute recognition algorithms under the same settings.
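The layer-sharing idea in the abstract can be illustrated with a toy forward pass. The sketch below is not the authors' MTCN: it is a minimal pure-Python illustration, assuming one shared low-level layer, one private high-level layer per attribute, and a plain average of the other branches' features as a stand-in for the cross-network enhancement. It does not implement tensor canonical correlation analysis. The class name `SharedThenSplitNet`, the `mix` weight, the layer sizes, and the attribute names are all hypothetical.

```python
import random


def linear(x, w):
    """Multiply vector x (length n) by weight matrix w (n rows, m columns)."""
    return [sum(xi * row[j] for xi, row in zip(x, w)) for j in range(len(w[0]))]


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, a) for a in v]


def init_weights(rng, n_in, n_out):
    """Small random weights; a real model would learn these by training."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]


class SharedThenSplitNet:
    """Toy model: shared low-level layer, then one private branch per attribute."""

    def __init__(self, n_in, n_shared, n_private, attributes, mix=0.1, seed=0):
        rng = random.Random(seed)
        self.attributes = list(attributes)
        # Hypothetical weight for features borrowed from the other branches.
        self.mix = mix
        self.shared = init_weights(rng, n_in, n_shared)
        self.private = {a: init_weights(rng, n_shared, n_private)
                        for a in self.attributes}
        self.heads = {a: init_weights(rng, n_private, 1) for a in self.attributes}

    def forward(self, x):
        # Low-level features are computed once and shared by every attribute.
        low = relu(linear(x, self.shared))
        # High-level features are computed separately for each attribute.
        high = {a: relu(linear(low, self.private[a])) for a in self.attributes}
        scores = {}
        for a in self.attributes:
            others = [high[b] for b in self.attributes if b != a]
            if others:
                # Assumed stand-in for cross-branch enhancement: add a scaled
                # average of the other branches' high-level features.
                mean_other = [sum(col) / len(others) for col in zip(*others)]
                enhanced = [h + self.mix * m for h, m in zip(high[a], mean_other)]
            else:
                enhanced = high[a]
            scores[a] = linear(enhanced, self.heads[a])[0]
        return scores


if __name__ == "__main__":
    net = SharedThenSplitNet(4, 3, 2, ["smiling", "male"])
    print(net.forward([1.0, 0.5, -0.2, 0.3]))
```

The weights here are random and untrained, so the scores carry no meaning; the point is only the data flow, in which the shared layer runs once per input and each attribute branch then combines its own features with a small contribution from the others.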

Index Terms

A Novel Multi-task Tensor Correlation Neural Network for Facial Attribute Prediction
1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Neural networks
2. General and reference
  1. Document types
    1. Reference works

Published in
ACM Transactions on Intelligent Systems and Technology Volume 12, Issue 1
Regular Papers
February 2021
280 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/3436534
Editor:
Yu Zheng
JD Digits, China
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 13 November 2020
- Revised: 1 August 2020
- Accepted: 1 August 2020
- Received: 1 March 2020
Author Tags
Attribute prediction
correlation
multi-task learning
tensor correlation analysis algorithm
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 30
- Total Downloads: 320
- Downloads (Last 12 months): 21
- Downloads (Last 6 weeks): 4