research-article

A Novel Multi-task Tensor Correlation Neural Network for Facial Attribute Prediction

Published: 13 November 2020
Abstract

Multi-task learning plays an important role in multi-attribute face prediction. Most existing studies mine the information shared between attributes by sharing all convolutional layers. However, low-level and high-level features should not be treated equally across face attributes, because high-level features are more strongly tied to the specific content of each category. In this article, a novel multi-attribute tensor correlation neural network (MTCN) is proposed for face attribute prediction. MTCN shares features across all attributes in the low-level layers and then separates the features of each attribute in the high-level layers. To better mine the correlations among high-level attribute features, each sub-network draws useful information from the other sub-networks to enhance its own features. A tensor canonical correlation analysis method then seeks the correlations among the highest-level attribute features, which further enhances the original information of each attribute. These features are then mapped into a highly correlated space through the correlation matrix. Finally, extensive experiments on the CelebA and LFWA datasets show that MTCN achieves the best performance among the latest multi-attribute recognition algorithms compared under the same settings.
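
As a rough illustration of the structure described in the abstract, the following Python sketch shares one low-level layer across all attributes, gives each attribute its own high-level branch, and measures how strongly the features of two branches correlate. It is not the authors' implementation: the class `SharedThenSplitNet`, the helper `first_canonical_correlation`, the layer sizes, the random weights, and the attribute names are all hypothetical, and ordinary two-view canonical correlation analysis stands in here for the tensor canonical correlation analysis that MTCN uses.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class SharedThenSplitNet:
    """Toy model (assumption): one shared low-level layer, one private branch per attribute."""

    def __init__(self, in_dim, shared_dim, branch_dim, attributes, seed=0):
        rng = np.random.default_rng(seed)
        self.attributes = list(attributes)
        # Low-level weights are shared by every attribute.
        self.w_shared = rng.normal(0.0, 0.1, size=(in_dim, shared_dim))
        # High-level weights and output weights are private to each attribute.
        self.w_branch = {a: rng.normal(0.0, 0.1, size=(shared_dim, branch_dim))
                         for a in self.attributes}
        self.w_out = {a: rng.normal(0.0, 0.1, size=(branch_dim,))
                      for a in self.attributes}

    def high_level_features(self, x):
        shared = relu(x @ self.w_shared)
        return {a: relu(shared @ self.w_branch[a]) for a in self.attributes}

    def predict(self, x):
        feats = self.high_level_features(x)
        # One independent binary probability per attribute and sample.
        return {a: sigmoid(feats[a] @ self.w_out[a]) for a in self.attributes}


def first_canonical_correlation(x, y, reg=1e-6):
    """Largest canonical correlation between two feature matrices (rows are samples).

    Classical two-view CCA, used only as a stand-in for tensor CCA.
    """
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    n = x.shape[0]
    # A small ridge term keeps the covariance matrices positive definite.
    cxx = xc.T @ xc / (n - 1) + reg * np.eye(x.shape[1])
    cyy = yc.T @ yc / (n - 1) + reg * np.eye(y.shape[1])
    cxy = xc.T @ yc / (n - 1)
    # Whiten both views with Cholesky factors; the singular values of the
    # whitened cross-covariance are the canonical correlations.
    lx = np.linalg.cholesky(cxx)
    ly = np.linalg.cholesky(cyy)
    whitened = np.linalg.solve(lx, cxy) @ np.linalg.inv(ly).T
    return float(np.linalg.svd(whitened, compute_uv=False)[0])


if __name__ == "__main__":
    images = np.random.default_rng(1).normal(size=(64, 16))  # stand-in for image features
    net = SharedThenSplitNet(16, 8, 4, ["Smiling", "Male", "Eyeglasses"])
    feats = net.high_level_features(images)
    rho = first_canonical_correlation(feats["Smiling"], feats["Male"])
    print("first canonical correlation between two branches:", rho)
```

In this sketch the branches interact only through the shared layer. The cross-branch enhancement and the mapping into a correlated space that the abstract describes are not modelled.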

      • Published in

        ACM Transactions on Intelligent Systems and Technology  Volume 12, Issue 1
        Regular Papers
        February 2021
        280 pages
        ISSN:2157-6904
        EISSN:2157-6912
        DOI:10.1145/3436534
        Copyright © 2020 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 13 November 2020
        • Revised: 1 August 2020
        • Accepted: 1 August 2020
        • Received: 1 March 2020

        Qualifiers

        • research-article
        • Research
        • Refereed
