Multimedia Systems

16-07-2022 | Regular Paper

Semantically guided projection for zero-shot 3D model classification and retrieval

Authors: Yuting Su, Jiayu Li, Wenhui Li, Zan Gao, Haipeng Chen, Xuanya Li, An-An Liu

Published in: Multimedia Systems | Issue 6/2022

Abstract

Most existing methods for 3D model classification and retrieval rely on fully supervised training, which is prohibitive because collecting and labelling 3D models across a wide range of categories is costly and time-consuming. How to make full use of existing known data to represent unknown data is therefore a crucial topic. Inspired by zero-shot learning in the 2D image domain, we propose a semantically guided projection method that classifies and retrieves unseen 3D models by exploring the semantic relationship between seen and unseen 3D models. First, we exploit the multi-view information of 3D models to construct semantic attributes that serve as prior knowledge for representing 3D models. Then, we learn bidirectional projections, from visual features to semantics and from semantics to visual features, which can eliminate the gap between the seen and unseen domains. Extensive experiments on zero-shot 3D model classification and retrieval on two popular datasets, ModelNet40 and ShapeNetCore55, demonstrate the effectiveness and superiority of the proposed method.
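To make the idea of a bidirectional projection concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single tied-weight linear map W (the formulation known in the zero-shot learning literature as a semantic autoencoder), where W projects visual features to semantics and its transpose projects semantics back to visual features. The toy data, the attribute vectors, the dimensions, and the function names `fit_bidirectional_projection` and `classify` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_bidirectional_projection(X, S, lam=1.0):
    """Solve min_W ||X - W.T @ S||_F^2 + lam * ||W @ X - S||_F^2.

    X: (d, N) visual features, S: (k, N) semantic vectors, W: (k, d).
    Setting the gradient to zero gives the Sylvester equation
    A @ W + W @ B = C, solved here through its Kronecker (vectorised) form.
    """
    k, d = S.shape[0], X.shape[0]
    A = S @ S.T                  # (k, k)
    B = lam * (X @ X.T)          # (d, d)
    C = (1.0 + lam) * (S @ X.T)  # (k, d)
    # Column-major vec: vec(A W) = (I_d kron A) vec(W),
    #                   vec(W B) = (B.T kron I_k) vec(W)
    M = np.kron(np.eye(d), A) + np.kron(B.T, np.eye(k))
    w = np.linalg.solve(M, C.flatten(order="F"))
    return w.reshape((k, d), order="F")


def classify(x, W, prototypes):
    """Project x into semantic space and return the index of the
    class prototype with the highest cosine similarity."""
    s_hat = W @ x
    norms = np.linalg.norm(prototypes, axis=1) * np.linalg.norm(s_hat)
    return int(np.argmax(prototypes @ s_hat / norms))


# Toy setup (assumption): visual features are a noisy linear image of
# the semantic vectors, x = P @ s + noise, with orthonormal columns in P.
rng = np.random.default_rng(0)
d, k = 5, 3
P, _ = np.linalg.qr(rng.standard_normal((d, k)))

seen_protos = np.eye(k)                      # 3 seen classes
unseen_protos = np.array([[1.0, 1.0, 0.0],   # 2 unseen classes
                          [0.0, 1.0, 1.0]])

# Train on seen classes only: 20 samples per class.
S_train = np.repeat(seen_protos, 20, axis=0).T               # (k, 60)
X_train = P @ S_train + 0.01 * rng.standard_normal((d, 60))  # (d, 60)
W = fit_bidirectional_projection(X_train, S_train)

# Classify samples from classes never seen during training.
unseen_labels = [0, 1, 0, 1]
X_test = [P @ unseen_protos[c] + 0.01 * rng.standard_normal(d)
          for c in unseen_labels]
predictions = [classify(x, W, unseen_protos) for x in X_test]
```

Tying the two directions to one matrix keeps the sketch short and gives a closed-form solution. The paper's method may learn the two projections differently, so this should be read only as an illustration of the general principle. Retrieval could be sketched the same way by ranking gallery models by similarity in the projected space instead of taking the single best match.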

Title: Semantically guided projection for zero-shot 3D model classification and retrieval
Authors: Yuting Su, Jiayu Li, Wenhui Li, Zan Gao, Haipeng Chen, Xuanya Li, An-An Liu
Publication date: 16-07-2022
Publisher: Springer Berlin Heidelberg
Published in: Multimedia Systems / Issue 6/2022
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI: https://doi.org/10.1007/s00530-022-00970-2
