Published in: Multimedia Systems 6/2022

16-07-2022 | Regular Paper

Semantically guided projection for zero-shot 3D model classification and retrieval

Authors: Yuting Su, Jiayu Li, Wenhui Li, Zan Gao, Haipeng Chen, Xuanya Li, An-An Liu

Abstract

Most existing methods for 3D model classification and retrieval rely on fully supervised training, which requires collecting and labeling 3D models across a wide range of categories, a process that is costly and time-consuming. How to make full use of existing known data to represent unknown data is therefore a crucial question. Inspired by zero-shot learning in the 2D image domain, we propose a semantically guided projection method that classifies and retrieves unseen 3D models by exploring the semantic relationship between seen and unseen 3D models. First, we exploit the multi-view information of 3D models to construct semantic attributes that serve as prior knowledge for representing 3D models. Then, we learn bidirectional projections, from visual features to semantics and from semantics to visual features, which can eliminate the gap between the seen and unseen domains. Extensive experiments on zero-shot 3D model classification and retrieval on two popular datasets, ModelNet40 and ShapeNetCore55, demonstrate the effectiveness and superiority of the proposed method.
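
The idea of bidirectional projections can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not specify. It assumes plain linear maps fitted by closed-form ridge regression, with the two distances weighted equally. All names (`fit_ridge`, `fit_bidirectional`, `classify_unseen`, `lam`) are hypothetical.

```python
import numpy as np


def fit_ridge(inputs, targets, lam=1.0):
    """Closed-form ridge regression (an assumed stand-in for the paper's objective).

    Returns W minimizing ||inputs @ W - targets||^2 + lam * ||W||^2.
    """
    dim = inputs.shape[1]
    gram = inputs.T @ inputs + lam * np.eye(dim)
    return np.linalg.solve(gram, inputs.T @ targets)


def fit_bidirectional(x_seen, s_seen, lam=1.0):
    """Learn both projections from seen-class data only.

    x_seen: (n, d) visual features; s_seen: (n, k) semantic vectors,
    one row per training sample (the semantic vector of its class).
    """
    w_v2s = fit_ridge(x_seen, s_seen, lam)  # visual -> semantic, shape (d, k)
    w_s2v = fit_ridge(s_seen, x_seen, lam)  # semantic -> visual, shape (k, d)
    return w_v2s, w_s2v


def pairwise_dist(a, b):
    """Euclidean distance between every row of a and every row of b."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)


def classify_unseen(x_test, unseen_prototypes, w_v2s, w_s2v):
    """Assign each test sample to the nearest unseen-class prototype.

    Distances are measured in the semantic space (after projecting the
    features) and in the visual space (after projecting the prototypes),
    then summed with equal weight, which is an illustrative choice.
    """
    d_semantic = pairwise_dist(x_test @ w_v2s, unseen_prototypes)
    d_visual = pairwise_dist(x_test, unseen_prototypes @ w_s2v)
    return np.argmin(d_semantic + d_visual, axis=1)
```

Retrieval could reuse the same summed distances by ranking gallery items instead of taking the argmin. The equal weighting of the two spaces is only a placeholder and would normally be tuned.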

Metadata
Title
Semantically guided projection for zero-shot 3D model classification and retrieval
Authors
Yuting Su
Jiayu Li
Wenhui Li
Zan Gao
Haipeng Chen
Xuanya Li
An-An Liu
Publication date
16-07-2022
Publisher
Springer Berlin Heidelberg
Published in
Multimedia Systems / Issue 6/2022
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI
https://doi.org/10.1007/s00530-022-00970-2