Published in: Multimedia Systems 6/2019

07-02-2019 | Special Issue Paper

Panorama based on multi-channel-attention CNN for 3D model recognition

Authors: Weizhi Nie, Kun Wang, Qi Liang, Roubing He


Abstract

With the development of 3D model reconstruction, manufacturing, and 3D vision technologies, 3D model recognition has attracted much attention recently. To address the 3D model recognition problem, we propose a panorama-based multi-channel-attention (MCA) CNN for representing 3D models. The proposed method is composed of three parts: view extraction, transform function learning, and 3D model descriptor generation. Concretely, we first extract 2D panoramic views of each 3D model and then use the multi-channel-attention neural network to extract a descriptor for each model. Here, the attention model learns unequal weights for the panoramic views so that the resulting 3D model descriptor is more robust. Finally, the fused feature is used for 3D model classification and retrieval. The popular data sets ModelNet and ShapeNet are used to demonstrate the performance of our approach. The experiments also demonstrate the superiority of the proposed method over state-of-the-art methods.
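
The idea of weighting views unequally before fusing them can be illustrated with a minimal sketch. This is not the authors' network: the function names, the scoring vector `w`, and the toy feature values are all assumptions made for illustration, and a real model would learn the scoring parameters and operate on CNN features of the panoramic views.

```python
import math

def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(view_features, w):
    """Fuse per-view feature vectors into one descriptor.

    view_features: list of equal-length feature vectors, one per view.
    w: hypothetical scoring vector (would be learned in a real model).
    Returns (descriptor, weights).
    """
    # One scalar relevance score per view: dot product with w.
    scores = [sum(f * wi for f, wi in zip(feat, w)) for feat in view_features]
    # Softmax turns scores into positive weights that sum to 1.
    weights = softmax(scores)
    dim = len(view_features[0])
    # The descriptor is the weighted sum of the view features.
    descriptor = [
        sum(weights[v] * view_features[v][d] for v in range(len(view_features)))
        for d in range(dim)
    ]
    return descriptor, weights

# Toy example: three views with 2-dimensional features (made-up values).
views = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = [0.5, -0.5]
descriptor, weights = attention_fuse(views, w)
```

With a zero scoring vector every view gets the same weight and the descriptor reduces to the plain average of the view features, which is the baseline that learned attention weights are meant to improve on.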


Metadata
Title
Panorama based on multi-channel-attention CNN for 3D model recognition
Authors
Weizhi Nie
Kun Wang
Qi Liang
Roubing He
Publication date
07-02-2019
Publisher
Springer Berlin Heidelberg
Published in
Multimedia Systems / Issue 6/2019
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI
https://doi.org/10.1007/s00530-018-0600-2
