Top

Multimedia Systems

Published in:

07-02-2019 | Special Issue Paper

Panorama based on multi-channel-attention CNN for 3D model recognition

Authors: Weizhi Nie, Kun Wang, Qi Liang, Roubing He

Published in: Multimedia Systems | Issue 6/2019

Activate our intelligent search to find suitable subject content or patents.

search-config

AI-assisted search

Off

Abstract

With the development of 3D model reconstruction, manufacturing, and 3D model vision technologies, 3D model recognition has attracted much attention recently. To handle the 3D model recognition problem, in this paper, we propose a panorama based on multi-channel-attention (MCA) CNN network for the representation of the 3D model. The proposed method is composed of three parts: extracting views, transform function learning, and generating 3D model descriptor. Concretely, we first extract the 2D panoramic views for each 3D model, and we use the multi-channel-attention neural network to extract the descriptor for each 3D model. Here, the attention model is used to find the unequal weights of each panorama view to generate the more robust 3D model descriptor. Finally, The fusion feature is used to handle the 3D model classification and retrieval problem. The popular data sets ModelNet and ShapeNet are used to demonstrate the performance of our approach. The experiments also demonstrate the superiority of our proposed method over the state-of-art methods.

previous article Toward efficient indexing structure for scalable content-based music retrieval

next article Multi-guiding long short-term memory for video captioning

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Jingyuan, C., Hanwang, Z., Xiangnan, H., Liqiang, N., Wei L., Tat Seng, C.: Attentive collaborative filtering: Multimedia recommendation with item- and component-level attention. In: International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 335–344 (2017)

He, X., Chua, T.-S.: Neural factorization machines for sparse predictive analytics. In: Proceedings of the 40th International ACM SIGIR conference on Research and Development in Information Retrieval, pp. 355–364. ACM (2017)

Zhang, H., Niu, Y., Chang, S.-F.: Grounding referring expressions in images by variational context (2018)

Kanezaki, A., Matsushita, Y., Yoshifumi, N.: RotationNet: joint object categorization and pose estimation using multiviews from unsupervised viewpoints. CVPR, pp. 5010–5019 (2018)

Su, H., Maji, S., Kalogerakis, E., Learned-Miller, E.: Multi-view convolutional neural networks for 3d shape recognition. In: Proceedings of the IEEE international conference on computer vision, pp. 945–953 (2015)

Liu, A.-A., Nie, W.-Z., Su, Y.-T.: 3d object retrieval based on multi-view latent variable model. In: IEEE Transactions on Circuits Systems for Video Technology. (99):1–1

Guo, H., Wang, J., Gao, Y., Li, J., Lu, H.: Multi-view 3d object retrieval with deep embedding network. In: IEEE Transactions on Image Processing A Publication of the IEEE Signal Processing Society 25(12), 5526–5537 (2016)MathSciNetCrossRef

Wang, D., Wang, B., Zhao, S., Yao, H., Liu, H.: View-based 3d object retrieval with discriminative views. Neurocomputing 252, 58–66 (2017)CrossRef

Zhang, H., Kyaw, Z., Jinyang, Y., Shih F.C.: Weakly supervised visual relation detection via parallel pairwise r-fcn, Ppr-fcn (2017)

10.

Su, H., Maji, S., Kalogerakis, E., Learned-Miller, E.G.: Multi-view convolutional neural networks for 3d shape recognition. ICCV, pp. 945–953 (2015)

11.

Papadakis, Panagiotis, Pratikakis, Ioannis, Theoharis, Theoharis, Perantonis, Stavros: Panorama: a 3d shape descriptor based on panoramic views for unsupervised 3d object retrieval. Int. J. Comput. Vis. 89(2–3), 177–192 (2010)CrossRef

12.

Zhang, H., Kyaw, Z., Chang, S.-F., Chua, T.-S.: Visual translation embedding network for visual relation detection. CVPR, pp. 3107–3115 (2017)

13.

Liu, A., Wang, Z., Nie, W., Yuting, S.: Graph-based characteristic view set extraction and matching for 3d model retrieval. Inf. Sci. 320, 429–442 (2015)CrossRef

14.

Yang, Luren, Albregtsen, Fritz: Fast and exact computation of cartesian geometric moments using discrete green’s theorem. Pattern Recogn. 29(7), 1061–1073 (1996)CrossRef

15.

Ke, L., Wang, Q., Xue, J., Pan, W.: 3d model retrieval and classification by semi-supervised learning with content-based similarity. Inf. Sci. 281, 703–713 (2014)MathSciNetCrossRef

16.

Polewski, P., Yao, W., Heurich, M., Krzystek, P., Stilla, U.: Detection of fallen trees in als point clouds of a temperate forest by combining point/primitive-level shape descriptors. Gemeinsame Tagung (2014)

17.

Kobbelt, L., Schrder, P., Kazhdan, M., Funkhouser, T., Rusinkiewicz, S.: Rotation invariant spherical harmonic representation of 3d shape descriptors. In: Proc. eurographics/acm Siggraph Symp.on Geometry Processing 43(2), 156–164 (2003)

18.

Sinha, Y., Bai, J., Ramani, K.: Deep learning 3d shape surfaces using geometry images. In: European Conference on Computer Vision, pp. 223–240 (2016)CrossRef

19.

Nie, Wei-Zhi, Liu, An-An, Yu-Ting, Su: 3d object retrieval based on sparse coding in weak supervision. J. Vis. Commun. Image Represent. 37, 40–45 (2016)CrossRef

20.

He, X., He, Z., Du, X., Chua, T.-S.: Adversarial personalized ranking for recommendation (2018)

21.

He, X., He, Z., Song, J., Liu, Z., Jiang, Y.-G., Chua, T.-S.: NAIS: neural attentive item similarity model for recommendation. IEEE Trans. Knowl. Data Eng. 30(12), 2354–2366 (2018)CrossRef

22.

Ding-Yun, C., Xiao-Pei, T., Yu-Te, S., Ming, O.: On visual similarity based 3d model retrieval. In: Computer graphics forum, 22, pp. 223–232. Wiley Online Library (2003)

23.

Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X., Xiao, J.: 3d shapenets: a deep representation for volumetric shapes. pp. 1912–1920 (2014)

24.

Maturana, D., Scherer, S.: Voxnet: A 3d convolutional neural network for real-time object recognition. In: Intelligent Robots and Systems (IROS), 2015. In: IEEE/RSJ International Conference on, pp. 922–928. IEEE (2015)

25.

Shi, Baoguang, Bai, Song, Zhou, Zhichao, Bai, Xiang: Deeppano: deep panoramic representation for 3-d shape recognition. IEEE Signal Process. Lett. 22(12), 2339–2343 (2015)CrossRef

26.

Sedaghat, N., Zolfaghari, M., Amiri, E., Brox, T.: Orientation-boosted voxel nets for 3d object recognition. arXiv preprint arXiv:1604.03351 (2016)

27.

Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

28.

Srivastava, Nitish, Hinton, Geoffrey, Krizhevsky, Alex, Sutskever, Ilya, Salakhutdinov, Ruslan: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)MathSciNetMATH

29.

Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X., Xiao, J.: 3d shapenets: a deep representation for volumetric shapes. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1912–1920 (2015)

30.

Sfikas, K., Theoharis, T., Pratikakis, I.: Exploiting the panorama representation for convolutional neural network classification and retrieval. In: Eurographics Workshop on 3D Object Retrieval (2017)

31.

Song, B., Xiang, B., Zhichao, Z., Zhaoxiang, Z., Longin Jan, L.: Gift: A real-time and scalable 3d shape search engine. In: Computer Vision and Pattern Recognition (CVPR), 2016 IEEE Conference on, pp. 5023–5032. IEEE (2016)

32.

Sedaghat, N., Zolfaghari, M., Brox, T.: Orientation-boosted voxel nets for 3d object recognition (2017)

33.

Wu, J., Zhang, C., Xue, T., Freeman, B., Tenenbaum, J.: Learning a probabilistic latent space of object shapes via 3d generative-adversarial modeling. In: Advances in Neural Information Processing Systems, pp. 82–90 (2016)

34.

Alberto, G.-G., Francisco, G.-D., Jose, G.-R., Sergio, O.-E., Miguel, C., Azorin-Lopez, J.: Pointnet: A 3d convolutional neural network for real-time object class recognition. In: Neural Networks (IJCNN), 2016 International Joint Conference on, pp. 1578–1584. IEEE (2016)

35.

Xu, X., Todorovic, S.: Beam search for learning a deep convolutional neural network of 3d shapes. In: ICPR, pp. 3506–3511 (2016)

36.

Savva, M., Yu, F., Su, H., Aono, M., Chen, B., Cohen-Or, D., Deng, W., Su, H., Bai, S., Bai, X., et al.: Large-scale 3D shape retrieval from ShapeNet core55[C]// Eurographics Workshop on 3d Object Retrieval. Eurographics Association (2016)

37.

Takahiko, F., Ryutarou, O.: Deep aggregation of local 3d geometric features for 3d model retrieval. In: BMVC (2016)

Title: Panorama based on multi-channel-attention CNN for 3D model recognition
Authors: Weizhi Nie
Kun Wang
Qi Liang
Roubing He
Publication date: 07-02-2019
Publisher: Springer Berlin Heidelberg
Published in: Multimedia Systems / Issue 6/2019
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI: https://doi.org/10.1007/s00530-018-0600-2

Springer Professional

Abstract

Please log in to get access to your license.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Other articles of this Issue 6/2019

Special issue on multimedia recommendation and multi-modal data analysis

Toward efficient indexing structure for scalable content-based music retrieval

A novel depth perception prediction metric for advanced multimedia applications

Fashion clothes matching scheme based on Siamese Network and AutoEncoder

Lifetime-aware solid-state disk (SSD) cache management for video servers

Analyzing autostereoscopic environment configurations for the design of videogames