Published in: Cognitive Computation 6/2018

10.10.2018

Multi-View CNN Feature Aggregation with ELM Auto-Encoder for 3D Shape Recognition

Authors: Zhi-Xin Yang, Lulu Tang, Kun Zhang, Pak Kin Wong

Abstract

Fast and accurate detection of 3D shapes is a fundamental task for robotic systems in intelligent tracking and automatic control. View-based 3D shape recognition has attracted increasing attention because human perception of 3D objects relies mainly on multiple 2D observations from different viewpoints. However, most existing multi-view cognitive computation methods use straightforward pairwise comparisons among the projected images, followed by a weak aggregation mechanism, which results in heavy computational cost and low recognition accuracy. To address these problems, a novel network structure named MCEA, which combines multi-view convolutional neural networks (M-CNNs), an extreme learning machine auto-encoder (ELM-AE), and an ELM classifier, is proposed for comprehensive feature learning, effective feature aggregation, and efficient classification of 3D shapes. The framework combines the advantages of a deep CNN architecture with the robust feature representation of the ELM-AE and the fast ELM classifier for 3D model recognition. Compared with existing set-to-set image comparison methods, the proposed shape-to-shape matching strategy converts each highly informative 3D model into a single compact feature descriptor via cognitive computation. Moreover, the proposed method runs much faster and achieves a good balance between classification accuracy and computational efficiency. Experimental results on the benchmark Princeton ModelNet, ShapeNet Core 55, and PSB datasets show that the proposed framework achieves higher classification and retrieval accuracy in much shorter time than state-of-the-art methods.
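
The aggregation and classification stages named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how view features are combined, so the sketch assumes per-view CNN features are simply concatenated, uses random synthetic data in place of real CNN outputs, and picks a tanh activation and a ridge parameter `C` arbitrarily. The function names `elm_autoencoder`, `elm_train`, and `elm_predict` are hypothetical, and the random hidden weights are not orthogonalized as they often are in published ELM-AE variants.

```python
import numpy as np

def elm_autoencoder(X, n_hidden, C=1e3, rng=None):
    """Learn a (d x n_hidden) projection with a ridge-regularized ELM auto-encoder.

    Hidden weights are random and fixed; only the output weights beta are
    solved in closed form so that H @ beta approximates X.
    """
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    A = rng.standard_normal((d, n_hidden)) / np.sqrt(d)  # random input weights
    b = rng.standard_normal(n_hidden)                    # random biases
    H = np.tanh(X @ A + b)                               # (n, n_hidden)
    # beta minimizes ||H beta - X||^2 + ||beta||^2 / C
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ X)
    return beta.T                                        # project with X @ beta.T

def elm_train(X, y, n_classes, n_hidden, C=1e3, rng=None):
    """Train a single-hidden-layer ELM classifier in closed form."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    W = rng.standard_normal((d, n_hidden)) / np.sqrt(d)
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)
    T = np.eye(n_classes)[y]                             # one-hot targets
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ T)
    return W, b, beta

def elm_predict(model, X):
    """Return the class index with the largest ELM output score."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# Synthetic stand-in for multi-view CNN features (assumption, not real data):
# 60 shapes, 4 rendered views each, 16-dimensional feature per view.
demo_rng = np.random.default_rng(0)
view_features = demo_rng.standard_normal((60, 4, 16))
labels = np.arange(60) % 3

# Assumed aggregation input: concatenate the views of each shape.
stacked = view_features.reshape(60, -1)                  # (60, 64)

# One compact descriptor per shape, then a fast closed-form classifier.
projection = elm_autoencoder(stacked, n_hidden=8, rng=1)
descriptors = stacked @ projection                       # (60, 8)
model = elm_train(descriptors, labels, n_classes=3, n_hidden=32, rng=2)
predictions = elm_predict(model, descriptors)
```

Both stages reduce to a single regularized least-squares solve, which is the usual reason ELM-based components train quickly compared with gradient-based auto-encoders and classifiers.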

Metadata
Title
Multi-View CNN Feature Aggregation with ELM Auto-Encoder for 3D Shape Recognition
Authors
Zhi-Xin Yang
Lulu Tang
Kun Zhang
Pak Kin Wong
Publication date
10.10.2018
Publisher
Springer US
Published in
Cognitive Computation / Issue 6/2018
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI
https://doi.org/10.1007/s12559-018-9598-1
