Published in: International Journal of Computer Vision, Issue 3/2024

05.10.2023

Multi-Modal Meta-Transfer Fusion Network for Few-Shot 3D Model Classification

Authors: He-Yu Zhou, An-An Liu, Chen-Yu Zhang, Ping Zhu, Qian-Yi Zhang, Mohan Kankanhalli

Abstract

Driven by the growing interest in 3D techniques and the resulting large-scale 3D data, 3D model classification has attracted enormous attention from both the research and industry communities. Most current methods depend heavily on sufficient labeled 3D models, which substantially restricts their scalability to novel classes with few annotated training samples, since scarce data increases the chance of overfitting. Besides, they leverage only single-modal information (either point cloud or multi-view information), and few works integrate these complementary sources for 3D model representation. To overcome these problems, we propose a multi-modal meta-transfer fusion network (M\(^{3}\)TF), the key of which is to perform few-shot multi-modal representation for 3D model classification. Specifically, we first convert the original 3D data into both multi-view and point cloud modalities and pre-train individual encoding networks on a large-scale dataset to obtain optimal initial parameters, which benefits few-shot learning tasks. Then, to enable the network to adapt to few-shot learning tasks, we update the parameters of the Scaling and Shifting (SS) operation, the multi-modal representation fusion (MMRF) module, and the 3D model classifier to obtain optimal initialization parameters. Since the large number of trainable parameters in the feature extractors increases the chance of overfitting, we freeze the feature extractors and introduce the SS operation to adjust their weights. SS reduces the number of trainable parameters by up to 20%, which effectively mitigates overfitting. MMRF adaptively integrates the multi-modal information according to its significance to the 3D model, yielding a more robust 3D representation. Since no dataset is available for evaluating this new task, we build three 3D CAD datasets, Meta-ModelNet, Meta-ShapeNet and Meta-RGBD, and implement representative methods for fair comparison. Extensive experimental results demonstrate the superiority of the proposed method.
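To make the two named components concrete, below is a minimal sketch, assuming a PyTorch implementation; the class names `SSConv2d` and `MMRF`, the linear gating function, and the 512-dimensional embeddings are illustrative assumptions of ours, not the authors' released code. The first module shows the SS idea (a frozen pre-trained convolution modulated by lightweight, meta-learned per-channel scale and shift parameters); the second shows significance-weighted fusion of the view and point-cloud features.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SSConv2d(nn.Module):
    """Hypothetical SS wrapper: the pre-trained conv weights stay frozen;
    only per-channel scaling and shifting parameters are trained."""
    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        self.conv = conv
        for p in self.conv.parameters():  # freeze the pre-trained weights
            p.requires_grad = False
        c = conv.out_channels
        self.scale = nn.Parameter(torch.ones(c, 1, 1, 1))  # init to identity
        self.shift = nn.Parameter(torch.zeros(c))

    def forward(self, x):
        # Apply the frozen kernel, modulated by the learned scale/shift.
        weight = self.conv.weight * self.scale
        bias = self.shift if self.conv.bias is None else self.conv.bias + self.shift
        return F.conv2d(x, weight, bias, stride=self.conv.stride,
                        padding=self.conv.padding, dilation=self.conv.dilation,
                        groups=self.conv.groups)

class MMRF(nn.Module):
    """Hypothetical fusion module: predicts a significance weight per
    modality and returns the weighted sum of the two embeddings."""
    def __init__(self, dim: int = 512):
        super().__init__()
        self.gate = nn.Linear(2 * dim, 2)  # one score per modality

    def forward(self, f_view, f_point):
        # Modality weights conditioned on both features (softmax-normalized).
        w = torch.softmax(self.gate(torch.cat([f_view, f_point], dim=-1)), dim=-1)
        return w[:, 0:1] * f_view + w[:, 1:2] * f_point
```

Under this reading, a meta-transfer training step would pass gradients only through the scale/shift parameters, the MMRF gate, and the classifier head, which is what keeps the trainable-parameter count small enough to learn from a handful of labeled 3D models per class.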


Metadata
Title
Multi-Modal Meta-Transfer Fusion Network for Few-Shot 3D Model Classification
Authors
He-Yu Zhou
An-An Liu
Chen-Yu Zhang
Ping Zhu
Qian-Yi Zhang
Mohan Kankanhalli
Publication date
05.10.2023
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 3/2024
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-023-01905-8
