Skip to main content
Top

2016 | OriginalPaper | Chapter

Heat Diffusion Long-Short Term Memory Learning for 3D Shape Analysis

Authors : Fan Zhu, Jin Xie, Yi Fang

Published in: Computer Vision – ECCV 2016

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

The heat kernel is a fundamental solution in mathematical physics to distribution measurement of heat energy within a fixed region over time, and due to its unique property of being invariant to isometric transformations, the heat kernel has been an effective feature descriptor for spectral shape analysis. The majority of prior heat kernel-based strategies of building 3D shape representations fail to investigate the temporal dynamics of heat flows on 3D shape surfaces over time. In this work, we address the temporal dynamics of heat flows on 3D shapes using the long-short term memory (LSTM). We guide 3D shape descriptors toward discriminative representations by feeding heat distributions throughout time as inputs to units of heat diffusion LSTM (HD-LSTM) blocks with a supervised learning structure. We further extend HD-LSTM to a cross-domain structure (CDHD-LSTM) for learning domain-invariant representations of multi-view data. We evaluate the effectiveness of both HD-LSTM and CDHD-LSTM on 3D shape retrieval and sketch-based 3D shape retrieval tasks respectively. Experimental results on McGill dataset and SHREC 2014 dataset suggest that both methods can achieve state-of-the-art performance.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Agathos, A., Pratikakis, I., Papadakis, P., Perantonis, S.J., Azariadis, P.N., Sapidis, N.S.: Retrieval of 3D articulated objects using a graph-based representation. In: 3DOR 2009, pp. 29–36 (2009) Agathos, A., Pratikakis, I., Papadakis, P., Perantonis, S.J., Azariadis, P.N., Sapidis, N.S.: Retrieval of 3D articulated objects using a graph-based representation. In: 3DOR 2009, pp. 29–36 (2009)
2.
go back to reference Belongie, S., Malik, J., Puzicha, J.: Shape context: a new descriptor for shape matching and object recognition. In: Advances in Neural Information Processing Systems, vol. 2, p. 3 (2000) Belongie, S., Malik, J., Puzicha, J.: Shape context: a new descriptor for shape matching and object recognition. In: Advances in Neural Information Processing Systems, vol. 2, p. 3 (2000)
3.
go back to reference Bengio, Y., Boulanger-Lewandowski, N., Pascanu, R.: Advances in optimizing recurrent networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 8624–8628 (2013) Bengio, Y., Boulanger-Lewandowski, N., Pascanu, R.: Advances in optimizing recurrent networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 8624–8628 (2013)
4.
go back to reference Bishop, C.M.: Pattern Recognition and Machine Learning. Information Science and Statistics. Springer, New York (2006)MATH Bishop, C.M.: Pattern Recognition and Machine Learning. Information Science and Statistics. Springer, New York (2006)MATH
5.
go back to reference Bronstein, A.M., Bronstein, M.M., Kimmel, R.: Efficient computation of isometry-invariant distances between surfaces. SIAM J. Sci. Comput. 28(5), 1812–1836 (2006)MathSciNetCrossRefMATH Bronstein, A.M., Bronstein, M.M., Kimmel, R.: Efficient computation of isometry-invariant distances between surfaces. SIAM J. Sci. Comput. 28(5), 1812–1836 (2006)MathSciNetCrossRefMATH
6.
go back to reference Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y.: Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint (2014). arXiv:1406.1078 Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y.: Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint (2014). arXiv:​1406.​1078
7.
go back to reference Darom, T., Keller, Y.: Scale-invariant features for 3-D mesh models. IEEE Trans. Image Process. 21(5), 2758–2769 (2012)MathSciNetCrossRef Darom, T., Keller, Y.: Scale-invariant features for 3-D mesh models. IEEE Trans. Image Process. 21(5), 2758–2769 (2012)MathSciNetCrossRef
8.
go back to reference Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009) Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009)
9.
go back to reference Donahue, J., Anne Hendricks, L., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., Darrell, T.: Long-term recurrent convolutional networks for visual recognition and description. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2625–2634 (2015) Donahue, J., Anne Hendricks, L., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., Darrell, T.: Long-term recurrent convolutional networks for visual recognition and description. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2625–2634 (2015)
10.
go back to reference Eitz, M., Hays, J., Alexa, M.: How do humans sketch objects? ACM Trans. Graph. 31(4), 44 (2012) Eitz, M., Hays, J., Alexa, M.: How do humans sketch objects? ACM Trans. Graph. 31(4), 44 (2012)
11.
go back to reference Gal, R., Shamir, A., Cohen-Or, D.: Pose-oblivious shape signature. IEEE Trans. Vis. Comput. Graph. 13(2), 261–271 (2007)CrossRef Gal, R., Shamir, A., Cohen-Or, D.: Pose-oblivious shape signature. IEEE Trans. Vis. Comput. Graph. 13(2), 261–271 (2007)CrossRef
12.
go back to reference Gers, F.A., Schmidhuber, J., Cummins, F.: Learning to forget: continual prediction with LSTM. Neural Comput. 12(10), 2451–2471 (2000)CrossRef Gers, F.A., Schmidhuber, J., Cummins, F.: Learning to forget: continual prediction with LSTM. Neural Comput. 12(10), 2451–2471 (2000)CrossRef
13.
go back to reference Graves, A., Mohamed, A.r., Hinton, G.: Speech recognition with deep recurrent neural networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 6645–6649 (2013) Graves, A., Mohamed, A.r., Hinton, G.: Speech recognition with deep recurrent neural networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 6645–6649 (2013)
14.
go back to reference Hilaga, M., Shinagawa, Y., Kohmura, T., Kunii, T.L.: Topology matching for fully automatic similarity estimation of 3D shapes. In: Annual Conference on Computer Graphics and Interactive Techniques, pp. 203–212. ACM (2001) Hilaga, M., Shinagawa, Y., Kohmura, T., Kunii, T.L.: Topology matching for fully automatic similarity estimation of 3D shapes. In: Annual Conference on Computer Graphics and Interactive Techniques, pp. 203–212. ACM (2001)
15.
go back to reference Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)CrossRef Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)CrossRef
16.
go back to reference Ji, S., Xu, W., Yang, M., Yu, K.: 3D convolutional neural networks for human action recognition. IEEE Trans. Pattern Anal. Mach. Intell. 35(1), 221–231 (2013)CrossRef Ji, S., Xu, W., Yang, M., Yu, K.: 3D convolutional neural networks for human action recognition. IEEE Trans. Pattern Anal. Mach. Intell. 35(1), 221–231 (2013)CrossRef
17.
go back to reference Jia, X., Gavves, E., Fernando, B., Tuytelaars, T.: Guiding the long-short term memory model for image caption generation. In: IEEE International Conference on Computer Vision, pp. 2407–2415 (2015) Jia, X., Gavves, E., Fernando, B., Tuytelaars, T.: Guiding the long-short term memory model for image caption generation. In: IEEE International Conference on Computer Vision, pp. 2407–2415 (2015)
18.
go back to reference Johnson, A.E.: Spin-images: a representation for 3-D surface matching. Ph.D. thesis, Citeseer (1997) Johnson, A.E.: Spin-images: a representation for 3-D surface matching. Ph.D. thesis, Citeseer (1997)
19.
go back to reference Jolliffe, I.: Principal Component Analysis. Wiley Online Library (2002) Jolliffe, I.: Principal Component Analysis. Wiley Online Library (2002)
20.
go back to reference Karpathy, A., Fei-Fei, L.: Deep visual-semantic alignments for generating image descriptions. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 3128–3137 (2015) Karpathy, A., Fei-Fei, L.: Deep visual-semantic alignments for generating image descriptions. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 3128–3137 (2015)
21.
go back to reference Knopp, J., Prasad, M., Willems, G., Timofte, R., Gool, L.: Hough transform and 3D SURF for robust three dimensional classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6316, pp. 589–602. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15567-3_43 CrossRef Knopp, J., Prasad, M., Willems, G., Timofte, R., Gool, L.: Hough transform and 3D SURF for robust three dimensional classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6316, pp. 589–602. Springer, Heidelberg (2010). doi:10.​1007/​978-3-642-15567-3_​43 CrossRef
22.
go back to reference Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012) Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
23.
go back to reference Lavoué, G.: Combination of bag-of-words descriptors for robust partial shape retrieval. Vis. Comput. 28(9), 931–942 (2012)CrossRef Lavoué, G.: Combination of bag-of-words descriptors for robust partial shape retrieval. Vis. Comput. 28(9), 931–942 (2012)CrossRef
24.
go back to reference Li, B., Lu, Y., Godil, A., Schreck, T., Bustos, B., Ferreira, A., Furuya, T., Fonseca, M.J., Johan, H., Matsuda, T., et al.: A comparison of methods for sketch-based 3D shape retrieval. Comput. Vis. Image Underst. 119, 57–80 (2014)CrossRef Li, B., Lu, Y., Godil, A., Schreck, T., Bustos, B., Ferreira, A., Furuya, T., Fonseca, M.J., Johan, H., Matsuda, T., et al.: A comparison of methods for sketch-based 3D shape retrieval. Comput. Vis. Image Underst. 119, 57–80 (2014)CrossRef
25.
go back to reference Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)CrossRef Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)CrossRef
26.
go back to reference Maturana, D., Scherer, S.: Voxnet: a 3D convolutional neural network for real-time object recognition. In: IEEE International Conference on Intelligent Robots and Systems, pp. 922–928. IEEE (2015) Maturana, D., Scherer, S.: Voxnet: a 3D convolutional neural network for real-time object recognition. In: IEEE International Conference on Intelligent Robots and Systems, pp. 922–928. IEEE (2015)
27.
go back to reference Van den Oord, A., Dieleman, S., Schrauwen, B.: Deep content-based music recommendation. In: Advances in Neural Information Processing Systems, pp. 2643–2651 (2013) Van den Oord, A., Dieleman, S., Schrauwen, B.: Deep content-based music recommendation. In: Advances in Neural Information Processing Systems, pp. 2643–2651 (2013)
28.
go back to reference Rustamov, R.M.: Laplace-Beltrami eigenfunctions for deformation invariant shape representation. In: Eurographics Symposium on Geometry processing, pp. 225–233. Eurographics Association (2007) Rustamov, R.M.: Laplace-Beltrami eigenfunctions for deformation invariant shape representation. In: Eurographics Symposium on Geometry processing, pp. 225–233. Eurographics Association (2007)
29.
go back to reference Sedaghat, N., Zolfaghari, M., Brox, T.: Orientation-boosted voxel nets for 3D object recognition. arXiv preprint (2016). arXiv:1604.03351 Sedaghat, N., Zolfaghari, M., Brox, T.: Orientation-boosted voxel nets for 3D object recognition. arXiv preprint (2016). arXiv:​1604.​03351
30.
go back to reference Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., LeCun, Y.: Overfeat: integrated recognition, localization and detection using convolutional networks. arXiv preprint (2013). arXiv:1312.6229 Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., LeCun, Y.: Overfeat: integrated recognition, localization and detection using convolutional networks. arXiv preprint (2013). arXiv:​1312.​6229
31.
go back to reference Shi, B., Bai, S., Zhou, Z., Bai, X.: DeepPano: deep panoramic representation for 3-D shape recognition. IEEE Sig. Process. Lett. 22(12), 2339–2343 (2015)CrossRef Shi, B., Bai, S., Zhou, Z., Bai, X.: DeepPano: deep panoramic representation for 3-D shape recognition. IEEE Sig. Process. Lett. 22(12), 2339–2343 (2015)CrossRef
32.
go back to reference Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., et al.: Mastering the game of go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)CrossRef Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., et al.: Mastering the game of go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)CrossRef
33.
go back to reference Su, H., Maji, S., Kalogerakis, E., Learned-Miller, E.: Multi-view convolutional neural networks for 3D shape recognition. In: IEEE International Conference on Computer Vision, pp. 945–953 (2015) Su, H., Maji, S., Kalogerakis, E., Learned-Miller, E.: Multi-view convolutional neural networks for 3D shape recognition. In: IEEE International Conference on Computer Vision, pp. 945–953 (2015)
34.
go back to reference Sun, J., Ovsjanikov, M., Guibas, L.: A concise and provably informative multi-scale signature based on heat diffusion. In: Computer Graphics Forum, vol. 28, pp. 1383–1392. Wiley Online Library (2009) Sun, J., Ovsjanikov, M., Guibas, L.: A concise and provably informative multi-scale signature based on heat diffusion. In: Computer Graphics Forum, vol. 28, pp. 1383–1392. Wiley Online Library (2009)
35.
go back to reference Tabia, H., Laga, H., Picard, D., Gosselin, P.H.: Covariance descriptors for 3D shape matching and retrieval. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 4185–4192 (2014) Tabia, H., Laga, H., Picard, D., Gosselin, P.H.: Covariance descriptors for 3D shape matching and retrieval. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 4185–4192 (2014)
36.
go back to reference Wang, F., Kang, L., Li, Y.: Sketch-based 3D shape retrieval using convolutional neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1875–1883 (2015) Wang, F., Kang, L., Li, Y.: Sketch-based 3D shape retrieval using convolutional neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1875–1883 (2015)
37.
go back to reference Werbos, P.J.: Backpropagation through time: what it does and how to do it. Proc. IEEE 78(10), 1550–1560 (1990)CrossRef Werbos, P.J.: Backpropagation through time: what it does and how to do it. Proc. IEEE 78(10), 1550–1560 (1990)CrossRef
38.
go back to reference Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X., Xiao, J.: 3D shapenets: a deep representation for volumetric shapes. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1912–1920 (2015) Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X., Xiao, J.: 3D shapenets: a deep representation for volumetric shapes. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1912–1920 (2015)
39.
go back to reference Xie, J., Fang, Y., Zhu, F., Wong, E.: Deepshape: deep learned shape descriptor for 3D shape matching and retrieval. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1275–1283 (2015) Xie, J., Fang, Y., Zhu, F., Wong, E.: Deepshape: deep learned shape descriptor for 3D shape matching and retrieval. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1275–1283 (2015)
40.
go back to reference Zhang, Y., Shao, M., Wong, E., Fu, Y.: Random faces guided sparse many-to-one encoder for pose-invariant face recognition. In: IEEE International Conference on Computer Vision, pp. 2416–2423 (2013) Zhang, Y., Shao, M., Wong, E., Fu, Y.: Random faces guided sparse many-to-one encoder for pose-invariant face recognition. In: IEEE International Conference on Computer Vision, pp. 2416–2423 (2013)
41.
go back to reference Zhu, F., Xie, J., Fang, Y.: learning cross-domain neural networks for sketch-based 3D shape retrieval. In: AAAI (2016) Zhu, F., Xie, J., Fang, Y.: learning cross-domain neural networks for sketch-based 3D shape retrieval. In: AAAI (2016)
Metadata
Title
Heat Diffusion Long-Short Term Memory Learning for 3D Shape Analysis
Authors
Fan Zhu
Jin Xie
Yi Fang
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-46478-7_19

Premium Partner