2016 | OriginalPaper | Book chapter

Heat Diffusion Long-Short Term Memory Learning for 3D Shape Analysis

Authors: Fan Zhu, Jin Xie, Yi Fang

Published in: Computer Vision – ECCV 2016

Publisher: Springer International Publishing

Abstract

In mathematical physics, the heat kernel is the fundamental solution of the heat equation: it describes how heat energy is distributed within a fixed region over time. Because it is invariant to isometric transformations, the heat kernel has been an effective feature descriptor for spectral shape analysis. However, most prior heat kernel-based strategies for building 3D shape representations do not investigate the temporal dynamics of heat flows on 3D shape surfaces. In this work, we model these temporal dynamics using long-short term memory (LSTM). We guide 3D shape descriptors toward discriminative representations by feeding heat distributions over time as inputs to units of heat diffusion LSTM (HD-LSTM) blocks within a supervised learning structure. We further extend HD-LSTM to a cross-domain structure (CDHD-LSTM) for learning domain-invariant representations of multi-view data. We evaluate HD-LSTM on 3D shape retrieval and CDHD-LSTM on sketch-based 3D shape retrieval. Experimental results on the McGill dataset and the SHREC 2014 dataset suggest that both methods can achieve state-of-the-art performance.
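
To make "heat distributions over time" concrete, here is a minimal sketch, not the authors' implementation. It assumes a toy graph standing in for a mesh, an unweighted graph Laplacian in place of the Laplace-Beltrami operator, and explicit Euler time stepping instead of a spectral heat kernel. The function names `graph_laplacian` and `diffuse` and all parameter values are hypothetical. The resulting list of per-vertex heat vectors is the kind of time-ordered sequence that a recurrent model such as an LSTM could consume.

```python
def graph_laplacian(n, edges):
    """Unweighted graph Laplacian L = D - A as a dense list of lists.

    Assumption: a stand-in for the Laplace-Beltrami operator of a mesh.
    """
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
    return L


def diffuse(L, u0, dt, steps):
    """Integrate du/dt = -L u with explicit Euler steps.

    Returns the heat distribution at every time step, i.e. a sequence
    of length steps + 1 whose entries are per-vertex heat values.
    """
    n = len(u0)
    u = list(u0)
    sequence = [list(u)]
    for _ in range(steps):
        Lu = [sum(L[i][j] * u[j] for j in range(n)) for i in range(n)]
        u = [u[i] - dt * Lu[i] for i in range(n)]
        sequence.append(list(u))
    return sequence


# Toy example (hypothetical values): a 4-vertex cycle with one unit
# of heat placed on vertex 0.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
L = graph_laplacian(4, edges)
sequence = diffuse(L, [1.0, 0.0, 0.0, 0.0], dt=0.1, steps=50)
```

On this toy graph the total heat stays constant while the distribution flattens toward a uniform value, so the early entries of `sequence` carry local information and the late entries carry global information. The step size must stay small relative to the largest Laplacian eigenvalue for explicit Euler to remain stable.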


Metadata
Title
Heat Diffusion Long-Short Term Memory Learning for 3D Shape Analysis
Authors
Fan Zhu
Jin Xie
Yi Fang
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-46478-7_19
