2018 | OriginalPaper | Chapter

Learning SO(3) Equivariant Representations with Spherical CNNs

Authors: Carlos Esteves, Christine Allen-Blanchette, Ameesh Makadia, Kostas Daniilidis

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing

Abstract

We address the problem of 3D rotation equivariance in convolutional neural networks. 3D rotations have been a challenging nuisance in 3D classification tasks, requiring higher capacity and extended data augmentation to tackle them. We model 3D data with multi-valued spherical functions and propose a novel spherical convolutional network that implements exact convolutions on the sphere by realizing them in the spherical harmonic domain. The resulting filters have local symmetry and are localized by enforcing smooth spectra. We apply a novel pooling in the spectral domain, and our operations are independent of the underlying spherical resolution throughout the network. We show that networks with much lower capacity and without data augmentation can exhibit performance comparable to the state of the art in standard retrieval and classification benchmarks.
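
As a rough illustration of what convolving in the spherical harmonic domain can look like, the sketch below assumes a zonal (rotationally symmetric about the pole) filter, for which the convolution theorem of Driscoll and Healy [34] reduces the operation to a per-degree scaling of the coefficients. The coefficient layout, the function names (`zonal_spherical_conv`, `smooth_spectrum`, `spectral_pool`), the linear interpolation of a few anchor values, and the truncation used as pooling are all assumptions made for this example, not details taken from the abstract.

```python
import numpy as np

def zonal_spherical_conv(f_coeffs, h_zonal):
    """Convolve f with a zonal filter h in the spherical harmonic domain.

    f_coeffs: list where f_coeffs[l] is a complex array of shape (2l+1,)
              holding f_l^m for m = -l..l (hypothetical layout).
    h_zonal:  array of shape (bandwidth,), the m = 0 coefficients h_l^0.
    Uses (f * h)_l^m = 2*pi * sqrt(4*pi / (2l+1)) * f_l^m * h_l^0.
    """
    out = []
    for l, f_l in enumerate(f_coeffs):
        scale = 2.0 * np.pi * np.sqrt(4.0 * np.pi / (2 * l + 1))
        out.append(scale * h_zonal[l] * np.asarray(f_l))
    return out

def smooth_spectrum(anchor_values, bandwidth):
    """Assumed parameterization: interpolate a few anchor values linearly
    across degrees 0..bandwidth-1 so that the filter spectrum is smooth."""
    anchor_degrees = np.linspace(0, bandwidth - 1, num=len(anchor_values))
    return np.interp(np.arange(bandwidth), anchor_degrees, anchor_values)

def spectral_pool(coeffs, new_bandwidth):
    """Assumed pooling: low-pass by keeping only degrees below new_bandwidth."""
    return coeffs[:new_bandwidth]

# Toy usage with random coefficients at bandwidth 4.
rng = np.random.default_rng(0)
bandwidth = 4
f = [rng.normal(size=2 * l + 1) + 1j * rng.normal(size=2 * l + 1)
     for l in range(bandwidth)]
h = smooth_spectrum([1.0, 0.0], bandwidth)   # two anchors -> four degrees
g = zonal_spherical_conv(f, h)
pooled = spectral_pool(g, 2)
```

Because every operation here acts on the coefficients degree by degree, none of it depends on the resolution of a spatial sampling grid, and the per-degree scaling commutes with rotations of the input.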

Footnotes
1
The first version of this work was submitted to CVPR on 11/15/2017, shortly after we became aware of Cohen et al.'s [5] ICLR submission on 10/27/2017.
 
Literature
2. Worrall, D.E., Garbin, S.J., Turmukhambetov, D., Brostow, G.J.: Harmonic networks: deep translation and rotation equivariance. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2 (2017)
3. Bruna, J., Szlam, A., LeCun, Y.: Learning stable group invariant representations with convolutional networks (2013). arXiv preprint: arXiv:1301.3537
4. Bronstein, M.M., Bruna, J., LeCun, Y., Szlam, A., Vandergheynst, P.: Geometric deep learning: going beyond Euclidean data. IEEE Signal Process. Mag. 34(4), 18–42 (2017)
5. Cohen, T.S., Geiger, M., Köhler, J., Welling, M.: Spherical CNNs. In: International Conference on Learning Representations (2018)
6. Jaderberg, M., Simonyan, K., Zisserman, A., et al.: Spatial transformer networks. In: Advances in Neural Information Processing Systems, pp. 2017–2025 (2015)
7. Qi, C.R., Su, H., Mo, K., Guibas, L.J.: Pointnet: deep learning on point sets for 3D classification and segmentation. In: Proceedings of the Computer Vision and Pattern Recognition (CVPR), vol. 1(2), p. 4. IEEE (2017)
8. Qi, C.R., Su, H., Nießner, M., Dai, A., Yan, M., Guibas, L.J.: Volumetric and multi-view CNNs for object classification on 3D data. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, pp. 5648–5656, 27–30 June 2016
9. Su, H., Maji, S., Kalogerakis, E., Learned-Miller, E.: Multi-view convolutional neural networks for 3D shape recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 945–953 (2015)
10. Savva, M., et al.: Shrec’17 track: large-scale 3D shape retrieval from shapenet core55. In: 10th Eurographics workshop on 3D Object retrieval, pp. 1–11 (2017)
11. Wu, Z., et al.: 3D shapenets: a deep representation for volumetric shapes. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, pp. 1912–1920, 7–12 June 2015
12. Segman, J., Rubinstein, J., Zeevi, Y.Y.: The canonical coordinates method for pattern deformation: theoretical and computational considerations. IEEE Trans. Pattern Anal. Mach. Intell. 14(12), 1171–1183 (1992)
13. Hel-Or, Y., Teo, P.C.: Canonical decomposition of steerable functions. In: Proceedings of the 1996 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 1996, pp. 809–816. IEEE (1996)
14. Worrall, D.E., Garbin, S.J., Turmukhambetov, D., Brostow, G.J.: Harmonic networks: deep translation and rotation equivariance (2016). arXiv preprint: arXiv:1612.04642
15. Dieleman, S., Willett, K.W., Dambre, J.: Rotation-invariant convolutional neural networks for galaxy morphology prediction. Mon. Not. R. Astron. Soc. 450(2), 1441–1459 (2015)
16. Gens, R., Domingos, P.M.: Deep symmetry networks. In: Advances in Neural Information Processing Systems, pp. 2537–2545 (2014)
17. Zhou, Y., Ye, Q., Qiu, Q., Jiao, J.: Oriented response networks. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017
18. Marcos, D., Volpi, M., Komodakis, N., Tuia, D.: Rotation equivariant vector field networks. CoRR (2016)
19. Lenc, K., Vedaldi, A.: Understanding image representations by measuring their equivariance and equivalence. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 991–999 (2015)
20. Bruna, J., Zaremba, W., Szlam, A., LeCun, Y.: Spectral networks and locally connected networks on graphs (2013). arXiv preprint: arXiv:1312.6203
21. Defferrard, M., Bresson, X., Vandergheynst, P.: Convolutional neural networks on graphs with fast localized spectral filtering. In: Advances in Neural Information Processing Systems, pp. 3844–3852 (2016)
22.
23. Boscaini, D., Masci, J., Rodolà, E., Bronstein, M.: Learning shape correspondence with anisotropic convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 3189–3197 (2016)
24. Masci, J., Boscaini, D., Bronstein, M., Vandergheynst, P.: Geodesic convolutional neural networks on Riemannian manifolds. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 37–45 (2015)
25. Monti, F., Boscaini, D., Masci, J., Rodolà, E., Svoboda, J., Bronstein, M.M.: Geometric deep learning on graphs and manifolds using mixture model CNNs (2016). arXiv preprint: arXiv:1611.08402
27. Kazhdan, M., Funkhouser, T.: Harmonic 3D shape matching. In: ACM SIGGRAPH 2002 Conference Abstracts and Applications, p. 191. ACM (2002)
28. Makadia, A., Daniilidis, K.: Spherical correlation of visual representations for 3D model retrieval. Int. J. Comput. Vis. 89(2), 193–210 (2010)
29. Maturana, D., Scherer, S.: Voxnet: a 3D convolutional neural network for real-time object recognition. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2015, Hamburg, Germany, 28 September–2 October 2015, pp. 922–928 (2015)
30. Kanezaki, A., Matsushita, Y., Nishida, Y.: Rotationnet: joint object categorization and pose estimation using multiviews from unsupervised viewpoints. In: Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
31. Bai, S., Bai, X., Zhou, Z., Zhang, Z., Jan Latecki, L.: Gift: a real-time and scalable 3D shape search engine. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5023–5032 (2016)
32. Thurston, W.P.: Three-Dimensional Geometry and Topology, vol. 1. Princeton University Press, Princeton (1997)
33. Arfken, G.: Mathematical Methods for Physicists, vol. 2. Academic Press, London (1966)
34. Driscoll, J.R., Healy, D.M.: Computing fourier transforms and convolutions on the 2-sphere. Adv. Appl. Math. 15(2), 202–250 (1994)
35. Healy, D.M., Rockmore, D.N., Kostelec, P.J., Moore, S.: Ffts for the 2-sphere-improvements and variations. J. Fourier Anal. Appl. 9(4), 341–385 (2003)
36. Bruna, J., Zaremba, W., Szlam, A., LeCun, Y.: Spectral networks and locally connected networks on graphs. CoRR (2013)
37. Rippel, O., Snoek, J., Adams, R.P.: Spectral representations for convolutional neural networks. CoRR (2015)
38. Qi, C.R., Yi, L., Su, H., Guibas, L.J.: Pointnet++: deep hierarchical feature learning on point sets in a metric space. In: Advances in Neural Information Processing Systems, pp. 5105–5114 (2017)
39. Chang, A.X., et al.: Shapenet: An information-rich 3D model repository. CoRR (2015)
40. Schroff, F., Kalenichenko, D., Philbin, J.: Facenet: a unified embedding for face recognition and clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 815–823 (2015)
41. Furuya, T., Ohbuchi, R.: Deep aggregation of local 3D geometric features for 3D model retrieval. In: BMVC (2016)
42. Tatsuma, A., Aono, M.: Multi-fourier spectra descriptor and augmentation with spectral clustering for 3D shape retrieval. Vis. Comput. 25(8), 785–804 (2009)
Metadata
Title: Learning SO(3) Equivariant Representations with Spherical CNNs
Authors: Carlos Esteves, Christine Allen-Blanchette, Ameesh Makadia, Kostas Daniilidis
Copyright Year: 2018
DOI: https://doi.org/10.1007/978-3-030-01261-8_4