Correspondence Distillation from NeRF-Based GAN

Authors: Yushi Lan, Chen Change Loy, Bo Dai

Published in: International Journal of Computer Vision, Issue 3/2024 | 25-09-2023

Abstract

The neural radiance field (NeRF) has shown promising results in preserving the fine details of objects and scenes. However, unlike explicit shape representations such as meshes, it remains an open problem to build dense correspondences across different NeRFs of the same category, which is essential for many downstream tasks. The main difficulties of this problem lie in the implicit nature of NeRF and the lack of ground-truth correspondence annotations. In this paper, we show that it is possible to bypass these challenges by leveraging the rich semantic and structural priors encapsulated in a pre-trained NeRF-based GAN. Specifically, we exploit such priors from three aspects, namely (1) a dual deformation field that takes latent codes as global structural indicators, (2) a learning objective that regards generator features as geometry-aware local descriptors, and (3) a source of infinite object-specific NeRF samples. Our experiments demonstrate that such priors lead to 3D dense correspondences that are accurate, smooth, and robust. We also show that the established dense correspondences across NeRFs can effectively enable many NeRF-based downstream applications, such as texture transfer.
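To make the first of these priors concrete, the sketch below shows one way a dual deformation field conditioned on latent codes could be realized. This is a minimal, hypothetical PyTorch example, not the authors' implementation: the module names (DeformationMLP, DualDeformationField), layer sizes, and latent dimension are illustrative assumptions. The idea it captures is that two small MLPs map points between each instance's NeRF space and a shared template space, with the GAN latent code acting as the global structural indicator, so a point in one object can be routed through the template to its correspondence in another.

# Hypothetical sketch (not the authors' code): a dual deformation field that
# maps 3D points between an instance's NeRF space and a shared template space,
# conditioned on that instance's GAN latent code.
import torch
import torch.nn as nn

class DeformationMLP(nn.Module):
    """Predicts a 3D offset for each query point, conditioned on a latent code."""
    def __init__(self, latent_dim: int = 256, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, x: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # x: (N, 3) query points; z: (latent_dim,) latent code of one instance.
        z_rep = z.expand(x.shape[0], -1)
        return x + self.net(torch.cat([x, z_rep], dim=-1))

class DualDeformationField(nn.Module):
    """Forward field: instance -> template; backward field: template -> instance."""
    def __init__(self, latent_dim: int = 256):
        super().__init__()
        self.to_template = DeformationMLP(latent_dim)
        self.from_template = DeformationMLP(latent_dim)

    def correspond(self, x_a: torch.Tensor, z_a: torch.Tensor, z_b: torch.Tensor) -> torch.Tensor:
        # Route points from instance A through the shared template to instance B.
        t = self.to_template(x_a, z_a)
        return self.from_template(t, z_b)

# Usage (untrained weights, so outputs are not yet meaningful correspondences):
field = DualDeformationField()
x_a = torch.rand(1024, 3)                      # query points in A's NeRF space
z_a, z_b = torch.randn(256), torch.randn(256)  # GAN latent codes for A and B
x_b = field.correspond(x_a, z_a, z_b)          # (1024, 3) candidate points in B

In the full method, such a field would be trained with the remaining two priors, i.e., supervised by the geometry-aware generator features and by the unlimited object-specific NeRF samples drawn from the pre-trained GAN; those loss terms are omitted from this sketch.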


Metadata

Title: Correspondence Distillation from NeRF-Based GAN
Authors: Yushi Lan, Chen Change Loy, Bo Dai
Publication date: 25-09-2023
Publisher: Springer US
Published in: International Journal of Computer Vision, Issue 3/2024
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-023-01903-w
