2016 | OriginalPaper | Book chapter

How Useful Is Photo-Realistic Rendering for Visual Learning?

Written by: Yair Movshovitz-Attias, Takeo Kanade, Yaser Sheikh

Published in: Computer Vision – ECCV 2016 Workshops

Publisher: Springer International Publishing

Abstract

Data seems cheap to get, and in many ways it is, but the process of creating a high-quality labeled dataset from a mass of data is time-consuming and expensive.

With the advent of rich 3D repositories, photo-realistic rendering systems offer the opportunity to provide nearly limitless data. Yet, their primary value for visual learning may be the quality of the data they can provide rather than the quantity. Rendering engines offer the promise of perfect labels in addition to the data: what the precise camera pose is; what the precise lighting location, temperature, and distribution is; what the geometry of the object is.

In this work we focus on semi-automating dataset creation through the use of synthetic data and apply this method to an important task – object viewpoint estimation. Using state-of-the-art rendering software we generate a large labeled dataset of cars rendered densely in viewpoint space. We investigate the effect of rendering parameters on estimation performance and show that realism is important. We show that generalizing from synthetic data is not harder than the domain adaptation required between two real-image datasets and that combining synthetic images with a small amount of real data improves estimation accuracy.
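
As an illustration of why rendered data comes with exact viewpoint labels, the following is a minimal sketch of sampling camera poses densely on a sphere around an object. It is not the authors' pipeline: the function name `sample_viewpoints`, the azimuth step, the elevation values, and the camera radius are all hypothetical choices made for this example. Each pose would be handed to a renderer, and the angles used to place the camera serve directly as the ground-truth label.

```python
import math
from itertools import product


def sample_viewpoints(azimuth_step_deg=1, elevations_deg=(0, 10, 20), radius=5.0):
    """Return (label, camera_position) pairs on a sphere centred on the object.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    views = []
    for az, el in product(range(0, 360, azimuth_step_deg), elevations_deg):
        az_r, el_r = math.radians(az), math.radians(el)
        # Spherical-to-Cartesian conversion; the object sits at the origin.
        position = (
            radius * math.cos(el_r) * math.cos(az_r),
            radius * math.cos(el_r) * math.sin(az_r),
            radius * math.sin(el_r),
        )
        # The label is known exactly because we chose the camera pose ourselves.
        views.append(({"azimuth": az, "elevation": el}, position))
    return views


views = sample_viewpoints(azimuth_step_deg=5)
print(len(views), "labeled viewpoints")
```

A finer `azimuth_step_deg` gives denser coverage of viewpoint space at the cost of more renders; no manual annotation is needed at any density.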

Title: How Useful Is Photo-Realistic Rendering for Visual Learning?
Written by: Yair Movshovitz-Attias, Takeo Kanade, Yaser Sheikh
Publisher: Springer International Publishing
Book: Computer Vision – ECCV 2016 Workshops
Print ISBN: 978-3-319-49408-1
Electronic ISBN: 978-3-319-49409-8
Copyright year: 2016
DOI: https://doi.org/10.1007/978-3-319-49409-8_18
