
2017 | Original Paper | Book Chapter

Towards Visual Training Set Generation Framework

Authors: Jan Hůla, Irina Perfilieva, Ali Ahsan Muhummad Muzaheed

Published in: Advances in Computational Intelligence

Publisher: Springer International Publishing


Abstract

The performance of trained computer vision algorithms depends largely on the amount of data on which they are trained. Creating large labeled datasets is very expensive, so many researchers use synthetically generated images with automatic annotations. For this purpose we have created a general framework that allows researchers to generate a practically unlimited number of images from a set of 3D models, textures, and material settings. We leverage the Voxel Cone Tracing technology implemented by NVIDIA to render photorealistic images in real time without any precomputation. We built this framework with two use cases in mind: (i) real-world applications, where a database of synthetically generated images can compensate for small or nonexistent datasets, and (ii) empirical testing of theoretical ideas by creating training sets with a known inner structure.
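The core idea of the abstract, sampling combinations of models, textures, and material settings and letting the scene description double as the annotation, can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: the asset lists and the `render` stand-in (which a real system would replace with the Voxel Cone Tracing renderer) are assumptions introduced here for clarity.

```python
import json
import random

# Hypothetical asset pools; a real framework would load 3D models,
# texture files, and material presets from disk.
MODELS = ["chair", "table", "lamp"]
TEXTURES = ["wood", "metal", "fabric"]
MATERIALS = [{"roughness": r} for r in (0.1, 0.5, 0.9)]

def render(scene):
    # Placeholder for the GPU renderer: a real implementation would
    # return pixel data; here we return an identifier for the image.
    return f"image_of_{scene['model']}_{scene['texture']}"

def generate(n_samples, seed=0):
    """Sample n_samples scene configurations and render each one.

    The annotation comes for free: it is the scene description
    used to produce the image.
    """
    rng = random.Random(seed)  # seeded for reproducible datasets
    samples = []
    for _ in range(n_samples):
        scene = {
            "model": rng.choice(MODELS),
            "texture": rng.choice(TEXTURES),
            "material": rng.choice(MATERIALS),
            "camera_azimuth": rng.uniform(0.0, 360.0),
        }
        samples.append({"image": render(scene), "label": scene})
    return samples

if __name__ == "__main__":
    dataset = generate(5)
    print(json.dumps(dataset[0], indent=2))
```

Because the generator is seeded, the same seed reproduces the same training set, which supports the second use case: constructing datasets whose inner structure is fully known and repeatable.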


Metadata
Title
Towards Visual Training Set Generation Framework
Authors
Jan Hůla
Irina Perfilieva
Ali Ahsan Muhummad Muzaheed
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-59147-6_63