Published in: Artificial Intelligence Review, Issue 3/2017

18 July 2017

Parallel vision for perception and understanding of complex scenes: methods, framework, and perspectives

Authors: Kunfeng Wang, Chao Gou, Nanning Zheng, James M. Rehg, Fei-Yue Wang

Abstract

In the study of image and vision computing, the generalization capability of an algorithm often determines whether it works well in complex scenes. The goal of this review article is to survey the use of photorealistic image synthesis methods in addressing the problems of visual perception and understanding. Currently, the ACP Methodology, comprising artificial systems, computational experiments, and parallel execution, plays an essential role in the modeling and control of complex systems. This paper extends the ACP Methodology into the computer vision field by proposing the concept and basic framework of Parallel Vision. We first review previous work related to Parallel Vision in terms of synthetic data generation and utilization, and detail the utility of synthetic data for feature analysis, object analysis, scene analysis, and other analyses. We then propose the basic framework of Parallel Vision, which is composed of an ACP trilogy (artificial scenes, computational experiments, and parallel execution). We also present some in-depth thoughts and perspectives on Parallel Vision. This paper emphasizes the significance of synthetic data to vision system design and suggests a novel research methodology for perception and understanding of complex scenes.

Metadata
Title: Parallel vision for perception and understanding of complex scenes: methods, framework, and perspectives
Authors: Kunfeng Wang, Chao Gou, Nanning Zheng, James M. Rehg, Fei-Yue Wang
Publication date: 18 July 2017
Publisher: Springer Netherlands
Published in: Artificial Intelligence Review, Issue 3/2017
Print ISSN: 0269-2821
Electronic ISSN: 1573-7462
DOI: https://doi.org/10.1007/s10462-017-9569-z