
2017 | Original Paper | Book Chapter

Bio-Inspired Architecture for Deriving 3D Models from Video Sequences

Authors: Julius Schöning, Gunther Heidemann

Published in: Computer Vision – ACCV 2016 Workshops

Publisher: Springer International Publishing

Abstract

In an everyday context, automatic or interactive 3D reconstruction of objects from one or several videos is not yet possible. Humans, in contrast, are capable of recognizing the 3D shape of objects even in complex video sequences. To enable machines to do the same, we propose a bio-inspired processing architecture, motivated by the human visual system, that converts video data into 3D representations. Analogous to the hierarchy of the ventral stream, our process reduces the influence of position information in the video sequences through object recognition and represents the object of interest as multiple pictorial representations. These pictorial representations show 2D projections of the object of interest from different perspectives, so a 3D point cloud can be obtained by multiple view geometry algorithms. In the course of a detailed presentation of this architecture, we also highlight existing analogies to the view-combination scheme. The potential of our architecture is demonstrated by reconstructing a car from two video sequences. Where automatic processing cannot complete the task, the user is put in the loop to solve the problem interactively. This human-machine interaction enables a prototype implementation of the architecture that can reconstruct 3D objects from one or several videos. In conclusion, the strengths and limitations of our approach are discussed, followed by an outlook on future work to improve the architecture.
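The core multiple-view-geometry step the abstract relies on — recovering a 3D point from its 2D projections in two views — can be sketched with a minimal linear (DLT) triangulation. This is an illustrative sketch only, not the paper's implementation; the camera matrices and the 3D point below are hypothetical values chosen so the result can be checked in closed form:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point observed in two views.

    P1, P2: 3x4 camera projection matrices.
    x1, x2: 2D image coordinates of the same point in each view.
    """
    # Each image observation contributes two linear constraints on the
    # homogeneous 3D point X; stack them and solve A @ X = 0 via SVD.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]              # null-space vector = homogeneous solution
    return X[:3] / X[3]     # de-homogenize

def project(P, X):
    """Project a 3D point with camera P into 2D image coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Two hypothetical cameras: identity intrinsics, second camera
# translated one unit along the x-axis (a simple stereo baseline).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.5, 0.2, 4.0])          # hypothetical scene point
x1, x2 = project(P1, X_true), project(P2, X_true)

X_est = triangulate(P1, P2, x1, x2)
print(np.allclose(X_est, X_true))           # → True
```

In the architecture described above, the 2D inputs would come from the pictorial representations of the object of interest rather than from synthetic projections, and many such points together form the resulting point cloud.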


Metadata
Title: Bio-Inspired Architecture for Deriving 3D Models from Video Sequences
Authors: Julius Schöning, Gunther Heidemann
Copyright year: 2017
DOI: https://doi.org/10.1007/978-3-319-54427-4_5