2023 | OriginalPaper | Chapter

3. Principles of Object Tracking and Mapping

Authors: Jason Rambach, Alain Pagani, Didier Stricker

Published in: Springer Handbook of Augmented Reality

Publisher: Springer International Publishing

Abstract

Tracking is the main enabling technology for Augmented Reality (AR), as it allows realistic placement of virtual content in the real world. In this chapter, we discuss the most important aspects of tracking for AR and review existing systems that have shaped the field over the past years. First, we provide a notation for describing 6 Degrees of Freedom (6DoF) poses and camera models. We then describe fundamental computer vision techniques that tracking systems frequently use, such as feature matching and tracking, and pose estimation. We divide tracking approaches into model-based approaches and Simultaneous Localization and Mapping (SLAM) approaches. Model-based approaches use a synthetic representation of an object as a template to match against the real object. This matching can use texture or lines as tracking features to establish correspondences between the model and the image; machine learning approaches that estimate the pose of an object directly from an input image have also been introduced recently. An emerging challenge is the extension of AR tracking systems from rigid objects to articulated and nonrigid objects. SLAM tracking systems do not require a reference model, as they track and map their environment simultaneously. We discuss keypoint-based, direct, and semi-direct approaches to purely visual SLAM. Next, we analyze the use of additional sensing that can support tracking, such as visual-inertial sensor fusion and depth sensing. Finally, we look at the use of machine learning techniques, especially deep neural networks, in conjunction with traditional computer vision approaches for SLAM.
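As a rough illustration of what a 6DoF pose and a camera model mean in practice, the sketch below transforms a 3D object point into camera coordinates and projects it to a pixel with a pinhole model. It is not the chapter's notation: the function names, the restriction to a rotation about the z-axis, and all numeric values (focal lengths fx, fy, principal point cx, cy, pose, point) are assumptions chosen for this example, and lens distortion is ignored.

```python
import math

def rotation_z(angle_rad):
    """Return a 3x3 rotation matrix for a rotation about the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform(R, t, p):
    """Map a 3D point p from object to camera coordinates: R * p + t."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def project(p_cam, fx, fy, cx, cy):
    """Project a camera-frame point to pixel coordinates (pinhole model)."""
    x, y, z = p_cam
    if z <= 0:
        raise ValueError("point is not in front of the camera")
    return (fx * x / z + cx, fy * y / z + cy)

# Hypothetical values, for illustration only.
R = rotation_z(math.pi / 2)   # rotational part of the pose (only yaw used here)
t = [0.0, 0.0, 2.0]           # translational part of the pose, in metres
p_obj = [0.1, 0.0, 0.0]       # a point on the object, in object coordinates

u, v = project(transform(R, t, p_obj), fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(u, v)
```

With these assumed values, the point lands 25 pixels below the principal point, at roughly (320, 265). A general pose would use a full 3D rotation (three rotational degrees of freedom) rather than the single-axis rotation used here.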


84.
go back to reference Newcombe, R.A., Lovegrove, S.J., Davison, A.J.: DTAM: Dense tracking and mapping in real-time. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2320–2327. IEEE, New York (2011b) Newcombe, R.A., Lovegrove, S.J., Davison, A.J.: DTAM: Dense tracking and mapping in real-time. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2320–2327. IEEE, New York (2011b)
85.
go back to reference Nistér, D.: An efficient solution to the five-point relative pose problem. IEEE Trans. Pattern Anal. Mach. Intell. 26(6), 0756–777 (2004) Nistér, D.: An efficient solution to the five-point relative pose problem. IEEE Trans. Pattern Anal. Mach. Intell. 26(6), 0756–777 (2004)
86.
go back to reference Nistér, D.: Preemptive RANSAC for live structure and motion estimation. Mach. Vision Appl. 16(5), 321–329 (2005) Nistér, D.: Preemptive RANSAC for live structure and motion estimation. Mach. Vision Appl. 16(5), 321–329 (2005)
87.
go back to reference Oth, L., Furgale, P., Kneip, L., Siegwart, R.: Rolling shutter camera calibration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1360–1367. IEEE, New York (2013) Oth, L., Furgale, P., Kneip, L., Siegwart, R.: Rolling shutter camera calibration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1360–1367. IEEE, New York (2013)
88.
go back to reference Pagani, A.: Reality Models for efficient registration in Augmented Reality. Verlag Dr. Hut, Germany (2014) Pagani, A.: Reality Models for efficient registration in Augmented Reality. Verlag Dr. Hut, Germany (2014)
89.
go back to reference Park, Y., Lepetit, V., Woo, W.: Multiple 3D object tracking for augmented reality. In: Proceedings of the IEEE/ACM International Symposium on Mixed and Augmented Reality, pp. 117–120. IEEE, New York (2008) Park, Y., Lepetit, V., Woo, W.: Multiple 3D object tracking for augmented reality. In: Proceedings of the IEEE/ACM International Symposium on Mixed and Augmented Reality, pp. 117–120. IEEE, New York (2008)
90.
go back to reference Paulus, C.J., Haouchine, N., Cazier, D., Cotin, S.: Augmented reality during cutting and tearing of deformable objects. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, pp. 54–59. IEEE, New York (2015) Paulus, C.J., Haouchine, N., Cazier, D., Cotin, S.: Augmented reality during cutting and tearing of deformable objects. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, pp. 54–59. IEEE, New York (2015)
91.
go back to reference Poultney, C., Chopra, S., Cun, Y.L., et al.: Efficient learning of sparse representations with an energy-based model. In: Proceedings of the Conference on Neural Information Processing Systems, pp. 1137–1144 (2007) Poultney, C., Chopra, S., Cun, Y.L., et al.: Efficient learning of sparse representations with an energy-based model. In: Proceedings of the Conference on Neural Information Processing Systems, pp. 1137–1144 (2007)
92.
go back to reference Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical recipes 3rd edition: The art of scientific computing. Cambridge University, Cambridge (2007)MATH Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical recipes 3rd edition: The art of scientific computing. Cambridge University, Cambridge (2007)MATH
93.
go back to reference Prisacariu, V.A., Reid, I.D.: PWP3D: Real-time segmentation and tracking of 3D objects. Int. J. Comput. Vis. 98(3), 335–354 (2012) Prisacariu, V.A., Reid, I.D.: PWP3D: Real-time segmentation and tracking of 3D objects. Int. J. Comput. Vis. 98(3), 335–354 (2012)
94.
go back to reference Puerto-Souza, G.A., Mariottini, G.L.: A fast and accurate feature-matching algorithm for minimally-invasive endoscopic images. IEEE Trans. Med. Imaging 32(7), 1201–1214 (2013) Puerto-Souza, G.A., Mariottini, G.L.: A fast and accurate feature-matching algorithm for minimally-invasive endoscopic images. IEEE Trans. Med. Imaging 32(7), 1201–1214 (2013)
95.
go back to reference Qin, T., Li, P., Shen, S.: Vins-mono: A robust and versatile monocular visual-inertial state estimator. IEEE Trans. Rob. 34(4), 1004–1020 (2018) Qin, T., Li, P., Shen, S.: Vins-mono: A robust and versatile monocular visual-inertial state estimator. IEEE Trans. Rob. 34(4), 1004–1020 (2018)
96.
go back to reference Rad, M., Lepetit, V.: BB8: A scalable, accurate, robust to partial occlusion method for predicting the 3D poses of challenging objects without using depth. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3848–3856. IEEE, New York (2017) Rad, M., Lepetit, V.: BB8: A scalable, accurate, robust to partial occlusion method for predicting the 3D poses of challenging objects without using depth. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3848–3856. IEEE, New York (2017)
97.
go back to reference Radkowski, R., Herrema, J., Oliver, J.: Augmented reality-based manual assembly support with visual features for different degrees of difficulty. Int. J. Hum.-Comput. Interact. 31(5), 337–349 (2015) Radkowski, R., Herrema, J., Oliver, J.: Augmented reality-based manual assembly support with visual features for different degrees of difficulty. Int. J. Hum.-Comput. Interact. 31(5), 337–349 (2015)
98.
go back to reference Rambach, J., Tewari, A., Pagani, A., Stricker, D.: Learning to fuse: A deep learning approach to visual-inertial camera pose estimation. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, pp. 71–76. IEEE, New York (2016) Rambach, J., Tewari, A., Pagani, A., Stricker, D.: Learning to fuse: A deep learning approach to visual-inertial camera pose estimation. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, pp. 71–76. IEEE, New York (2016)
99.
go back to reference Rambach, J., Pagani, A., Stricker, D.: Augmented things: enhancing AR applications leveraging the internet of things and universal 3D object tracking. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, pp. 103–108. IEEE, New York (2017) Rambach, J., Pagani, A., Stricker, D.: Augmented things: enhancing AR applications leveraging the internet of things and universal 3D object tracking. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, pp. 103–108. IEEE, New York (2017)
100.
go back to reference Rambach, J., Deng, C., Pagani, A., Stricker, D.: Learning 6DoF object poses from synthetic single channel images. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality. IEEE, New York (2018) Rambach, J., Deng, C., Pagani, A., Stricker, D.: Learning 6DoF object poses from synthetic single channel images. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality. IEEE, New York (2018)
101.
go back to reference Rambach, J., Lesur, P., Pagani, A., Stricker, D.: SlamCraft: Dense planar RGB monocular SLAM. In: Proceedings of the International Conference on Machine Vision Applications. Springer, Berlin (2019) Rambach, J., Lesur, P., Pagani, A., Stricker, D.: SlamCraft: Dense planar RGB monocular SLAM. In: Proceedings of the International Conference on Machine Vision Applications. Springer, Berlin (2019)
102.
go back to reference Reina, S.C., Solin, A., Kannala, J.: Robust gyroscope-aided camera self-calibration. In: Proceedings of the International Conference on Information Fusion, pp. 772–779. IEEE, New York (2018) Reina, S.C., Solin, A., Kannala, J.: Robust gyroscope-aided camera self-calibration. In: Proceedings of the International Conference on Information Fusion, pp. 772–779. IEEE, New York (2018)
103.
go back to reference Renaudin, V., Afzal, M.H., Lachapelle, G.: Complete triaxis magnetometer calibration in the magnetic domain. J. Sens. (2010) Renaudin, V., Afzal, M.H., Lachapelle, G.: Complete triaxis magnetometer calibration in the magnetic domain. J. Sens. (2010)
104.
go back to reference Ricolfe-Viala, C., Sanchez-Salmeron, A.J.: Lens distortion models evaluation. Appl. Opt. 49(30), 5914–5928 (2010) Ricolfe-Viala, C., Sanchez-Salmeron, A.J.: Lens distortion models evaluation. Appl. Opt. 49(30), 5914–5928 (2010)
105.
go back to reference Rosten, E., Drummond, T.: Fusing points and lines for high performance tracking. In: Proceedings of the IEEE International Conference on Computer Vision, vol. 2, pp. 1508–1515. IEEE, New York (2005) Rosten, E., Drummond, T.: Fusing points and lines for high performance tracking. In: Proceedings of the IEEE International Conference on Computer Vision, vol. 2, pp. 1508–1515. IEEE, New York (2005)
106.
go back to reference Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: An efficient alternative to SIFT or SURF. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2564–2571. IEEE, New York (2011) Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: An efficient alternative to SIFT or SURF. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2564–2571. IEEE, New York (2011)
107.
go back to reference Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323(6088), 533 (1986)MATH Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323(6088), 533 (1986)MATH
108.
go back to reference Runz, M., Buffier, M., Agapito, L.: Maskfusion: Real-time recognition, tracking and reconstruction of multiple moving objects. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, pp. 10–20. IEEE, New York (2018) Runz, M., Buffier, M., Agapito, L.: Maskfusion: Real-time recognition, tracking and reconstruction of multiple moving objects. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, pp. 10–20. IEEE, New York (2018)
109.
go back to reference Rusinkiewicz, S., Levoy, M.: Efficient variants of the ICP algorithm. In: Proceedings of the International Conference on 3-D Digital Imaging and Modeling, vol. 1, pp. 145–152 (2001) Rusinkiewicz, S., Levoy, M.: Efficient variants of the ICP algorithm. In: Proceedings of the International Conference on 3-D Digital Imaging and Modeling, vol. 1, pp. 145–152 (2001)
110.
go back to reference Salas-Moreno, R., Newcombe, R., Strasdat, H., Kelly, P., Davison, A.: Slam++: Simultaneous localisation and mapping at the level of objects. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1352–1359. IEEE, New York (2013) Salas-Moreno, R., Newcombe, R., Strasdat, H., Kelly, P., Davison, A.: Slam++: Simultaneous localisation and mapping at the level of objects. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1352–1359. IEEE, New York (2013)
111.
go back to reference Salas-Moreno, R., Glocken, B., Kelly, P., Davison, A.: Dense planar SLAM. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, pp. 157–164. IEEE, New York (2014) Salas-Moreno, R., Glocken, B., Kelly, P., Davison, A.: Dense planar SLAM. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, pp. 157–164. IEEE, New York (2014)
112.
go back to reference Sansoni, G., Trebeschi, M., Docchio, F.: State-of-the-art and applications of 3D imaging sensors in industry, cultural heritage, medicine, and criminal investigation. Sensors 9(1), 568–601 (2009) Sansoni, G., Trebeschi, M., Docchio, F.: State-of-the-art and applications of 3D imaging sensors in industry, cultural heritage, medicine, and criminal investigation. Sensors 9(1), 568–601 (2009)
113.
go back to reference Seo, B.K., Wuest, H.: A direct method for robust model-based 3D object tracking from a monocular RGB image. In: Proceedings of the European Conference on Computer Vision, pp. 551–562. Springer, Berlin (2016) Seo, B.K., Wuest, H.: A direct method for robust model-based 3D object tracking from a monocular RGB image. In: Proceedings of the European Conference on Computer Vision, pp. 551–562. Springer, Berlin (2016)
114.
go back to reference Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:14091556 (2014) Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:14091556 (2014)
115.
go back to reference Su, Y., Rambach, J., Minaskan, N., Lesur, P., Pagani, A., Stricker, D.: Deep multi-state object pose estimation for augmented reality assembly. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality. IEEE, New York (2019) Su, Y., Rambach, J., Minaskan, N., Lesur, P., Pagani, A., Stricker, D.: Deep multi-state object pose estimation for augmented reality assembly. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality. IEEE, New York (2019)
116.
go back to reference Subbarao, R., Meer, P.: Beyond RANSAC: user independent robust regression. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop, pp. 101–101. IEEE, New York (2006) Subbarao, R., Meer, P.: Beyond RANSAC: user independent robust regression. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop, pp. 101–101. IEEE, New York (2006)
117.
go back to reference Sun, B., Saenko, K.: Deep coral: Correlation alignment for deep domain adaptation. In: Proceedings of the European Conference on Computer Vision, pp. 443–450. Springer, Berlin (2016) Sun, B., Saenko, K.: Deep coral: Correlation alignment for deep domain adaptation. In: Proceedings of the European Conference on Computer Vision, pp. 443–450. Springer, Berlin (2016)
118.
go back to reference Sundermeyer, M., Marton, Z.C., Durner, M., Brucker, M., Triebel, R.: Implicit 3D orientation learning for 6D object detection from RGB images. In: Proceedings of the European Conference on Computer Vision, pp. 699–715. Springer, Berlin (2018) Sundermeyer, M., Marton, Z.C., Durner, M., Brucker, M., Triebel, R.: Implicit 3D orientation learning for 6D object detection from RGB images. In: Proceedings of the European Conference on Computer Vision, pp. 699–715. Springer, Berlin (2018)
119.
go back to reference Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9. IEEE, New York (2015) Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9. IEEE, New York (2015)
120.
go back to reference Tan, D.J., Navab, N., Tombari, F.: Looking beyond the simple scenarios: Combining learners and optimizers in 3D temporal tracking. IEEE Trans. Visual Comput. Graphics 23(11), 2399–2409 (2017) Tan, D.J., Navab, N., Tombari, F.: Looking beyond the simple scenarios: Combining learners and optimizers in 3D temporal tracking. IEEE Trans. Visual Comput. Graphics 23(11), 2399–2409 (2017)
121.
go back to reference Tateno, K., Tombari, F., Laina, I., Navab, N.: CNN-SLAM: Real-time dense monocular SLAM with learned depth prediction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2. IEEE, New York (2017) Tateno, K., Tombari, F., Laina, I., Navab, N.: CNN-SLAM: Real-time dense monocular SLAM with learned depth prediction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2. IEEE, New York (2017)
122.
go back to reference Tekin, B., Sinha, S.N., Fua, P.: Real-time seamless single shot 6D object pose prediction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 292–301. IEEE, New York (2018) Tekin, B., Sinha, S.N., Fua, P.: Real-time seamless single shot 6D object pose prediction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 292–301. IEEE, New York (2018)
123.
go back to reference Titterton, D., Weston, J.L., Weston, J.: Strapdown inertial navigation technology, vol. 17. IET, United Kingdom (2004) Titterton, D., Weston, J.L., Weston, J.: Strapdown inertial navigation technology, vol. 17. IET, United Kingdom (2004)
124.
go back to reference Triggs, B., McLauchlan, P.F., Hartley, R.I., Fitzgibbon, A.W.: Bundle adjustment—a modern synthesis. In: Proceedings of the International Workshop on Vision Algorithms, pp. 298–372. Springer, Berlin (1999) Triggs, B., McLauchlan, P.F., Hartley, R.I., Fitzgibbon, A.W.: Bundle adjustment—a modern synthesis. In: Proceedings of the International Workshop on Vision Algorithms, pp. 298–372. Springer, Berlin (1999)
125.
go back to reference Vacchetti, L., Lepetit, V., Fua, P.: Combining edge and texture information for real-time accurate 3D camera tracking. In: Proceedings of the IEEE/ACM International Symposium on Mixed and Augmented Reality, pp. 48–57. IEEE, New York (2004a) Vacchetti, L., Lepetit, V., Fua, P.: Combining edge and texture information for real-time accurate 3D camera tracking. In: Proceedings of the IEEE/ACM International Symposium on Mixed and Augmented Reality, pp. 48–57. IEEE, New York (2004a)
126.
go back to reference Vacchetti, L., Lepetit, V., Fua, P.: Stable real-time 3D tracking using online and offline information. IEEE Trans. Pattern Anal. Mach. Intell. 26(10), 1385–1391 (2004b) Vacchetti, L., Lepetit, V., Fua, P.: Stable real-time 3D tracking using online and offline information. IEEE Trans. Pattern Anal. Mach. Intell. 26(10), 1385–1391 (2004b)
127.
go back to reference Wagner, D., Schmalstieg, D.: Artoolkitplus for pose tracking on mobile devices. In: Proceedings of 12th Computer Vision Winter Workshop, 139–146 (2007) Wagner, D., Schmalstieg, D.: Artoolkitplus for pose tracking on mobile devices. In: Proceedings of 12th Computer Vision Winter Workshop, 139–146 (2007)
128.
go back to reference Wang, S., Clark, R., Wen, H., Trigoni, N.: Deepvo: Towards end-to-end visual odometry with deep recurrent convolutional neural networks. In: Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2043–2050. IEEE, New York (2017) Wang, S., Clark, R., Wen, H., Trigoni, N.: Deepvo: Towards end-to-end visual odometry with deep recurrent convolutional neural networks. In: Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2043–2050. IEEE, New York (2017)
129.
go back to reference Wasenmüller, O., Stricker, D.: Comparison of kinect v1 and v2 depth images in terms of accuracy and precision. In: Proceedings of the Asian Conference on Computer Vision Workshops, pp. 34–45. Springer, Berlin (2016) Wasenmüller, O., Stricker, D.: Comparison of kinect v1 and v2 depth images in terms of accuracy and precision. In: Proceedings of the Asian Conference on Computer Vision Workshops, pp. 34–45. Springer, Berlin (2016)
130.
go back to reference Whelan, T., Kaess, M., Fallon, M.F.: Kintinuous: Spatially extended {K}inect{F}usion. In: Proceedings of the Workshop on RGB-D: Advanced Reasoning with Depth Cameras (2012) Whelan, T., Kaess, M., Fallon, M.F.: Kintinuous: Spatially extended {K}inect{F}usion. In: Proceedings of the Workshop on RGB-D: Advanced Reasoning with Depth Cameras (2012)
131.
go back to reference Whelan, T., Salas-Moreno, R., Glocker, B., Davison, A., Leutenegger, S.: ElasticFusion: Real-time dense SLAM and light source estimation. Int. J. Rob. Res. 35(14), 1697–1716 (2016) Whelan, T., Salas-Moreno, R., Glocker, B., Davison, A., Leutenegger, S.: ElasticFusion: Real-time dense SLAM and light source estimation. Int. J. Rob. Res. 35(14), 1697–1716 (2016)
132.
go back to reference Wuest, H., Vial, F., Stricker, D.: Adaptive line tracking with multiple hypotheses for augmented reality. In: Proceedings of the IEEE/ACM International Symposium on Mixed and Augmented Reality, pp. 62–69. IEEE, New York (2005) Wuest, H., Vial, F., Stricker, D.: Adaptive line tracking with multiple hypotheses for augmented reality. In: Proceedings of the IEEE/ACM International Symposium on Mixed and Augmented Reality, pp. 62–69. IEEE, New York (2005)
133.
go back to reference Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Proceedings of the European Conference on Computer Vision, pp. 818–833. Springer, Berlin (2014) Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Proceedings of the European Conference on Computer Vision, pp. 818–833. Springer, Berlin (2014)
134.
go back to reference Zhang, Z.: Iterative point matching for registration of free-form curves and surfaces. Int. J. Comput. Vis. 13(2), 119–152 (1994) Zhang, Z.: Iterative point matching for registration of free-form curves and surfaces. Int. J. Comput. Vis. 13(2), 119–152 (1994)
135.
go back to reference Zhang, Z., et al. Flexible camera calibration by viewing a plane from unknown orientations. In: Proceedings of the IEEE International Conference on Computer Vision, vol. 99, pp. 666–673. IEEE, New York (1999) Zhang, Z., et al. Flexible camera calibration by viewing a plane from unknown orientations. In: Proceedings of the IEEE International Conference on Computer Vision, vol. 99, pp. 666–673. IEEE, New York (1999)
136.
go back to reference Zhang, Z., Li, M., Huang, K., Tan, T.: Practical camera auto-calibration based on object appearance and motion for traffic scene visual surveillance. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE, New York (2008) Zhang, Z., Li, M., Huang, K., Tan, T.: Practical camera auto-calibration based on object appearance and motion for traffic scene visual surveillance. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE, New York (2008)
137.
go back to reference Zhi, S., Bloesch, M., Leutenegger, S., Davison, A.J.: SceneCode: Monocular dense semantic reconstruction using learned encoded scene representations. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 11776–11785. IEEE, New York (2019) Zhi, S., Bloesch, M., Leutenegger, S., Davison, A.J.: SceneCode: Monocular dense semantic reconstruction using learned encoded scene representations. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 11776–11785. IEEE, New York (2019)
138.
go back to reference Zhou, H., Ummenhofer, B., Brox, T.: Deeptam: Deep tracking and mapping. In: Proceedings of the European Conference on Computer Vision, pp. 822–838. Springer, Berlin (2018) Zhou, H., Ummenhofer, B., Brox, T.: Deeptam: Deep tracking and mapping. In: Proceedings of the European Conference on Computer Vision, pp. 822–838. Springer, Berlin (2018)
Metadata
Title: Principles of Object Tracking and Mapping
Authors: Jason Rambach, Alain Pagani, Didier Stricker
Copyright Year: 2023
DOI: https://doi.org/10.1007/978-3-030-67822-7_3
