ABSTRACT
In this work we investigate the coordination of human-machine interactions from a bird's-eye view using a single panoramic color camera. Our approach replaces conventional physical hardware sensors, such as light barriers and switches, with location-aware virtual regions. We employ recent methods from the field of pose estimation to detect human and robot joint configurations. By fusing 2D human and robot pose information with prior scene knowledge, we lift these perceptions into a 3D metric space. In this way, our system can trigger environmental reactions when geometric events occur among humans, robots, and virtual regions. We demonstrate the diverse application possibilities and robustness of our system in three use cases.
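The core mechanism described above can be illustrated with a minimal sketch: a detected 2D keypoint (e.g. an ankle from a pose estimator) is lifted to metric floor coordinates via a ground-plane homography, and a point-in-polygon test against a virtual region stands in for a physical light barrier. The homography values, keypoint, and region geometry below are illustrative assumptions, not the paper's actual calibration or API.

```python
import numpy as np

# Hypothetical ground-plane homography H (image pixels -> metric floor
# coordinates), e.g. estimated from four known floor markers.
H = np.array([[0.01,  0.0,    -3.2],
              [0.0,   0.012,  -2.1],
              [0.0,   0.0001,  1.0]])

def lift_to_floor(pt_px, H):
    """Map a 2D image point (e.g. a detected ankle keypoint) to floor coords."""
    x = H @ np.array([pt_px[0], pt_px[1], 1.0])
    return x[:2] / x[2]  # dehomogenize

def inside_region(p, polygon):
    """Ray-casting point-in-polygon test for a virtual region on the floor."""
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > p[1]) != (y2 > p[1]):  # edge crosses the horizontal ray
            x_cross = x1 + (p[1] - y1) * (x2 - x1) / (y2 - y1)
            if p[0] < x_cross:
                inside = not inside
    return inside

# A virtual "light barrier": a 1 m x 2 m rectangle on the floor (metres).
region = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 2.0)]

ankle_px = (420.0, 260.0)  # hypothetical 2D keypoint in the panoramic image
floor_pt = lift_to_floor(ankle_px, H)
if inside_region(floor_pt, region):
    print("virtual trigger fired")
```

In a full system the same geometric test would run per frame for every tracked human and robot joint, so that entering or leaving a region generates the discrete events that hardware sensors would otherwise provide.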