research-article

InteractionFusion: real-time reconstruction of hand poses and deformable objects in hand-object interactions

Published: 12 July 2019

Abstract

Hand-object interaction is challenging to reconstruct but important for many applications, such as HCI and robotics. Previous works focus on either the hand or the object, whereas we jointly track the hand poses, fuse the 3D object model, and reconstruct its rigid and nonrigid motions, all in real time. To achieve this, we first use a DNN to segment the hand and the object in the two input depth streams, and use a pre-trained LSTM network to predict the current hand pose from the previous poses. With this information, we propose a unified optimization framework that jointly tracks the hand poses and object motions. The optimization integrates the segmented depth maps, the predicted motion, a spatially and temporally varying rigidity regularizer, and a real-time contact constraint. A nonrigid fusion technique is further applied to reconstruct the object model. Experiments demonstrate that our method can resolve the ambiguity caused by heavy occlusions between hand and object, and generate accurate results for various objects and interacting motions.
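
As a reading aid, the unified optimization described above can be sketched as a weighted sum of four energy terms, one per ingredient the abstract lists. The abstract names these ingredients but gives no formulas, so the symbols, the weights, and the argument lists below are assumptions for illustration, not the paper's notation:

```latex
% Hypothetical notation (not from the paper):
%   \theta = hand pose parameters, W = object motion (rigid + nonrigid).
% The \lambda weights are assumed; the abstract does not state them.
E(\theta, W) = \lambda_{\mathrm{data}}    E_{\mathrm{data}}(\theta, W)
             + \lambda_{\mathrm{pred}}    E_{\mathrm{pred}}(\theta)
             + \lambda_{\mathrm{rigid}}   E_{\mathrm{rigid}}(W)
             + \lambda_{\mathrm{contact}} E_{\mathrm{contact}}(\theta, W)
```

Under this reading, the data term would fit the hand and object models to the segmented depth maps, the prediction term would keep the hand pose close to the LSTM prediction, the rigidity term would regularize the object's deformation, and the contact term would couple the hand and the object where they touch.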

    • Published in

      ACM Transactions on Graphics, Volume 38, Issue 4
      August 2019
      1480 pages
      ISSN: 0730-0301
      EISSN: 1557-7368
      DOI: 10.1145/3306346

      Copyright © 2019 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States
