
2021 | Original Paper | Book Chapter

Top-1 CORSMAL Challenge 2020 Submission: Filling Mass Estimation Using Multi-modal Observations of Human-Robot Handovers

Authors: Vladimir Iashin, Francesca Palermo, Gökhan Solak, Claudio Coppola

Published in: Pattern Recognition. ICPR International Workshops and Challenges

Publisher: Springer International Publishing


Abstract

Human-robot object handover is a key skill for the future of human-robot collaboration. The CORSMAL 2020 Challenge focuses on the perception part of this problem: the robot needs to estimate the filling mass of a container held by a human. Although powerful methods exist for image processing and audio processing individually, solving this problem requires processing data from multiple sensors jointly. The appearance of the container, the sound of the filling, and the depth data all provide essential information. We propose a multi-modal method to predict three key indicators of the filling mass: filling type, filling level, and container capacity. These indicators are then combined to estimate the filling mass of the container. Our method obtained Top-1 overall performance among all submissions to the CORSMAL 2020 Challenge on both the public and private subsets while showing no evidence of overfitting. Our source code is publicly available: github.com/v-iashin/CORSMAL.
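
As a rough illustration of how the three predicted indicators can yield a mass estimate, the sketch below combines them under the commonly used relation filling mass ≈ container capacity × filling level × density of the predicted filling type. This is a minimal, hypothetical example, not the authors' implementation: the function name estimate_filling_mass, the FILLING_DENSITY_G_PER_ML table, and its density values are illustrative assumptions.

# Minimal sketch: combining predicted filling type, filling level, and
# container capacity into a filling mass estimate.
# Class names and density values below are illustrative assumptions.

FILLING_DENSITY_G_PER_ML = {
    "empty": 0.0,
    "water": 1.00,   # approximate density of water
    "rice": 0.85,    # illustrative bulk density
    "pasta": 0.41,   # illustrative bulk density
}

def estimate_filling_mass(filling_type: str,
                          filling_level: float,
                          container_capacity_ml: float) -> float:
    """Return the estimated filling mass in grams.

    filling_type: predicted content class (a key of FILLING_DENSITY_G_PER_ML)
    filling_level: predicted filled fraction of the container, in [0, 1]
    container_capacity_ml: predicted container capacity in millilitres
    """
    density = FILLING_DENSITY_G_PER_ML[filling_type]
    return container_capacity_ml * filling_level * density

if __name__ == "__main__":
    # Example: a 500 ml container predicted to be half full of water.
    print(estimate_filling_mass("water", 0.5, 500.0))  # -> 250.0 grams

Errors in any of the three indicators propagate multiplicatively into the mass estimate, which is why the challenge evaluates each indicator as well as the combined mass.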


Metadata
Title
Top-1 CORSMAL Challenge 2020 Submission: Filling Mass Estimation Using Multi-modal Observations of Human-Robot Handovers
Authors
Vladimir Iashin
Francesca Palermo
Gökhan Solak
Claudio Coppola
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-68793-9_31