
18-02-2021 | Original Paper

Predicting multimodal presentation skills based on instance weighting domain adaptation

Authors: Yutaro Yagi, Shogo Okada, Shota Shiobara, Sota Sugimura

Published in: Journal on Multimodal User Interfaces | Issue 1/2022


Abstract

Assessing presentation skills is one of the central challenges of multimodal modeling. Presentation skills comprise verbal and nonverbal components, but because people demonstrate these skills in many different ways, the observed multimodal features vary widely. Because of these differences, prediction accuracy often degrades when the test samples are drawn from a distribution that differs from that of the training samples. In machine learning theory, this problem, in which the training (source) data are biased, is known as instance selection bias or covariate shift. To address it, this paper presents an instance weighting adaptation method that estimates the presentation skills of each participant from multimodal (verbal and nonverbal) features. For this purpose, we collect a novel multimodal presentation dataset that includes audio signal data, body motion sensor data, and text data of the speech content for participants observed in 58 presentation sessions. The dataset also includes assessments of both verbal and nonverbal presentation skills, made by two external experts from a human resources department. We extract multimodal features, such as spoken utterances, acoustic features, and the amount of body motion, to estimate the presentation skills. We propose two approaches, early fusing and late fusing, for regression models based on multimodal instance weighting adaptation. The experimental results show that the early fusing regression model with instance weighting adaptation achieved a Pearson correlation of \(\rho =0.39\) for the clarity of presentation goal element. In the best case, instance weighting adaptation improved the accuracy (correlation coefficient) from \(-0.34\) to \(+0.35\).
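The general idea of instance weighting under covariate shift can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the weight estimator or the regressor, so the code below assumes a least-squares density-ratio estimator with Gaussian kernels (in the spirit of uLSIF) followed by a weighted ridge regression. The function names, the hyperparameters `sigma`, `lam`, and `reg`, and the synthetic "acoustic" and "motion" features are all hypothetical. Early fusing is represented here simply by concatenating the per-modality feature vectors before weighting.

```python
import numpy as np


def gaussian_kernel(x, centers, sigma):
    """Gaussian kernel matrix between rows of x and rows of centers."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def ulsif_weights(x_train, x_test, sigma=1.0, lam=0.1):
    """Estimate w(x) = p_test(x) / p_train(x) at the training points.

    Assumed estimator: least-squares density-ratio fitting with kernel
    centers placed on the test samples (hypothetical choice for this sketch).
    """
    k_tr = gaussian_kernel(x_train, x_test, sigma)  # shape (n_train, n_test)
    k_te = gaussian_kernel(x_test, x_test, sigma)   # shape (n_test, n_test)
    big_h = k_tr.T @ k_tr / len(x_train)
    small_h = k_te.mean(axis=0)
    alpha = np.linalg.solve(big_h + lam * np.eye(len(x_test)), small_h)
    alpha = np.maximum(alpha, 0.0)  # a density ratio cannot be negative
    return k_tr @ alpha


def weighted_ridge(x, y, w, reg=1e-3):
    """Ridge regression in which training sample i is weighted by w[i]."""
    xb = np.hstack([x, np.ones((len(x), 1))])  # append a bias column
    xtw = xb.T * w                             # scales column i by w[i]
    # Note: the bias term is regularized too, which is fine for a sketch.
    return np.linalg.solve(xtw @ xb + reg * np.eye(xb.shape[1]), xtw @ y)


def predict(x, beta):
    return np.hstack([x, np.ones((len(x), 1))]) @ beta


# Synthetic illustration only; these are not the paper's data.
rng = np.random.default_rng(0)
acoustic_tr = rng.normal(0.0, 1.0, size=(200, 1))
motion_tr = rng.normal(0.0, 1.0, size=(200, 1))
acoustic_te = rng.normal(1.0, 0.5, size=(100, 1))  # shifted test distribution
motion_te = rng.normal(0.0, 1.0, size=(100, 1))

# Early fusing: concatenate modality features into one vector per sample.
x_train = np.hstack([acoustic_tr, motion_tr])
x_test = np.hstack([acoustic_te, motion_te])
y_train = np.sin(x_train[:, 0]) + 0.1 * rng.normal(size=200)

weights = ulsif_weights(x_train, x_test)
beta = weighted_ridge(x_train, y_train, weights)
y_pred = predict(x_test, beta)
```

Training samples that resemble the test participants receive larger weights, so the fitted model concentrates on the region of feature space where it will actually be evaluated. A late fusing variant would, under the same assumptions, fit one weighted model per modality and combine their predictions afterwards.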


Footnotes
1
The spoken content of the presentations includes private information related to the company and the presenter, so the dataset is not publicly available owing to privacy policies.
 
2
The lecturers provide feedback comments, including the good points in the presentation or points to be improved, to the attendees after the program.
 
Metadata
Title
Predicting multimodal presentation skills based on instance weighting domain adaptation
Authors
Yutaro Yagi
Shogo Okada
Shota Shiobara
Sota Sugimura
Publication date
18-02-2021
Publisher
Springer International Publishing
Published in
Journal on Multimodal User Interfaces / Issue 1/2022
Print ISSN: 1783-7677
Electronic ISSN: 1783-8738
DOI
https://doi.org/10.1007/s12193-021-00367-x
