
2020 | OriginalPaper | Book chapter

Explainable and Transferrable Text Categorization

Authors: Tobias Eljasik-Swoboda, Felix Engel, Matthias Hemmje

Published in: Data Management Technologies and Applications

Publisher: Springer International Publishing

Abstract

Automated argument stance (pro/contra) detection is a challenging text categorization problem, especially if the arguments are to be detected for new topics. In previous research, we designed and evaluated an explainable machine-learning-based classifier. It achieved 96% F1 for argument stance recognition within the same topic and 60% F1 for previously unseen topics, which informed our hypothesis that there are two sets of features in argument stance recognition: general features and topic-specific features. An advantage of the described system is its quick transferability to new problems. Besides providing further details about the developed C3 TFIDF-SVM classifier, we investigate the classifier's effectiveness for different text categorization problems spanning two natural languages. In addition to quick transferability, a key feature of the described approach is the generation of human-readable explanations of why specific results were achieved. We further evaluate these explanations through a survey of how understandable they are.
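
To make the idea of an explainable TF-IDF plus linear SVM classifier more concrete, the following is a minimal sketch and not the chapter's C3 implementation. It assumes a tiny hypothetical training corpus, a smoothed IDF formula, and a primal linear SVM (hinge loss, L2 regularization, no bias term) trained by subgradient descent. All function names and hyperparameters are illustrative. The explanation is simply each term's contribution (weight times TF-IDF value) to the decision score.

```python
import math
from collections import Counter

# Hypothetical toy corpus (not from the chapter): +1 = pro, -1 = contra.
TRAIN = [
    ("nuclear power is safe and clean", 1),
    ("renewable energy is clean and beneficial", 1),
    ("nuclear waste is dangerous and harmful", -1),
    ("coal plants are harmful and dirty", -1),
]

def tokenize(text):
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(text, idf):
    """L2-normalized sparse TF-IDF vector; terms unseen in training are ignored."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def train_linear_svm(vectors, labels, epochs=200, lr=0.1, lam=0.01):
    """Subgradient descent on the L2-regularized hinge loss (no bias term)."""
    w = {}
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            margin = y * sum(w.get(t, 0.0) * v for t, v in x.items())
            for t in w:  # regularization step: shrink all weights
                w[t] *= 1.0 - lr * lam
            if margin < 1.0:  # hinge loss is active: move toward the example
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + lr * y * v
    return w

def explain(text, idf, w):
    """Return the predicted stance and per-term contributions, largest first."""
    x = tfidf(text, idf)
    contributions = {t: w.get(t, 0.0) * v for t, v in x.items()}
    label = "pro" if sum(contributions.values()) >= 0 else "contra"
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return label, ranked

idf = fit_idf([text for text, _ in TRAIN])
weights = train_linear_svm([tfidf(t, idf) for t, _ in TRAIN], [y for _, y in TRAIN])

label, ranked = explain("nuclear power is clean", idf, weights)
print("predicted stance:", label)
for term, contribution in ranked:
    print(f"{term:>10} {contribution:+.3f}")
```

Because the model is linear, the ranked list can be read directly as an explanation: positive contributions pull the decision toward "pro", negative ones toward "contra". In this sketch, terms that occur in both classes or across topics play the role of general features, while terms tied to one topic would not carry over to an unseen topic.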

Metadata
Title
Explainable and Transferrable Text Categorization
Authors
Tobias Eljasik-Swoboda
Felix Engel
Matthias Hemmje
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-54595-6_1