nach oben

Erschienen in:

2020 | OriginalPaper | Buchkapitel

Explainable and Transferrable Text Categorization

verfasst von : Tobias Eljasik-Swoboda, Felix Engel, Matthias Hemmje

Erschienen in: Data Management Technologies and Applications

Verlag: Springer International Publishing

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Automated argument stance (pro/contra) detection is a challenging text categorization problem, especially if said arguments are to be detected for new topics. In previous research, we designed and evaluated an explainable machine learning based classifier. It was capable to achieve 96% F1 for argument stance recognition within the same topic and 60% F1 for previously unseen topics, which informed our hypothesis, that there are two sets of features in argument stance recognition: General features and topic specific features. An advantage of the described system is its quick transferability to new problems. Besides providing further details about the developed C3 TFIDF-SVM classifier, we investigate the classifiers effectiveness for different text categorization problems spanning two natural languages. Besides the quick transferability, the generation of human readable explanations about why specific results were achieved is a key feature of the described approach. We further investigate the generated explanation understandability and conduct a survey about how understandable the classifier’s explanations are.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Nächstes Kapitel Dealing with Critical Issues in Emails: A Comparison of Approaches for Sentiment Analysis

US National Library of Medicine National Institutes of Health pubmed.gov. https://www.ncbi.nlm.nih.gov/pubmed/. Accessed 17 Sep 2019

European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (Text with EEA relevance); OJ L, 4 May, 2016, vol. 119, pp. 1–88 (2016)

Clos, J., Wiratunga, N., Massie, S.: Towards explainable text classification by jointly learning lexicon and modifier terms. In: IJCAI-17 Workshop on Explainable AI (XAI) (2017)

Lippi, M., Torroni, P.: Argument mining: a machine learning perspective. In: Black, E., Modgil, S., Oren, N. (eds.) TAFA 2015. LNCS (LNAI), vol. 9524, pp. 163–176. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-28460-6_10CrossRef

Mohammad, S., Kiritchenko, S., Sobhani, P., Zhu, X., Cherry, C: Semeval-2016 task 6: detecting stance in tweets. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 31–41 (2016)

Eljasik-Swoboda, T., Engel, F., Hemmje, M.: Using topic specific features for argument stance recognition. In: Proceedings of the 8th International Conference on Data Science, Technology and Applications (DATA 2019), pp. 13–22 (2019). ISBN:978-989-758-377-3

Mohammad, S.M., Sobhani, P., Kiritchenko, S.: Stance and sentiment in tweets. ACM Trans. Internet Technol. Argument. Soc. Media 17, 1–23 (2016)

Stab, C., Miller, T., Schiller, B., Rai, P., Gurevych, I.: Cross-topic argument mining from heterogeneous sources. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2018) (2018)

Same Side Stance Classification. https://sameside.webis.de/. Accessed 24 Sep 2019

10.

Sebastiani, F.: Machine learning in automated text categorization. ACM Comput. Surv. 34, 1–47 (2002)CrossRef

11.

Bader, S., Hitzler, P.: Dimensions of neural-symbolic integration – a structured survey. arXiv preprint arXiv:cs/0511042 (2005)

12.

Swoboda, T., Kaufmann, M., Hemmje, M.: Toward cloud-based classification and annotation support. In: Proceedings of the 6th International Conference on Cloud Computing and Services Science (CLOSER 2016), vol. 2, pp. 131–237 (2016)

13.

McCulloch, W.S., Pitts, W.: A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5, 115–133 (1943)MathSciNetCrossRef

14.

Helbig, H., Scherer, A.: Kurs 1830: Neuronale Netze. University of Hagen, Germany (2011)

15.

Arel, I., Rose, D.C., Karnowski, T.P.: Deep machine learning – a new frontier in artificial intelligence research. In: IEEE Computational Intelligence Magazine, USA, November issue, pp. 13–18 (2010)

16.

Vapnik, V.N., Chervonenkis, A.Y.: On a class of algorithms of learning pattern recognition. Framework of the Generalised Portrait Method, Oб oднoм клacce aлгopитмoв oбyчeния pacпoзнaвaнию oбpaзoв, Aвтoмaтикa и тeлeмexaникa (1964)

17.

Chang, C., Lin, C.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3), 1–27 (2011)CrossRef

18.

Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representation in vector space. In: Proceedings of Workshop at ICLR (2013)

19.

Pennington, J., Socher, R., Manning, C.: GloBe: global vectors for word representation. In: Empirical Methods in Natural Language Processing, pp. 1532–1543 (2014)

20.

Zanzotto, F.M., Korkontzelos, I., Fallucchi, F., Manandhar, S.: Estimating linear models for compositional distributed semantics. In: Proceedings of the 23rd International Conference on Computational Linguistics, pp. 1263–1271 (2010)

21.

Kusner, M.J., Sun, Y., Kolkin, N., Weinberger, K.Q.: From word embeddings to document distances. In: Proceedings of the 32nd International Conference on Machine Learning (2015)

22.

Dai, X., Bikdash, M., Meyer, M.: From social media to public health surveillance: word embedding based clustering method for twitter classification. In: Proceedings of SoutheastCon, pp. 1–7 (2017). https://doi.org/10.1109/secon.2017.7925400

23.

Eljasik-Swoboda, T., Kaufmann, M., Hemmje, M.: No target function classifier – fast unsupervised text categorization using semantic spaces. In: Proceedings of the 7th International Conference on Data Science, Technology and Applications (DATA 2018), pp. 35–46 (2018)

24.

Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805v2 (2019)

25.

Wolff, E.: Microservices – Flexible Software Architecture. Pearson Education, USA (2017)

26.

Dropwizard: Production-ready, out of the box. https://dropwizard.io. Accessed 12 Sep 2019

27.

Enterprise Container Platform | Docker. https://www.docker.com/. Accessed 30 Sep 2019

28.

Peldszus, A.: An annotated corpus of argumentative microtexts. https://github.com/peldszus/arg-microtexts. Accessed 15 Mar 2019

29.

Wiegand, M., Siegel, M., Ruppenhofer, J.: Overview of the GermEval 2018 shared task on the identification of offensive language. In: Proceedings of the GermEval, Vienna, Austria (2018)

30.

Coucke, A., et al.: Snipts voice platform, an embedded spoken language understanding system for private-by-design voice interfaces. arXiv:1805.10190 (2018)

Titel: Explainable and Transferrable Text Categorization
verfasst von: Tobias Eljasik-Swoboda
Felix Engel
Matthias Hemmje
Verlag: Springer International Publishing
Buch: Data Management Technologies and Applications
Print ISBN: 978-3-030-54594-9

Electronic ISBN: 978-3-030-54595-6

Copyright-Jahr: 2020
DOI: https://doi.org/10.1007/978-3-030-54595-6_1

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"