2020 | OriginalPaper | Chapter

Time Expressions Identification Without Human-Labeled Corpus for Clinical Text Mining in Russian

Authors: Anastasia A. Funkner, Sergey V. Kovalchuk

Published in: Computational Science – ICCS 2020

Publisher: Springer International Publishing

Abstract

To obtain accurate predictive models in medicine, it is necessary to use complete and relevant information about the patient. We propose an approach for extracting temporal expressions from unlabeled natural language texts. This approach can be used for a first analysis of the corpus, as a first stage of data labeling, or for obtaining linguistic constructions that can serve a rule-based approach to information retrieval. Our method applies several machine learning and natural language processing methods in sequence: classification of sentences, transformation of bag-of-words frequencies, clustering of sentences with time expressions, classification of new data into clusters, and construction of sentence profiles using feature importances. With this method, we derive a list of the most frequent time expressions and extract events and/or time events for 9801 sentences of anamnesis in Russian. The proposed approach is independent of the corpus language and can be used for other tasks, for example, extracting the experiencer of a disease.
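
The pipeline named in the abstract can be illustrated with a small, hedged sketch. This is not the authors' implementation: the abstract does not name the clustering algorithm or the weighting scheme, so the code below assumes TF-IDF weighting, spherical k-means with fixed seed rows, and the largest centroid weights as a simple stand-in for feature importances. The initial sentence classification step is omitted, and the English toy sentences and all function names are hypothetical.

```python
import math
from collections import Counter

def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log(len(docs) / n) + 1.0 for t, n in df.items()}

def vectorize(doc, idf):
    """L2-normalized sparse TF-IDF vector; terms unseen at fit time are ignored."""
    vec = {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def nearest(vec, centroids):
    """Index of the centroid with the highest cosine similarity."""
    return max(range(len(centroids)), key=lambda k: dot(vec, centroids[k]))

def kmeans(vectors, seeds, iterations=10):
    """Spherical k-means; fixed seed rows (an assumption) keep it deterministic."""
    centroids = [dict(vectors[i]) for i in seeds]
    labels = []
    for _ in range(iterations):
        labels = [nearest(v, centroids) for v in vectors]
        for k in range(len(centroids)):
            total = {}
            for v, lab in zip(vectors, labels):
                if lab == k:
                    for t, w in v.items():
                        total[t] = total.get(t, 0.0) + w
            norm = math.sqrt(sum(w * w for w in total.values()))
            if norm:  # keep the old centroid if the cluster is empty
                centroids[k] = {t: w / norm for t, w in total.items()}
    return labels, centroids

def profile(centroid, top=3):
    """Highest-weighted terms of a cluster (stand-in for feature importances)."""
    ranked = sorted(centroid.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:top]]

# Hypothetical sentences that already passed the "contains a time expression" filter.
corpus = [
    "in 2015 had a myocardial infarction",
    "in 2018 underwent stenting",
    "for two years complains of dyspnea",
    "for five years takes statins",
]
docs = [s.lower().split() for s in corpus]
idf = fit_idf(docs)
vectors = [vectorize(d, idf) for d in docs]

labels, centroids = kmeans(vectors, seeds=[0, 2])
profiles = [profile(c) for c in centroids]

# Classify a new sentence into one of the existing clusters.
new_label = nearest(vectorize("in 2019 had surgery".split(), idf), centroids)
```

On this toy input the two clusters separate "in <year>" constructions from "for <N> years" constructions, and the cluster profiles surface the function words that carry the temporal pattern. A real corpus would need lemmatization and a proper classifier for the first step.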

Title: Time Expressions Identification Without Human-Labeled Corpus for Clinical Text Mining in Russian
Authors: Anastasia A. Funkner, Sergey V. Kovalchuk
Publisher: Springer International Publishing
Book: Computational Science – ICCS 2020
Print ISBN: 978-3-030-50422-9

Electronic ISBN: 978-3-030-50423-6

Copyright Year: 2020
DOI: https://doi.org/10.1007/978-3-030-50423-6_44
