2020 | OriginalPaper | Chapter

Time Expressions Identification Without Human-Labeled Corpus for Clinical Text Mining in Russian

Authors: Anastasia A. Funkner, Sergey V. Kovalchuk

Published in: Computational Science – ICCS 2020

Publisher: Springer International Publishing

Abstract

To obtain accurate predictive models in medicine, it is necessary to use complete and relevant information about the patient. We propose an approach for extracting temporal expressions from unlabeled natural language texts. This approach can be used for an initial analysis of a corpus, as the first stage of data labeling, or for obtaining linguistic constructions that can feed a rule-based information retrieval approach. Our method applies several machine learning and natural language processing methods in sequence: classification of sentences, transformation of bag-of-words frequencies, clustering of sentences with time expressions, classification of new data into clusters, and construction of sentence profiles using feature importances. With this method, we derive a list of the most frequent time expressions and extract events and/or time events for 9801 anamnesis sentences in Russian. The proposed approach is independent of the corpus language and can be used for other tasks, for example, extracting the experiencer of a disease.
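
The sequence of steps named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the English toy sentences, the helper names, a digit-based rule standing in for the sentence classifier, a plain k-means with fixed starting points standing in for the clustering step, and a simple in-cluster versus out-of-cluster term frequency score standing in for feature importances.

```python
import math
import re
from collections import Counter

def tokenize(sentence):
    # Lowercase, split on whitespace, collapse every number into one NUM token.
    return ["NUM" if tok.isdigit() else tok for tok in sentence.lower().split()]

def has_time_cue(sentence):
    # Hypothetical stand-in for the sentence classifier: any digit is a time cue.
    return re.search(r"\d", sentence) is not None

def fit_tfidf(token_lists):
    # Document frequency per term, then idf = ln(N / df).
    n = len(token_lists)
    df = Counter(term for toks in token_lists for term in set(toks))
    vocab = sorted(df)
    return vocab, {t: math.log(n / df[t]) for t in vocab}

def vectorize(tokens, vocab, idf):
    # Bag-of-words counts reweighted by idf; unseen terms are ignored.
    counts = Counter(tokens)
    return [counts[t] * idf[t] for t in vocab]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centroids):
    # Index of the closest centroid; also used to classify new sentences.
    return min(range(len(centroids)), key=lambda k: sq_dist(vec, centroids[k]))

def kmeans(vectors, init_indices, iterations=10):
    # Plain k-means with fixed starting points so the result is reproducible.
    centroids = [list(vectors[i]) for i in init_indices]
    labels = []
    for _ in range(iterations):
        labels = [nearest(v, centroids) for v in vectors]
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def cluster_profile(token_lists, labels, cluster, top_n=2):
    # Stand-in for feature importances: share of cluster sentences containing
    # a term minus the share of other sentences containing it.
    inside = [set(t) for t, lab in zip(token_lists, labels) if lab == cluster]
    outside = [set(t) for t, lab in zip(token_lists, labels) if lab != cluster]
    def score(term):
        p_in = sum(term in s for s in inside) / len(inside)
        p_out = sum(term in s for s in outside) / len(outside) if outside else 0.0
        return p_in - p_out
    return sorted(set().union(*inside), key=lambda t: (-score(t), t))[:top_n]

# Hypothetical toy corpus; the study itself works with Russian anamnesis text.
corpus = [
    "pain started 2 years ago",
    "patient denies allergies",
    "hypertension diagnosed 5 years ago",
    "surgery performed in 2015",
    "stroke occurred in 2010",
]
timed = [s for s in corpus if has_time_cue(s)]
token_lists = [tokenize(s) for s in timed]
vocab, idf = fit_tfidf(token_lists)
vectors = [vectorize(t, vocab, idf) for t in token_lists]
labels, centroids = kmeans(vectors, init_indices=[0, len(vectors) - 1])
profiles = [cluster_profile(token_lists, labels, k) for k in range(len(centroids))]
new_label = nearest(
    vectorize(tokenize("fracture happened 3 years ago"), vocab, idf), centroids
)
```

On this toy data the "N years ago" sentences and the "in YEAR" sentences fall into separate clusters, and each cluster's profile surfaces the words that carry the time construction. A real corpus would call for a trained sentence classifier and a proper feature-importance estimate in place of these stand-ins.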

Metadata
Title: Time Expressions Identification Without Human-Labeled Corpus for Clinical Text Mining in Russian
Authors: Anastasia A. Funkner, Sergey V. Kovalchuk
Copyright Year: 2020
DOI: https://doi.org/10.1007/978-3-030-50423-6_44