2022 | OriginalPaper | Book chapter

Named Entity Recognition on CORD-19 Bio-Medical Dataset with Tolerance Rough Sets

Written by: Seeratpal Jaura, Sheela Ramanna

Published in: Transactions on Rough Sets XXIII

Publisher: Springer Berlin Heidelberg

Abstract

Biomedical named entity recognition (BioNER) is becoming increasingly important to biomedical research, owing both to the proliferation of articles and to the current pandemic. This paper addresses the task of automatically finding and recognizing biomedical entity types related to COVID-19 (e.g., virus, cell, therapeutic) with tolerance rough sets. The task includes (i) extracting nouns and their co-occurring contextual patterns from a large BioNER dataset related to COVID-19 and (ii) annotating unlabelled data with a semi-supervised learning algorithm that uses co-occurrence statistics. Using natural language processing methods, 465,250 noun phrases and 6,222,196 contextual patterns were extracted from 29,500 articles. Three categories have been successfully classified so far: virus, cell and therapeutic. Early precision@N results demonstrate that the proposed tolerant pattern learner (TPL) is able to constrain concept drift in all three categories during the iterative learning process.
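
The abstract does not spell out the learner itself, so the following is only a minimal sketch of the general idea, assuming the standard tolerance rough set model from the document-clustering literature: two contextual patterns are treated as tolerant when they co-occur with at least `theta` common nouns. It is not the authors' TPL algorithm. The toy data, the threshold `theta`, and all function names are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Toy, made-up data (an assumption for illustration): each noun phrase
# maps to the set of contextual patterns it was observed with.
noun_contexts = {
    "coronavirus": {"infected with _", "_ outbreak", "strain of _"},
    "influenza": {"infected with _", "_ outbreak", "strain of _"},
    "remdesivir": {"treated with _", "dose of _"},
    "t cell": {"activation of _", "infected with _"},
}


def tolerance_classes(noun_contexts, theta):
    """Map each pattern to itself plus every pattern sharing >= theta nouns with it."""
    cooc = defaultdict(int)
    patterns = set()
    for contexts in noun_contexts.values():
        patterns |= contexts
        # Count, for every pair of patterns, how many nouns they share.
        for a, b in combinations(sorted(contexts), 2):
            cooc[(a, b)] += 1
            cooc[(b, a)] += 1
    return {
        p: {p} | {q for q in patterns if cooc.get((p, q), 0) >= theta}
        for p in patterns
    }


def upper_approximation(classes, target):
    """Patterns whose tolerance class overlaps the target set."""
    return {p for p, cls in classes.items() if cls & target}


def lower_approximation(classes, target):
    """Patterns whose tolerance class lies entirely inside the target set."""
    return {p for p, cls in classes.items() if cls <= target}


def precision_at_n(ranked, relevant, n):
    """Fraction of the top-n ranked items that are in the relevant set."""
    if n <= 0:
        return 0.0
    return sum(1 for item in ranked[:n] if item in relevant) / n


classes = tolerance_classes(noun_contexts, theta=2)
seed_patterns = {"infected with _", "treated with _"}
upper = upper_approximation(classes, seed_patterns)
lower = lower_approximation(classes, seed_patterns)
score = precision_at_n(
    ["coronavirus", "influenza", "t cell"], {"coronavirus", "influenza"}, 3
)
```

In this toy run the upper approximation grows the seed set with patterns that are tolerant to a seed pattern, while the lower approximation keeps only patterns whose whole tolerance class is already trusted. Restricting expansion in this way is one plausible mechanism for limiting concept drift in an iterative learner, although the abstract does not state how TPL does it.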

Metadata
Title: Named Entity Recognition on CORD-19 Bio-Medical Dataset with Tolerance Rough Sets
Written by: Seeratpal Jaura, Sheela Ramanna
Copyright year: 2022
Publisher: Springer Berlin Heidelberg
DOI: https://doi.org/10.1007/978-3-662-66544-2_3