2018 | OriginalPaper | Chapter

Protein Complex Mention Recognition with Web-Based Knowledge Learning

Authors: Ruoyao Ding, Xiaoyi Pan, Yingying Qu, Cathy H. Wu, K. Vijay-Shanker

Published in: Emerging Technologies for Education

Publisher: Springer International Publishing

Abstract

Protein complexes play an essential role in cellular functions and are important named entities in the biomedical field. Since experimental results on protein complexes are usually published in scientific articles, recognizing protein complex mentions in the literature is a crucial step toward discovering protein complex-related information from existing research. In this paper, we propose a method for protein complex mention recognition that applies knowledge automatically learned from PubMed. Evaluation shows that our method achieves an F1-score of 81%, demonstrating its effectiveness in the protein complex recognition task.
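For readers unfamiliar with the reported metric, the following is a minimal sketch of how a mention-level F1-score can be computed. The abstract does not say how predicted mentions were matched against the gold annotations, so exact span matching is an assumption here; the function name `prf1` and the toy spans are hypothetical and are not data from the paper.

```python
from typing import Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets of one mention


def prf1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact span matching (an assumption)."""
    true_pos = len(gold & predicted)  # spans that match a gold span exactly
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy, made-up spans for illustration only (not results from the paper).
gold = {(0, 12), (30, 45), (60, 72), (80, 95)}
predicted = {(0, 12), (30, 45), (60, 70), (100, 110)}

p, r, f = prf1(gold, predicted)
print(f"P={p:.2f} R={r:.2f} F1={f:.2f}")  # prints "P=0.50 R=0.50 F1=0.50"
```

Because F1 is a harmonic mean, a system cannot reach a high score by favoring only precision or only recall, which is why it is a common single-number summary for mention recognition tasks.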

Metadata
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-03580-8_20
