Published in: Discover Computing 1-2/2016

1 April 2016 | Medical Information Retrieval

Biomedical term extraction: overview and a new methodology

Authors: Juan Antonio Lossio-Ventura, Clement Jonquet, Mathieu Roche, Maguelonne Teisseire

Abstract

Terminology extraction is an essential task in domain knowledge acquisition and in information retrieval. It is also a mandatory first step in building or enriching terminologies and ontologies. Existing terminology extraction methods, as often proposed in the literature, combine linguistic and statistical aspects and only partially solve the problems related to term extraction, e.g. noise, silence, low frequency, large corpora, and the complexity of the multi-word term extraction process. In contrast, we propose a new methodology to extract and rank biomedical terms that addresses all of these problems. The methodology offers several measures based on linguistic, statistical, graph-based and web aspects. These measures extract and rank candidate terms with excellent precision: we demonstrate that they outperform previously reported precision results for automatic term extraction and work with different languages (English, French, and Spanish). We also demonstrate how using graphs and the web to assess the significance of a term candidate enables us to improve on those precision results. We evaluated our methodology on the biomedical GENIA and LabTestsOnline corpora and compared it with previously reported measures.
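To make the notion of ranking precision concrete, the following is a minimal sketch of precision at rank k, a common way to score a ranked list of candidate terms against a reference terminology. It is an illustration only and not the evaluation code of the paper: the function name `precision_at_k`, the case-insensitive matching, and the toy candidate and reference lists are all assumptions introduced here.

```python
from typing import Iterable, Sequence


def precision_at_k(ranked_candidates: Sequence[str],
                   reference_terms: Iterable[str],
                   k: int) -> float:
    """Fraction of the top-k ranked candidates found in the reference set.

    Matching is case-insensitive exact string matching (an assumption of
    this sketch; real evaluations may normalise terms differently).
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    reference = {term.lower() for term in reference_terms}
    top_k = ranked_candidates[:k]
    hits = sum(1 for candidate in top_k if candidate.lower() in reference)
    # Divide by k (not len(top_k)) so that short candidate lists are penalised.
    return hits / k


# Hypothetical toy data, not taken from GENIA or LabTestsOnline.
ranked = ["t cell", "the study", "nf-kappa b", "gene expression"]
reference = ["T cell", "NF-kappa B", "gene expression"]

p_at_2 = precision_at_k(ranked, reference, 2)
p_at_4 = precision_at_k(ranked, reference, 4)
```

With this toy data, one of the top two candidates and three of the top four appear in the reference list, so a ranking measure that pushed "the study" further down would score higher at small k.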

Metadata
Title
Biomedical term extraction: overview and a new methodology
Authors
Juan Antonio Lossio-Ventura
Clement Jonquet
Mathieu Roche
Maguelonne Teisseire
Publication date
1 April 2016
Publisher
Springer Netherlands
Published in
Discover Computing / Issue 1-2/2016
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI
https://doi.org/10.1007/s10791-015-9262-2
