Abstract
Question answering systems offer a friendly interface for people to interact with massive amounts of online information. Retrieving useful medical information from the many websites available through search engines is time-consuming for users. This work builds a Chinese Question Answering System in Medical Domain (CQASMD) to provide users with useful medical information. First, a large medical knowledge base with more than 300 thousand medical terms and their descriptions is constructed to store structured medical knowledge, and it is classified with the FastText model. A Word2Vec model is then adopted to capture the semantic meanings of words, and questions and answers are converted into sentence embeddings to capture semantic context. A user's question is first classified and converted into a sentence vector, and a matching algorithm then finds the most similar stored question. After the constructed medical knowledge base is queried, the answers associated with that stored question are returned to the user. The architecture and flowchart of CQASMD are presented; the system is expected to play an important role in self-diagnosis and treatment of disease.
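The matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how sentence vectors are computed or which similarity measure is used, so this example assumes averaged word vectors and cosine similarity. The toy vectors stand in for a trained Word2Vec model, the category labels stand in for the FastText classifier output, and every name here (WORD_VECTORS, KNOWLEDGE_BASE, answer) is hypothetical. Questions are assumed to be already segmented into tokens.

```python
import math

# Hypothetical toy word vectors standing in for a trained Word2Vec model.
WORD_VECTORS = {
    "headache": [0.9, 0.1, 0.0],
    "migraine": [0.8, 0.2, 0.1],
    "treat": [0.1, 0.9, 0.0],
    "cure": [0.2, 0.8, 0.1],
    "cough": [0.0, 0.1, 0.9],
    "symptoms": [0.1, 0.2, 0.7],
}

# Hypothetical knowledge base rows: (category, tokenized question, answer).
# The category is assumed to come from a separate text classifier.
KNOWLEDGE_BASE = [
    ("treatment", ["treat", "headache"], "answer about headache treatment"),
    ("symptom", ["cough", "symptoms"], "answer about cough symptoms"),
]


def sentence_vector(tokens):
    """Average the vectors of known tokens (an assumed embedding scheme)."""
    vectors = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def answer(tokens, category=None):
    """Return the answer of the most similar stored question, or None."""
    query = sentence_vector(tokens)
    if query is None:
        return None
    best_score, best_answer = -1.0, None
    for cat, stored_tokens, stored_answer in KNOWLEDGE_BASE:
        # Restrict the search to the predicted category when one is given.
        if category is not None and cat != category:
            continue
        candidate = sentence_vector(stored_tokens)
        if candidate is None:
            continue
        score = cosine(query, candidate)
        if score > best_score:
            best_score, best_answer = score, stored_answer
    return best_answer


if __name__ == "__main__":
    print(answer(["cure", "migraine"]))
```

Filtering by category before matching reduces the number of stored questions that have to be compared, which is one plausible reason to classify a question first. Whether CQASMD uses the classifier in this way is not stated in the abstract.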
Additional information
Foundation item: the National Natural Science Foundation of China (No. 61303094), the Program of Science and Technology Commission of Shanghai Municipality (Nos. 16511102400 and 16111107801), and the Innovation Program of Shanghai Municipal Education Commission (No. 14YZ024)
About this article
Cite this article
Feng, G., Du, Z. & Wu, X. A Chinese Question Answering System in Medical Domain. J. Shanghai Jiaotong Univ. (Sci.) 23, 678–683 (2018). https://doi.org/10.1007/s12204-018-1982-1
DOI: https://doi.org/10.1007/s12204-018-1982-1