nach oben

Erschienen in:

2021 | OriginalPaper | Buchkapitel

Large-Scale Multi-granular Concept Extraction Based on Machine Reading Comprehension

verfasst von : Siyu Yuan, Deqing Yang, Jiaqing Liang, Jilun Sun, Jingyue Huang, Kaiyan Cao, Yanghua Xiao, Rui Xie

Erschienen in: The Semantic Web – ISWC 2021

Verlag: Springer International Publishing

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

The concepts in knowledge graphs (KGs) enable machines to understand natural language, and thus play an indispensable role in many applications. However, existing KGs have the poor coverage of concepts, especially fine-grained concepts. In order to supply existing KGs with more fine-grained and new concepts, we propose a novel concept extraction framework, namely MRC-CE, to extract large-scale multi-granular concepts from the descriptive texts of entities. Specifically, MRC-CE is built with a machine reading comprehension model based on BERT, which can extract more fine-grained concepts with a pointer network. Furthermore, a random forest and rule-based pruning are also adopted to enhance MRC-CE’s precision and recall simultaneously. Our experiments evaluated upon multilingual KGs, i.e., English Probase and Chinese CN-DBpedia, justify MRC-CE’s superiority over the state-of-the-art extraction models in KG completion. Particularly, after running MRC-CE for each entity in CN-DBpedia, more than 7,053,900 new concepts (instanceOf relations) are supplied into the KG. The code and datasets have been released at https://github.com/fcihraeipnusnacwh/MRC-CE.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Vorheriges Kapitel Improving Inductive Link Prediction Using Hyper-relational Facts

Nächstes Kapitel Graphhopper: Multi-hop Scene Graph Reasoning for Visual Question Answering

https://www.meituan.com.

http://kw.fudan.edu.cn/cndbpedia.

https://www.microsoft.com/en-us/research/project/probase/.

https://www.wikipedia.org/.

We translate Chinese patterns for CN-DBpedia into English.

Prince Station’s abstract text and CE results were translated from Chinese.

Akbik, A., Bergmann, T., Blythe, D., Rasul, K., Schweter, S., Vollgraf, R.: FLAIR: an easy-to-use framework for state-of-the-art NLP. In: Proceedings of NAACL, pp. 54–59 (2019)

Alomari, S., Abdullah, S.: Improving an AI-based algorithm to automatically generate concept maps. Comput. Inf. Sci. 12(4), 72 (2019)

Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., Ives, Z.: DBpedia: a nucleus for a web of open data. In: Aberer, K., et al. (eds.) ASWC/ISWC -2007. LNCS, vol. 4825, pp. 722–735. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-76298-0_52CrossRef

Bai, H., Xing, F.Z., Cambria, E., Huang, W.B.: Business taxonomy construction using concept-level hierarchical clustering. arXiv preprint arXiv:1906.09694 (2019)

Budin, G.: Ontology-driven translation management. In: Knowledge Systems and Translation (2005)

Conneau, A., et al.: Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116 (2019)

Cui, W., Xiao, Y., Wang, W.: KBQA: an online template based question answering system over freebase. In: Proceedings of IJCAI (2016)

Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)

Ji, J., Chen, B., Jiang, H.: Fully-connected LSTM–CRF on medical concept extraction. Int. J. Mach. Learn. Cybern. 11(9), 1971–1979 (2020). https://doi.org/10.1007/s13042-020-01087-6CrossRef

10.

Kingma, J., Ba, J.: Adam: a method for stochastic optimization. In: Proceedings of ICLR (2015)

11.

Lample, G., Conneau, A.: Crosslingual language model pretraining. In: Proceedings of NeurIPS (2019)

12.

Li, N., Tian, M., Lv, S.: Extracting hierarchical relations between the back-of-the-book index terms. In: Hong, J.-F., Zhang, Y., Liu, P. (eds.) CLSW 2019. LNCS (LNAI), vol. 11831, pp. 433–443. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-38189-9_45CrossRef

13.

Li, X., Feng, J., Meng, Y., Han, Q., Wu, F., Li, J.: A unified MRC framework for named entity recognition. In: Proceedings of ACL (2020)

14.

Li, X., et al.: Entity-relation extraction as multi-turn question answering. arXiv preprint arXiv:1905.05529 (2019)

15.

Liang, J., Xiao, Y., Wang, H., Zhang, Y., Wang, W.: Probase+: inferring missing links in conceptual taxonomies. IEEE Trans. Knowl. Data Eng. 29(6), 1281–1295 (2017)CrossRef

16.

Liang, J., Xiao, Y., Wang, H., Zhang, Y., Wang, W.: Probase+: inferring missing links in conceptual taxonomies. IEEE TKDE 29(6), 1281–1295 (2017)

17.

Liang, J., Zhang, Y., Xiao, Y., Wang, H., Wang, W., Zhu, P.: On the transitivity of hypernym-hyponym relations in data-driven lexical taxonomies. In: Proceedings of AAAI, vol. 31 (2017)

18.

Liao, J., Sun, F., Gu, J.: Combining concept graph with improved neural networks for Chinese short text classification. In: Wang, X., Lisi, F.A., Xiao, G., Botoeva, E. (eds.) JIST 2019. CCIS, vol. 1157, pp. 205–212. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-3412-6_20CrossRef

19.

Liu, S., Zhang, X., Zhang, S., Wang, H., Zhang, W.: Neural machine reading comprehension: methods and trends. Appl. Sci. 9(18), 3698 (2019)CrossRef

20.

Liu, Y., et al.: RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)

21.

Nguyen, A.D., Nguyen, K.H., Ngo, V.V.: Neural sequence labeling for Vietnamese POS tagging and NER. In: Proceedings of IEEE-RIVF, pp. 1–5. IEEE (2019)

22.

Nie, Y., Tian, Y., Song, Y., Ao, X., Wan, X.: Improving named entity recognition with attentive ensemble of syntactic information. arXiv preprint arXiv:2010.15466 (2020)

23.

Petrucci, G., Rospocher, M., Ghidini, C.: Expressive ontology learning as neural machine translation. JWS 52, 66–82 (2018)CrossRef

24.

Ponzetto, S.P., Strube, M.: WikiTaxonomy: a large scale knowledge resource. In: Proceedings of ECAI, vol. 178, pp. 751–752. Citeseer (2008)

25.

Poria, S., Hussain, A., Cambria, E.: EmoSenticSpace: dense concept-based affective features with common-sense knowledge. In: Multimodal Sentiment Analysis. SC, vol. 8, pp. 85–116. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-95020-4_5CrossRef

26.

Preum, S.M., Shu, S., Alemzadeh, H., Stankovic, J.A.: EMSContExt: EMS protocol-driven concept extraction for cognitive assistance in emergency response. In: Proceedings of AAAI, pp. 13350–13355 (2020)

27.

Qiu, J., Chai, Y., Tian, Z., Du, X., Guizani, M.: Automatic concept extraction based on semantic graphs from big data in smart city. IEEE Trans. Comput. Soc. Syst. 7(1), 225–233 (2019)CrossRef

28.

Roller, S., Kiela, D., Nickel, M.: Hearst patterns revisited: automatic hypernym detection from large text corpora. In: Proceedings of ACL (2018)

29.

Ruan, D.R., He, X.Y., Li, D.Y., Gao, K.: Modeling and extracting hyponymy relationships on Chinese electric power field content. In: 2016 8th International Conference on Modelling, Identification and Control (ICMIC), pp. 439–443. IEEE (2016)

30.

Sammut, C., Webb, G.I. (eds.): Encyclopedia of Machine Learning and Data Mining. Springer, Boston (2017). https://doi.org/10.1007/978-1-4899-7687-1CrossRefMATH

31.

Seo, M., Kembhavi, A., Farhadi, A., Hajishirzi, H.: Bidirectional attention flow for machine comprehension. arXiv preprint arXiv:1611.01603 (2016)

32.

Sharma, R., Gopalani, D., Meena, Y.: Concept-based approach for research paper recommendation. In: Shankar, B.U., Ghosh, K., Mandal, D.P., Ray, S.S., Zhang, D., Pal, S.K. (eds.) PReMI 2017. LNCS, vol. 10597, pp. 687–692. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-69900-4_87CrossRef

33.

Shen, Y., Huang, P.S., Gao, J., Chen, W.: ReasoNet: learning to stop reading in machine comprehension. In: Proceedings of ACM SIGKDD, pp. 1047–1055 (2017)

34.

Shvets, A., Wanner, L.: Concept extraction using pointer-generator networks. arXiv preprint arXiv:2008.11295 (2020)

35.

Song, Y., Tian, S., Yu, L.: A method for identifying local drug names in Xinjiang based on BERT-BiLSTM-CRF. Autom. Control Comput. Sci. 54(3), 179–190 (2020). https://doi.org/10.3103/S0146411620030098CrossRef

36.

Suchanek, F.M., Kasneci, G., Weikum, G.: YAGO: a core of semantic knowledge. In: Proceedings of WWW (2007)

37.

Vinyals, O., Fortunato, M., Jaitly, N.: Pointer networks. In: Proceeedings of NIPS, pp. 2692–2700 (2015)

38.

Wei, Z., Su, J., Wang, Y., Tian, Y., Chang, Y.: A novel hierarchical binary tagging framework for joint extraction of entities and relations. arXiv preprint arXiv:1909.03227 (2019)

39.

Wu, W., Li, H., Wang, H., Zhu, K.Q.: Probase: a probabilistic taxonomy for text understanding. In: Proceedings of ACM SIGMOD, pp. 481–492 (2012)

40.

Xu, B., et al.: METIC: multi-instance entity typing from corpus. In: Proceedings of CIKM, pp. 903–912 (2018)

41.

Xu, B., et al.: CN-DBpedia: a never-ending Chinese knowledge extraction system. In: Benferhat, S., Tabia, K., Ali, M. (eds.) IEA/AIE 2017. LNCS (LNAI), vol. 10351, pp. 428–438. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-60045-1_44CrossRef

42.

Xu, B., Zhang, Y., Liang, J., Xiao, Y., Hwang, S., Wang, W.: Cross-lingual type inference. In: Navathe, S.B., Wu, W., Shekhar, S., Du, X., Wang, X.S., Xiong, H. (eds.) DASFAA 2016. LNCS, vol. 9642, pp. 447–462. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-32025-0_28CrossRef

43.

Yang, Y., Shen, X., Wang, Y.: BERT-BiLSTM-CRF for Chinese sensitive vocabulary recognition. In: Li, K., Li, W., Wang, H., Liu, Y. (eds.) ISICA 2019. CCIS, vol. 1205, pp. 257–268. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-5577-0_19CrossRef

44.

Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R.R., Le, Q.V.: XLNet: generalized autoregressive pretraining for language understanding. In: Proceedings of NIPS, pp. 5753–5763 (2019)

45.

Yao, J., Cui, B., Cong, G., Huang, Y.: Evolutionary taxonomy construction from dynamic tag space. WWW 15(5–6), 581–602 (2012). https://doi.org/10.1007/s11280-011-0150-4CrossRef

46.

Yilahun, H., Abdurahman, K., Imam, S., Hamdulla, A.: Automatic extraction of Uyghur domain concepts based on multi-feature for ontology extension. IET Netw. 9(4), 200–205 (2020)CrossRef

47.

Zhao, G., Zhang, X.: Domain-specific ontology concept extraction and hierarchy extension. In: Proceedings of NLPIR, pp. 60–64 (2018)

Titel: Large-Scale Multi-granular Concept Extraction Based on Machine Reading Comprehension
verfasst von: Siyu Yuan
Deqing Yang
Jiaqing Liang
Jilun Sun
Jingyue Huang
Kaiyan Cao
Yanghua Xiao
Rui Xie
Verlag: Springer International Publishing
Buch: The Semantic Web – ISWC 2021
Print ISBN: 978-3-030-88360-7

Electronic ISBN: 978-3-030-88361-4

Copyright-Jahr: 2021
DOI: https://doi.org/10.1007/978-3-030-88361-4_6

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Premium Partner