1 Introduction
- Survey on knowledge graphs: We conduct a comprehensive survey of existing knowledge graph studies. In particular, this work thoroughly analyzes the advancements in knowledge graphs in terms of state-of-the-art technologies and applications.
- Knowledge graph opportunities: We investigate potential opportunities for knowledge graphs in terms of knowledge graph-based AI systems and application fields that utilize knowledge graphs. Firstly, we examine the benefits of knowledge graphs for AI systems, including recommender systems, question-answering systems, and information retrieval. Then, we discuss the far-reaching impacts of knowledge graphs on human society by describing current and potential knowledge graph applications in various fields (e.g., education, scientific research, social media, and medical care).
- Knowledge graph challenges: We provide deep insights into significant technical challenges facing knowledge graphs. In particular, we elaborate on limitations concerning five representative knowledge graph technologies, including knowledge graph embeddings, knowledge acquisition, knowledge graph completion, knowledge fusion, and knowledge reasoning.
2 Overview
2.1 What Are Knowledge Graphs?
- DBpedia, a knowledge graph that extracts semantically meaningful information from Wikipedia and converts it into a well-structured ontological knowledge base (Auer et al. 2007).
- Freebase, a knowledge graph built upon multiple sources that provides a structured, global resource of information (Bollacker et al. 2008).
- Facebook’s entity graph, a knowledge graph that converts the unstructured content of user profiles into meaningful structured data (Ugander et al. 2011).
- Wikidata, a cross-lingual document-oriented knowledge graph that supports many sites and services such as Wikipedia (Vrandečić and Krötzsch 2014).
- Yago, a high-quality knowledge base that contains a huge number of entities and their corresponding relationships. These entities are extracted from multiple sources such as Wikipedia and WordNet (Rebele et al. 2016).
- WordNet, a lexical knowledge base used to measure the semantic similarity between words. The knowledge base contains a number of hierarchical concept graphs for analysing semantic similarity (Pedersen et al. 2004).
2.2 Current Research on Knowledge Graphs
2.2.1 Knowledge Graph Embedding
2.2.2 Knowledge Acquisition
2.2.3 Knowledge Graph Completion
2.2.4 Knowledge Fusion
2.2.5 Knowledge Reasoning
2.2.6 AI Systems
2.2.7 Application Fields
3 Knowledge Graphs for AI Systems
AI Systems | Approaches | Techniques on knowledge graphs |
---|---|---|
Recommender systems | KPRN (Wang et al. 2019b) | Entity-relation path generation based on user-item interaction |
Recommender systems | RippleNet (Wang et al. 2018b) | Preference propagation |
Recommender systems | MKR (Wang et al. 2019c) | Latent user-item interaction |
Recommender systems | MKGAT (Sun et al. 2020) | Neighbor information extraction; relation reasoning |
Recommender systems | Ripp-MKR (Wang et al. 2021) | Preference propagation; latent user-item interaction |
Recommender systems | RKG (Shu and Huang 2021) | User preference list-based knowledge graph construction |
Question-answering systems | MHPGM (Bauer et al. 2018) | Multi-hop relation reasoning |
Question-answering systems | PCQA (Shin et al. 2019) | Predicate constraints-based relation extraction |
Question-answering systems | KEQA (Huang et al. 2019) | Simple question-based triplet construction |
Question-answering systems | EmbedKGQA (Saxena et al. 2020) | Knowledge graph embedding-based multi-hop question answering |
Information retrieval | EQFE (Dalton et al. 2014) | Query knowledge graph-based feature expansion |
Information retrieval | Knowledge graph based Information Retrieval Technology (Wang et al. 2018a) | Query-document knowledge graph construction |
Information retrieval | CKG (Wise et al. 2020) | Document knowledge graph construction |
Information retrieval | EDRM (Liu et al. 2018) | Integration of knowledge graph semantics into the entity representations of queries and documents |
3.1 Recommender Systems
3.1.1 Traditional Recommender Systems
3.1.1.1 Content-Based Recommender Systems
3.1.1.2 CF-Based Recommender Systems
3.1.2 Knowledge Graph-Based Recommender Systems
- Better Representation of Data: Traditional recommender systems generally suffer from data sparsity because users usually have experience with only a small number of items. The rich representation of entities and their connections in knowledge graphs alleviates this issue.
- Alleviating Cold Start Issues: It becomes challenging for traditional recommender systems to make recommendations when there are new users or items in the data set. In knowledge graph-based recommender systems, information about new items and users can be obtained through the relations between entities within knowledge graphs. For example, when a new Science-Fiction movie such as “Tenet” is added to the data set of a movie recommender system that employs knowledge graphs, information about “Tenet” can be gained from its relationship with the genre Science-Fiction (yielding the triplet (Tenet, has genre of, Sci-Fi)).
- The Explainability of Recommendation: Users and recommended items are connected through the links in knowledge graphs. The reasoning behind a recommendation can therefore be illustrated by following the propagation paths in the knowledge graph.
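
The cold-start and explainability points above can be illustrated with a minimal sketch. It assumes a toy set of triples (only the triplet (Tenet, has genre of, Sci-Fi) comes from the text; the user `alice`, the other movies, and the `watched` relation are hypothetical) and a deliberately simple rule: recommend unseen items that share a genre with items the user has watched. It is not the method of any system listed in this survey.

```python
from collections import defaultdict

# Toy triples (head, relation, tail). Only (Tenet, has genre of, Sci-Fi)
# is taken from the text; every other triple is a made-up illustration.
TRIPLES = [
    ("alice", "watched", "Inception"),
    ("alice", "watched", "Interstellar"),
    ("Inception", "has genre of", "Sci-Fi"),
    ("Interstellar", "has genre of", "Sci-Fi"),
    ("Tenet", "has genre of", "Sci-Fi"),  # new item with no interactions yet
    ("Notting Hill", "has genre of", "Romance"),
]


def build_index(triples):
    """Index triples by head and by (relation, tail) for quick lookups."""
    by_head = defaultdict(list)
    by_rel_tail = defaultdict(list)
    for head, rel, tail in triples:
        by_head[head].append((rel, tail))
        by_rel_tail[(rel, tail)].append(head)
    return by_head, by_rel_tail


def recommend(user, triples):
    """Return {item: [explanation paths]} for unseen items that share a
    genre with at least one item the user has watched."""
    by_head, by_rel_tail = build_index(triples)
    watched = [tail for rel, tail in by_head[user] if rel == "watched"]
    results = defaultdict(list)
    for seen in watched:
        for rel, genre in by_head[seen]:
            if rel != "has genre of":
                continue
            for candidate in by_rel_tail[(rel, genre)]:
                if candidate not in watched:
                    # The path itself serves as the explanation.
                    results[candidate].append(
                        f"{user} -watched-> {seen} -has genre of-> {genre} "
                        f"<-has genre of- {candidate}"
                    )
    return dict(results)


recommendations = recommend("alice", TRIPLES)
for item, paths in recommendations.items():
    print(item)
    for path in paths:
        print("  because:", path)
```

Although “Tenet” has no user interactions in this toy data, it is still reachable through its genre triple, and each returned path doubles as a human-readable justification for the recommendation.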
3.2 Question–Answering Systems
- Increased Efficiency: Instead of searching for answers from massive textual data, which may contain a large volume of useless data items, knowledge graph-based question-answering systems focus only on entities with relevant properties and semantics. Therefore, they reduce the search space significantly and extract the answers effectively and efficiently.
- Multi-hop Question Answering: The answers can be more complex and sophisticated than the ones produced with traditional methods since facts and concepts from knowledge graphs can be combined via multi-hop question answering.
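
As a rough illustration of multi-hop question answering, the sketch below chains facts by following a sequence of relations through a small triple store. It assumes the question has already been mapped to a start entity and a relation path (a real system would need a question parser or an embedding model for that step); the triples and the function names `build_graph` and `answer` are illustrative choices, not taken from any system in this survey.

```python
from collections import defaultdict

# Small illustrative triple store (head, relation, tail).
TRIPLES = [
    ("Tenet", "directed by", "Christopher Nolan"),
    ("Christopher Nolan", "born in", "London"),
    ("London", "capital of", "United Kingdom"),
]


def build_graph(triples):
    """Map (head, relation) to the list of tails reachable in one hop."""
    graph = defaultdict(list)
    for head, rel, tail in triples:
        graph[(head, rel)].append(tail)
    return graph


def answer(graph, start_entity, relation_path):
    """Follow the relations in order, starting from one entity, and return
    every entity reached after the last hop (sorted for stable output)."""
    frontier = {start_entity}
    for relation in relation_path:
        frontier = {
            tail
            for entity in frontier
            for tail in graph.get((entity, relation), [])
        }
        if not frontier:
            break  # no fact supports this hop, so there is no answer
    return sorted(frontier)


graph = build_graph(TRIPLES)
# "Where was the director of Tenet born?" as a two-hop relation path.
print(answer(graph, "Tenet", ["directed by", "born in"]))
```

Only the entities adjacent to the current frontier are inspected at each hop, which is also why the search space stays small compared with scanning free text.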
3.3 Information Retrieval
- Semantic Representation of Items: Items are represented according to a formal and interlinked model that supports semantic similarity, reasoning, and query expansion. This typically allows the system to retrieve more relevant items and makes the system more interpretable.
- High Search Efficiency: Knowledge graph-based information retrieval can use the advanced representation of the items to reduce the search space significantly (e.g., discarding documents that use the same terms with different meanings), resulting in improved efficiency.
- Accurate Retrieval Results: In knowledge graph-based information retrieval, the correlation between queries and documents is analyzed based on the relations between entities in the knowledge graph. This is typically more accurate than relying only on the textual similarity between queries and documents.
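
The three points above can be sketched with a toy entity-based retriever. It assumes that queries and documents have already been linked to knowledge graph entities (the entity identifiers, documents, and triples below are hypothetical), expands the query with one-hop neighbours, and ranks documents by entity overlap. This is a simplified stand-in, not the scoring function of EQFE, EDRM, or any other approach listed earlier.

```python
# Hypothetical entity annotations; a real system would obtain them from
# an entity linker run over the document text.
DOC_ENTITIES = {
    "doc1": {"Jaguar_(animal)", "Rainforest"},
    "doc2": {"Jaguar_(car)", "Engine"},
    "doc3": {"Big_cat", "Savanna"},
}

# Toy knowledge graph as (head, relation, tail) triples.
TRIPLES = [
    ("Jaguar_(animal)", "is a", "Big_cat"),
    ("Jaguar_(car)", "is a", "Car_brand"),
]


def expand(query_entities, triples):
    """Query expansion: add the one-hop neighbours of the query entities."""
    expanded = set(query_entities)
    for head, _, tail in triples:
        if head in query_entities:
            expanded.add(tail)
        if tail in query_entities:
            expanded.add(head)
    return expanded


def retrieve(query_entities, doc_entities, triples):
    """Rank documents by how many entities they share with the expanded
    query; documents with no shared entity are discarded."""
    expanded = expand(query_entities, triples)
    scored = [(len(ents & expanded), doc) for doc, ents in doc_entities.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc for score, doc in ranked if score > 0]


# The query term "jaguar" is assumed to have been linked to the animal sense.
print(retrieve({"Jaguar_(animal)"}, DOC_ENTITIES, TRIPLES))
```

Because matching happens on entities rather than surface terms, the document about the car brand is dropped even though it contains the same word, while a document about big cats is retrieved through the expanded query.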
4 Applications and Potentials
Fields | Applications | Methods | Functions |
---|---|---|---|
Education | Knowledge Graph based Course Management Model (Aliyu et al. 2020) | Course knowledge graphs | Course management; generation of course allocation schedule |
Education | KnowEdu (Chen et al. 2018) | Instructional concept extraction; educational relation identification | Educational knowledge graph construction |
Education | Knowledge Graph-based Tool for Online Learning (Zablith 2022) | Integration of social media contents and formal learning contents | Efficient online knowledge acquisition |
Scientific Research | Scientific Publication Management Model (Chi et al. 2018) | Knowledge graph based academic network | Scientific publication management |
Scientific Research | Reviewer Recommendation System (Yong et al. 2021) | Knowledge graph-based rule engine establishment | Precise matching of reviewer and paper |
Social Networks | DEAP-FAKED (Mayank et al. 2021) | News-entity knowledge graphs | Fake news detection |
Social Networks | GraphRec (Fan et al. 2019) | Information aggregation of user-user and user-item graphs | Social recommendation |
Social Networks | Graph Reasoning Model (Wang et al. 2018d) | Knowledge graph propagation | Social relationship extraction |
Health/Medical Care | SMR (Gong et al. 2021) | Medical knowledge graph embeddings | Safe medicine recommendation |
Health/Medical Care | DETERRENT (Cui et al. 2020) | Knowledge guided graph attention network | Health misinformation detection |
Health/Medical Care | KGNN (Lin et al. 2020) | Mining the relationships between drugs | Drug discovery |
Health/Medical Care | COVID-KG (Yuan et al. 2021) | Multimedia knowledge graph construction | Drug discovery |
4.1 Education
4.2 Scientific Research
4.3 Social Networks
4.4 Health/Medical Care
5 Technical Challenges
5.1 Knowledge Graph Embeddings
Categories | Techniques | Evaluation task [metric] on data set | Results (%) |
---|---|---|---|
Tensor factorization-based methods | RESCAL (Nickel et al. 2011) | Link prediction [Hits@10] on FB15K | 44.1 |
Tensor factorization-based methods | HolE (Nickel et al. 2016) | Link prediction [Hits@10] on FB15K | 73.9 |
Tensor factorization-based methods | ComplEx (Trouillon et al. 2016) | Link prediction [Hits@10] on FB15K | 84 |
Tensor factorization-based methods | SimplE (Kazemi and Poole 2018) | Link prediction [Hits@10] on FB15K | 83.8 |
Tensor factorization-based methods | RotatE (Sun et al. 2019a) | Link prediction [Hits@10] on FB15K | 88.4 |
Tensor factorization-based methods | QuatE (Zhang et al. 2019c) | Link prediction [Hits@10] on FB15K | 90 |
Translation-based methods | TransE (Bordes et al. 2013) | Link prediction [Hits@10] on FB15K | 47.1 |
Translation-based methods | TransH (Wang et al. 2014) | Link prediction [Hits@10] on FB15K | 64.4 |
Translation-based methods | TransR (Lin et al. 2015) | Link prediction [Hits@10] on FB15K | 68.7 |
Translation-based methods | TransD (Ji et al. 2015) | Link prediction [Hits@10] on FB15K | 77.3 |
Translation-based methods | TranSparse (Ji et al. 2016) | Link prediction [Hits@10] on FB15K | 79.9 |
Translation-based methods | STransE (Nguyen et al. 2016) | Link prediction [Hits@10] on FB15K | 79.7 |
Translation-based methods | TransA (Jia et al. 2016) | Link prediction [Hits@10] on FB15K | 80.4 |
Translation-based methods | KG2E (He et al. 2015) | Link prediction [Hits@10] on FB15K | 71.5 |
Translation-based methods | TransG (Xiao et al. 2015) | Link prediction [Hits@10] on FB15K | 88.2 |
Neural network-based methods | SME (Bordes et al. 2014) | Link prediction [Hits@10] on FB15K | 41.3 |
Neural network-based methods | NTN (Socher et al. 2013) | Triplet classification [Accuracy] on WN11 | 86.2 |
Neural network-based methods | SLM (Socher et al. 2013) | Triplet classification [Accuracy] on WN11 | 76 |
Neural network-based methods | RMNN (Liu et al. 2016) | Triplet classification [Accuracy] on WN11 | 89.9 |
Neural network-based methods | R-GCN (Schlichtkrull et al. 2018) | Link prediction [Hits@10] on FB15K | 84.2 |
Neural network-based methods | ConvKB (Nguyen et al. 2017) | Link prediction [Hits@10] on WN18RR | 52.5 |
Neural network-based methods | KBGAN (Cai and Wang 2017) | Link prediction [Hits@10] on WN18 | 89.2 |
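
To make the Hits@10 column more concrete, the following sketch computes a Hits@k percentage for tail prediction using a TransE-style score (negative L2 distance between head + relation and tail). It rests on several assumptions: the two-dimensional embeddings and test triples are hand-made rather than learned, only tails are ranked, and no known true triples are filtered out of the candidate list. Published benchmark figures usually follow a stricter protocol, so this shows the metric itself and does not reproduce any result in the table.

```python
import math


def transe_score(head_vec, rel_vec, tail_vec):
    """TransE-style plausibility: negative L2 distance between
    head + relation and tail (higher means more plausible)."""
    return -math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head_vec, rel_vec, tail_vec))
    )


def rank_of_true_tail(entity_emb, rel_emb, head, relation, true_tail):
    """Rank (1 = best) of the true tail among all candidate entities."""
    h, r = entity_emb[head], rel_emb[relation]
    true_score = transe_score(h, r, entity_emb[true_tail])
    better = sum(
        1
        for entity, vec in entity_emb.items()
        if entity != true_tail and transe_score(h, r, vec) > true_score
    )
    return better + 1


def hits_at_k(test_triples, entity_emb, rel_emb, k=10):
    """Percentage of test triples whose true tail is ranked within the top k."""
    hits = sum(
        rank_of_true_tail(entity_emb, rel_emb, h, r, t) <= k
        for h, r, t in test_triples
    )
    return 100.0 * hits / len(test_triples)


# Hand-made toy embeddings (assumptions for illustration, not learned values).
ENTITY_EMB = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0], "d": [5.0, 5.0]}
REL_EMB = {"r": [1.0, 0.0]}
TEST = [("a", "r", "b"), ("a", "r", "d")]

print(hits_at_k(TEST, ENTITY_EMB, REL_EMB, k=1))  # 50.0
print(hits_at_k(TEST, ENTITY_EMB, REL_EMB, k=4))  # 100.0
```

In the toy data, the triple (a, r, b) fits the translation exactly and ranks first, while (a, r, d) ranks last among the four candidates, so it only counts as a hit once k reaches 4.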