Abstract

Data heterogeneity is a major obstacle to resource sharing on the Semantic Web (SW), and ontologies are regarded as a solution to this problem. However, since different ontologies are constructed and maintained independently, the heterogeneity problem also exists between ontologies. Ontology matching identifies the semantic correspondences between entities in different ontologies, which is an effective method to address the ontology heterogeneity problem. Due to huge memory consumption and long runtime, the performance of existing ontology matching techniques requires further improvement. In this work, an extended compact genetic algorithm-based ontology entity matching technique (ECGA-OEM) is proposed, which uses both a compact encoding mechanism and a linkage learning approach to match ontologies efficiently. The compact encoding mechanism does not need to store and maintain the whole population in memory during the evolutionary process, and linkage learning protects the chromosome’s building blocks, which reduces the algorithm’s runtime and ensures the alignment’s quality. In the experiment, ECGA-OEM is compared with the participants of the Ontology Alignment Evaluation Initiative (OAEI) and state-of-the-art ontology matching techniques, and the experimental results show that ECGA-OEM is both effective and efficient.

1. Introduction

The Semantic Web (SW) was proposed by Tim Berners-Lee in 1998; it enables intelligent applications to understand a word’s meaning at the semantic level. Ontologies are a solution to the issue of data heterogeneity on the SW, since an ontology can establish consensus on the meaning of a concept in a given field and provide abundant domain knowledge and semantic vocabulary for the interaction between application systems. However, due to the SW’s decentralized nature, different ontologies may define the same concept differently, which leads to the issue of ontology heterogeneity [1]. Ontology matching is regarded as an effective method to address this issue, and swarm intelligence algorithm- (SIA-) based ontology matching techniques have achieved good performance in past studies [2], such as the genetic algorithm (GA) [3], particle swarm optimization (PSO) [4], the firefly algorithm (FA) [2], and the artificial bee colony algorithm (ABC) [5]. However, there are two drawbacks in the existing SIA-based approaches: (1) massive time and memory consumption is required, which heavily limits the efficiency of the ontology matching process; (2) an expert in the related field or a reference alignment is required in the process of ontology matching, which is usually not available in real applications. To overcome these drawbacks, an extended compact genetic algorithm-based ontology entity matching technique (ECGA-OEM) is proposed in this work, which uses both a compact encoding mechanism [6, 7] and a linkage learning approach to efficiently match the ontologies. In particular, our contributions are as follows:
(i) A new evaluation metric for the ontology alignment is proposed, which is able to work without the reference alignment and domain experts.
(ii) An optimization model for the ontology entity matching problem is constructed.
(iii) ECGA-OEM is proposed, which uses linkage learning and a compact encoding mechanism to efficiently address the ontology entity matching problem.

The rest of this paper is organized as follows: related works are reviewed in Section 2; ontology, ontology matching, and the similarity measures are presented in Section 3; the ontology entity matching approach through ECGA proposed in this paper is described in Section 4; Section 5 presents the experimental results; and finally, Section 6 draws the conclusion and presents the future work.

2. Related Works

Ontology matching is a complex, time-consuming, and error-prone task, especially when the scale of the ontologies is large. Recently, a number of machine learning (ML) techniques [8–14] have been proposed to automatically determine the ontology alignment. To improve matching efficiency, Araújo et al. [15] presented a matching system based on the parallel computing (PC) technique, and Amin et al. [16] matched ontologies based on cloud computing (CC). At the same time, SIA-based techniques have achieved great performance in the ontology matching domain [1, 2, 17–26].

Generally, ontology matching techniques are classified into two categories: ontology metamatching techniques and ontology entity matching techniques [27]. The former is dedicated to the problem of how to aggregate different similarity measures with appropriate weights, while the latter tries to directly determine the entity correspondence set between two ontologies. The first SIA-based ontology metamatching system is genetics for ontology alignment (GOAL), which aims at optimizing the aggregating weight set for different matchers [3, 28–30]. Memetic algorithms (MAs), which introduce a local search (LS) strategy into evolutionary algorithms (EAs) to improve their local optimization capability, have been proposed to solve the ontology metamatching problem [31]. To overcome the drawback of over-reliance on the reference alignment, Xue et al. presented a partial reference alignment (PRA), in which only a part of the standard reference is used to assess the quality of an alignment [32]. Furthermore, Xue and Wang proposed an innovative metric named unanimous improvement ratio (UIR) to assess the alignment’s quality, in which the reference alignment is not required [33]. Besides, the artificial bee colony (ABC) algorithm has also been adopted to address the ontology metamatching problem, which further improves the solution’s quality [5].

During the matching process, ontology metamatching techniques need to maintain several similarity matrices, which leads to huge memory consumption. For this reason, ontology entity matching techniques, which aim at directly determining the optimal set of matching pairs, have attracted experts’ interest. Genetic algorithm-based ontology matching (GAOM) first regarded a certain set of matching pairs as the optimization objective [34]. MAs have also been utilized to solve the ontology entity matching problem, outperforming GA [35]. Bock et al. [4] used PSO to solve ontology entity matching; in detail, it evaluates the fitness of chromosomes through a certain aggregation strategy over multiple objective functions. Alves et al. [36] argued that the instances contained in an ontology can be used to improve the alignment when knowledge is embedded in them. For this reason, Xue et al. [37] also take instance-level matching into consideration to further improve the quality of the alignment.

3. Preliminaries

3.1. Ontology and Ontology Matching

Definition 1. (ontology). An ontology is a 5-tuple O = (C, P, I, A, An) [33], where C is a nonempty set of classes; P is a nonempty set of properties; I is a (possibly empty) set of individuals that represent the instances of classes in the real world; A is a nonempty set of axioms that are used to check the consistency of ontologies or deduce new information; and An is a set of annotations that provide metadata so that researchers can understand the ontology. In particular, C, P, and I make up the entities of an ontology.

Definition 2. (ontology matching). Ontology matching can be considered as a function A' = f(O1, O2, A0, p, r), where O1 and O2 are the two ontologies to be matched; A0 is an existing initial alignment of O1 and O2; p is a set of parameters, e.g., thresholds, used in the process of ontology matching; and r is a set of external resources, e.g., background knowledge bases and dictionaries, which assist in ontology matching. The process of ontology matching is depicted in Figure 1, where A' is the obtained ontology alignment.
An example of matching two ontologies is presented in Figure 2, where O1 and O2 are the two ontologies to be matched. The strings in the rounded rectangles are the classes, e.g., “Reference,” “Entry,” and “Book.” The black lines between two classes of the same ontology represent their “has a” or “is a” relation in turn; e.g., “Reference” has a “Book” and “Book” is a “Reference,” which means “Reference” is the superclass of “Book” and “Book” is the subclass of “Reference.” There are datatype properties that describe the features of a class; e.g., “Data,” “Title,” and “Human Creator” are properties of “Reference.” The instances of a class are shown in a rectangle; e.g., “Of Natures Obvious Laws & Processes in Vegetation” is an instance of “Article.” The relations of the entities between the two ontologies are linked by lines with double arrowheads, annotated with the symbols “=,” “⊑” (or “⊒”), and “⊥,” which, respectively, mean the equivalence, more specific (or less specific), and disjointness relations; e.g., classes “Entry” and “Book” of O2 are, respectively, the equivalent and the hyponym of class “Reference” of O1, while property “Event” of O1 and property “Year” of O2 are irrelevant.

3.2. Similarity Measures

Before finding reliable entity correspondences, it is necessary to measure to what extent two ontology entities are similar. In the ontology domain, a similarity measure is usually used to calculate two entities’ similarity value. Generally, there are three categories of similarity measures, i.e., syntactic, linguistic, and taxonomy-based measures [33].

3.2.1. Syntactic Measures

Two syntactic measures, i.e., SMOA (string metric for ontology alignment) [37] and Levenshtein [38], are employed in this paper. Given two strings s1 and s2, the SMOA similarity is defined as

sim_SMOA(s1, s2) = comm(s1, s2) − diff(s1, s2) + winkler(s1, s2),

where comm(s1, s2) stands for the common length of s1 and s2, diff(s1, s2) for their different lengths, and winkler(s1, s2) is the improvement to the result yielded by the method proposed by Winkler [39]. The Levenshtein similarity is defined as

sim_Lev(s1, s2) = 1 − lev(s1, s2) / max(|s1|, |s2|),

where |s1| and |s2| are the numbers of letters contained in s1 and s2, respectively, and lev(s1, s2) is the number of letters that need to be modified to transform s1 into s2. The final syntactic similarity is equal to the average of the SMOA and Levenshtein similarities.
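To make the Levenshtein side of the syntactic measure concrete, here is a minimal sketch in Python. The function names are our own; the SMOA part is omitted, since its commonality and difference terms depend on substring bookkeeping not detailed here.

```python
def levenshtein_distance(s1: str, s2: str) -> int:
    """Classic single-row dynamic-programming edit distance."""
    m, n = len(s1), len(s2)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                            # deletion
                        dp[j - 1] + 1,                        # insertion
                        prev + (s1[i - 1] != s2[j - 1]))      # substitution
            prev = cur
    return dp[n]

def levenshtein_similarity(s1: str, s2: str) -> float:
    """Normalize the edit distance into a [0, 1] similarity."""
    if not s1 and not s2:
        return 1.0
    return 1.0 - levenshtein_distance(s1, s2) / max(len(s1), len(s2))
```

For instance, "book" and "back" differ in two of four letters, so their Levenshtein similarity is 0.5.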

3.2.2. Linguistic Measures

The linguistic similarity between two strings is computed by considering their semantic relations (such as synonymy and hypernymy), which usually requires a thesaurus or dictionary. In this work, WordNet [23, 40], an electronic lexical database that collects the various meanings of words, is used. Given two words w1 and w2, their linguistic similarity equals:
(i) 1, if w1 and w2 are synonyms in WordNet.
(ii) 0.5, if w1 is the hypernym of w2 or vice versa in WordNet.
(iii) 0, otherwise.
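The three-valued rule above can be illustrated with a toy sketch. The miniature synonym and hypernym tables below stand in for WordNet and are purely illustrative; a real implementation would query WordNet instead.

```python
# Toy stand-ins for WordNet lookups (illustrative only).
SYNONYMS = {("reference", "entry"), ("entry", "reference")}
HYPERNYMS = {("book", "reference")}  # (word, its hypernym)

def linguistic_similarity(w1: str, w2: str) -> float:
    """Three-valued linguistic measure: 1 for synonyms,
    0.5 for hypernym/hyponym pairs, 0 otherwise."""
    w1, w2 = w1.lower(), w2.lower()
    if w1 == w2 or (w1, w2) in SYNONYMS:
        return 1.0
    if (w1, w2) in HYPERNYMS or (w2, w1) in HYPERNYMS:
        return 0.5
    return 0.0
```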

3.2.3. Taxonomy-Based Measures

The core idea of taxonomy-based measures is to make full use of the hierarchical relationships in an ontology to determine two entities’ similarity by considering their neighbors’ similarity. In this work, mutual reasoning between class and property (MRCP) is proposed as the taxonomy-based measure, which is shown in Figure 3.

In Figure 3, a circle is a class of the ontology, a triangle is a property of the ontology, and a one-way arrow represents the hierarchical relationship, i.e., the source class is the superclass of the target class; the dividing-line arrow between a class and a property indicates that the property belongs to that class; a bidirectional arrow indicates a high similarity between two entities; and a dashed two-way arrow indicates that the similarity between them is improved after the operation. Subgraph (a) depicts the classes’ similarity gained from their neighbors, i.e., superclass and subclass: when there is a high similarity between two classes, the similarity between their subclasses is supposed to increase; likewise, the similarity of two classes is increased when their subclasses are highly similar. Subgraph (b) shows the properties’ similarity gained from their superproperty and subproperty: the similarity of two properties is improved when their superproperties and subproperties, respectively, are highly similar. Subgraph (c) shows the properties’ similarity gained from the classes to which they belong: when two classes of the two ontologies are highly similar, the similarity between each property of the first class and each property of the second class is improved, for all combinations. Conversely, as depicted in subgraph (d), classes’ similarity is increased due to the same or highly similar properties they share: since the property pairs of the two classes are highly similar, the similarity of the two classes is increased too.

3.2.4. Aggregation Strategy

Three similarity matrices are generated when the three measures have been applied, and they need to be aggregated into one matrix. The final similarity value sim(e1, e2) between two entities e1 and e2 is defined as follows: let sim_s, sim_l, and sim_t be, respectively, the syntactic, linguistic, and taxonomy-based similarity of e1 and e2, and let sim_a be the average of sim_s and sim_l; θ is a given parameter used to filter out the matching pairs with low similarity.
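Since the exact aggregation formula is not spelled out above, the sketch below is one plausible reading under our own assumptions: the lexical average sim_a and the taxonomy-based value sim_t are combined by taking the stronger of the two, and the threshold θ zeroes out weak pairs.

```python
def aggregate(sim_s: float, sim_l: float, sim_t: float,
              theta: float = 0.5) -> float:
    """One plausible aggregation (our assumption, not the paper's
    verbatim formula): take the max of the lexical average and the
    taxonomy-based similarity, then filter by the threshold theta."""
    sim_a = (sim_s + sim_l) / 2.0       # average of syntactic and linguistic
    sim = max(sim_a, sim_t)
    return sim if sim >= theta else 0.0
```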

4. Extended Compact Genetic Algorithm-Based Ontology Entity Matching

GA is an excellent methodology for solving the ontology matching problem due to its inherently parallel search characteristic and good search ability. In our work, an ECGA is proposed to efficiently address the ontology entity matching problem.

4.1. Optimal Model

The optimization model of the ontology entity matching problem is given as follows:

max f(X)
s.t. X = (x1, x2, …, x_|O1|)^T, x_i ∈ {−1, 1, 2, …, |O2|}, i = 1, 2, …, |O1|,

where |O1| and |O2| are, respectively, the cardinalities of the two ontologies O1 and O2, and X encodes the matching pairs. In particular, x_i = −1 means that the i-th entity of O1 has no matching entity in O2. The objective of this work is to maximize f(X); for its details, refer to Section 4.3.3.

4.2. The Framework of ECGA-OEM

ECGA-OEM takes two ontologies to be matched as input and outputs an alignment. The framework of ECGA-OEM is shown in Figure 4, whose critical components are described in the rest of this section. The two ontologies, generally in XML or RDF format, are extracted into two hierarchy schemas in the preprocessing model. The operation of the ECGA optimization model relies on the similarity matrix obtained in the similarity measure model, which has been stated in detail in Section 3.2. Finally, the alignment is generated by the solution generation model. The ECGA optimization model is described in detail in Section 4.3.

4.3. ECGA Optimization Model

The algorithm takes the virtual population’s maximum generation MaxGen = 2000 (normally, the number of iterations needed for ECGA to converge is much smaller than MaxGen), the threshold of 0.7, and the hierarchy schemas of ontology1 and ontology2 as input, and outputs the final alignment. The pseudocode of ECGA is presented in Algorithm 1, where PV and BB are the probability vector (see Section 4.3.1) and the building blocks (see Section 4.3.7), respectively.

Input: the hierarchy schemas of O1 and O2; the aggregated similarity matrix;
Output: the best chromosome;
(1) initialize the probability vector PV;
(2) while PV has not converged and MaxGen is not reached do
(3)  generate chromosomes from PV and fix the building blocks in BB;
(4)  evaluate each chromosome with the fitness function;
(5)  apply the selection operator and the elite strategy;
(6)  update PV according to the selected chromosomes;
(7)  extract the building blocks BB from PV via linkage learning;
(8) end while
4.3.1. Probability Vector Initialization

Different from binary coding, the probability vector (PV) in this work is two-dimensional. The initialized PV and a converged PV are shown in Tables 1 and 2, respectively. The value in a given row and column represents the probability of a match between the corresponding entity in O1 and the corresponding entity in O2; e.g., the cell at the row of entity “Reference” of O1 and the column of entity “Entry” of O2 gives the probability that they match (in particular, the header “−1 (null)” denotes the probability of no matching). The convergence condition is that, on each locus of the PV, the probability of taking one unique code is 1; e.g., in Table 2, the probability that entity “Reference” of O1 matches entity “Entry” of O2 is 1, and so on for the rest.
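A minimal sketch of such a two-dimensional PV follows; the shapes and names are our own assumptions. Row i holds a distribution over the candidate codes for the i-th entity of O1, and the extra last column plays the role of the “−1 (null)” header.

```python
def init_pv(n1: int, n2: int) -> list:
    """Uniform initial PV: n1 rows (entities of O1), n2 + 1 columns
    (entities of O2 plus one trailing '-1 (null)' column)."""
    return [[1.0 / (n2 + 1)] * (n2 + 1) for _ in range(n1)]
```

For example, `init_pv(4, 3)` yields four rows of four equal probabilities, each row summing to 1, mirroring the uniform initialization of Table 1.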

4.3.2. Chromosomes Generation

A certain number of chromosomes are produced in each generation through the PV. An example of a chromosome is given in Figure 5. In particular, subgraph (a) shows the loci of the chromosome and the corresponding codes, and subgraph (b) illustrates decoding the chromosome in subgraph (a); e.g., the fourth entity “Article” of O1 corresponds to the third entity “Article” of O2 because the code of the fourth locus is “3” (“−1 (Null)” indicates no matching).
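Chromosome generation then amounts to sampling one code per locus from the corresponding PV row. The sketch below assumes the row layout from the PV example above (last column encodes −1/null); the function name is our own.

```python
import random

def sample_chromosome(pv: list, rng=random) -> list:
    """Sample one chromosome: for each locus i, draw a code from row i
    of the PV.  Column j < len(row) - 1 means 'entity i of O1 matches
    entity j of O2'; the last column means -1 (no match)."""
    chrom = []
    for row in pv:
        codes = list(range(len(row) - 1)) + [-1]
        chrom.append(rng.choices(codes, weights=row, k=1)[0])
    return chrom
```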

4.3.3. Fitness Function

The fitness function is used to determine which chromosomes in the population better adapt to the environment. In the context of ontology matching, the objective is to find, at convergence, the best chromosome, i.e., the one whose corresponding alignment has the highest quality. The objective function of the optimization model is used as the fitness function of this work; given a chromosome c, it is defined as follows:

f(c) = λ · r(A) + (1 − λ) · p(A),

where A is the alignment determined by c; λ, a fraction in the range [0, 1], is the relative weight of r(A) and p(A), which is set to 0.25 in this paper; and r(A) and p(A) are defined as follows:

r(A) = |A| / min(|O1|, |O2|),   p(A) = (1 / |A|) · Σ_{i=1}^{|A|} sim_i,

where |A| is the cardinality of A; |O1| and |O2| are, respectively, the cardinalities of O1 and O2; and sim_i is the similarity of the i-th matching pair in the alignment A. In particular, r(A) is the ratio of the number of matching pairs found to the smaller entity number of the two ontologies, and p(A) is the average similarity of the matching pairs found, which, respectively, approximate the recall and the precision.
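A hedged sketch of this fitness computation in Python follows; the chromosome encoding (code −1 for no match) and the function names are our assumptions, consistent with the examples elsewhere in this section.

```python
def fitness(chrom: list, sim_matrix: list, lam: float = 0.25) -> float:
    """Weighted sum of an approximate recall (match ratio) and an
    approximate precision (mean similarity of the matched pairs)."""
    matched = [(i, j) for i, j in enumerate(chrom) if j != -1]
    if not matched:
        return 0.0
    n1, n2 = len(sim_matrix), len(sim_matrix[0])
    match_ratio = len(matched) / min(n1, n2)                  # r(A), ~recall
    mean_sim = sum(sim_matrix[i][j] for i, j in matched) / len(matched)  # p(A), ~precision
    return lam * match_ratio + (1 - lam) * mean_sim
```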

4.3.4. Selection Operator

The selection operator selects the best chromosomes in the current population to participate in the next step [41] and to update the PV. Firstly, the chromosomes are sorted in descending order according to their fitness scores. Secondly, the first half of the chromosomes is retained as a temporary population. Finally, using fitness as the weight, roulette selection picks chromosomes from the temporary population for the subsequent operations.
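The two-stage scheme above (truncation to the top half, then fitness-proportional roulette) can be sketched as follows; names are our own.

```python
import random

def select(population: list, fitnesses: list, rng=random):
    """Keep the top half of the population by fitness, then draw one
    chromosome from it with fitness-proportional (roulette) weights."""
    ranked = sorted(zip(population, fitnesses),
                    key=lambda p: p[1], reverse=True)
    top = ranked[: max(1, len(ranked) // 2)]      # truncation step
    chroms, fits = zip(*top)
    return rng.choices(chroms, weights=fits, k=1)[0]  # roulette step
```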

4.3.5. Elite Strategy

The goal of the elite strategy is to keep the historical optimal solution and prevent the fitness of the optimal chromosome from “degenerating” in the process of evolution. In this work, the elite strategy has two steps: the historical optimal solution first competes with the current optimal solution, and the winner becomes the new historical optimal solution; the historical optimal solution then participates in the PV update in each generation.

4.3.6. Probability Vector Updating

An example of updating the PV is presented in Figure 6, where subgraph (a) is a chromosome generated by the initialized PV. The similarity of each locus is derived from the similarity matrix according to the chromosome’s code. It should be noted that the code of the third locus is “0,” which means that the entity “Part” with sequence number “3” in O1 does not match any entity in O2, so its similarity is equal to 1 minus the highest similarity between “Part” and all entities in O2. The PV is updated by the normalized similarity of each locus, which is shown in subgraph (b). The probabilities updated in the PV are shown in bold; e.g., the probability of the matching pair “Reference” and “Entry” changed from “1” to “1.2” since the normalized similarity of the corresponding locus of the chromosome is “0.20.”
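The update step can be sketched as below. The additive-then-renormalize rule is our own reading of the figure’s example ("1" becoming "1.2" after adding a normalized similarity of 0.20), and the row layout (last column = −1/null) follows the PV sketch above.

```python
def update_pv(pv: list, chrom: list, locus_sims: list,
              rate: float = 1.0) -> list:
    """Add each locus's normalized similarity to the probability mass
    of the sampled code, then renormalize the row (our assumption)."""
    total = sum(locus_sims) or 1.0
    for i, code in enumerate(chrom):
        col = code if code != -1 else len(pv[i]) - 1  # last column = -1/null
        pv[i][col] += rate * locus_sims[i] / total
        s = sum(pv[i])
        pv[i] = [v / s for v in pv[i]]                # renormalize the row
    return pv
```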

4.3.7. Linkage Learning

Building blocks are saved through linkage learning, thereby improving the efficiency of the algorithm and the quality of the solution [42]. In a simple GA, linkage learning identifies good loci and protects them so that they are not destroyed by the subsequent crossover and mutation operations; in ECGA, linkage learning keeps good probability distributions so that they are not disturbed in the subsequent update process. A linkage learning approach is proposed in this work, whose details are shown in Figure 7.

For clarity, only column 1 of the original probability vector is selected for narration. In each generation, a low-probability clearing operation is performed: the value of each entry is divided by the maximum value of its row, and the entry is cleared if this ratio is less than a specific value (0.2 in this work); e.g., the bits 0, 2, 3, and −1 in the PV are cleared and marked. A row of the PV is a “good” probability distribution when all but one bit of the row are zero. Linkage learning generates building blocks from the “good” probability distributions, e.g., the building block with index “0” and code “0” enclosed in the rounded rectangle. After that, all the building blocks are directly put into each chromosome (the bold numbers), which reduces both the runtime and the memory consumption.
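The clearing-and-extraction step can be sketched as below, under the same row layout as the PV sketches above (last column = −1/null); the function name and return shape are our own assumptions.

```python
def extract_building_blocks(pv: list, cutoff: float = 0.2) -> dict:
    """Low-probability clearing: in each row, zero out entries whose
    ratio to the row maximum falls below the cutoff.  Rows left with a
    single non-zero entry become fixed building blocks (locus -> code)."""
    blocks = {}
    for i, row in enumerate(pv):
        m = max(row)
        row[:] = [v if m and v / m >= cutoff else 0.0 for v in row]
        nonzero = [j for j, v in enumerate(row) if v > 0.0]
        if len(nonzero) == 1:
            j = nonzero[0]
            blocks[i] = j if j < len(row) - 1 else -1  # last column = -1/null
    return blocks
```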

5. Experiments and Discussion

5.1. Experiment Setup

In the experiment, the Biblio benchmark provided by the Ontology Alignment Evaluation Initiative (OAEI) is used to verify the effectiveness of our approach. Each testing case normally includes two ontologies to be matched and a reference alignment, which serves as a standard to evaluate the quality of the matching results. The testing cases can be classified into five categories, which are briefly described in Table 3.

In this work, our method is compared with the participants of OAEI and with GA- and CGA-based ontology matching techniques. The experimental results are the harmonic mean (H-mean) values over 30 independent runs.

5.2. Alignment Evaluation Metrics according to Reference Alignment

A criterion is needed when evaluating the quality of matching systems. Given an alignment A produced by a matching system and the reference alignment R provided with the testing case, two measures, recall and precision, are employed in this work, defined as follows [22, 23]:

recall = |R ∩ A| / |R|,   precision = |R ∩ A| / |A|,

where |R| and |A| are, respectively, the cardinalities of the matching pairs in the reference alignment and in the alignment produced by the matching system, and |R ∩ A| is the number of matching pairs that exist in both. When recall is 1, all the matching pairs in the reference alignment have been found; when precision is 1, all the matching pairs found are correct.

Both recall and precision are important evaluation measures, and they should be considered at the same time. A weighted harmonic mean of recall and precision, the F-measure, is used in this work, which is presented in the following equation [43]:

F = (precision · recall) / ((1 − α) · precision + α · recall),

where α is the relative weight of recall and precision; it is set to 0.5 in this work, which yields the f1-measure.
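The three metrics above can be computed over sets of matching pairs as follows; the pair representation (tuples of entity indices) is our own choice for illustration.

```python
def evaluate(reference: set, found: set, alpha: float = 0.5):
    """Recall, precision, and weighted-harmonic F-measure over sets of
    matching pairs.  With alpha = 0.5 this is the usual f1-measure."""
    correct = len(reference & found)            # |R ∩ A|
    recall = correct / len(reference) if reference else 0.0
    precision = correct / len(found) if found else 0.0
    if precision == 0.0 and recall == 0.0:
        return recall, precision, 0.0
    f = (precision * recall) / ((1 - alpha) * precision + alpha * recall)
    return recall, precision, f
```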

5.3. Comparison with OAEI’s Participants

The participants from OAEI 2016, 2015, and 2014 are selected for comparison with our approach. In particular, if a matching system participated in more than one year, only its latest results are used. The harmonic mean comparison of the participants and ECGA-OEM is shown in Figure 8. The vertical axis represents the different matching systems, and the horizontal axis represents their corresponding scores. In terms of f-measure, ECGA-OEM ranks third, slightly lower than Lily and CroMatch; Wiki has been used as the linguistic measure in Lily and CroMatch, which improves their performance at the expense of efficiency. In addition, ECGA-OEM shows an unparalleled balance between precision and recall, whereas for the other participants one of the two is typically much higher than the other, which is very important when evaluating result quality.

A further f-measure comparison of the OAEI participants and ECGA-OEM over a total of 32 test cases is given. Figure 9 shows the numbers of participants that ECGA-OEM is superior, equal, and inferior to, respectively. The horizontal axis is the set of different test cases, and the vertical axis is the number of matching systems. In the vast majority of test cases, the number of matching systems that ECGA-OEM is superior to is much higher than the number it is inferior to. Only in cases No. 246, No. 247, and No. 254 is the ranking of ECGA-OEM relatively low (refer also to Table 4 for the specific f-measure value in each testing case).

In Table 4, the numbers from 1 to 18 in the first row denote edna, AML, CroMatch, Lily, LogMap, LogMapLt, Xmap, LogMapBio, AML-2014, Gmap, LogMap-C, Mamba, AOT-2014, AOTL, MassMatch, OMReasoner, RSDLWB, and Xmap2, respectively; ECGA-OEM is the matching system proposed by us. The value in each column represents the f-measure score of the matching system in the corresponding case. The f-measures of participants higher than that of ECGA-OEM are shown in bold, and the equal ones are underlined. The “+,” “=,” and “−” in the last column, respectively, indicate the number of matching systems that ECGA-OEM is superior, equal, and inferior to.

5.4. Comparison among GA, CGA, and ECGA

To verify the performance of linkage learning, we compare ECGA with GA and CGA. The detailed f-measure and runtime of the three competitors are, respectively, shown in Tables 5 and 6; all GA, CGA, and ECGA results are the mean values of 30 independent runs. It can be seen that replacing the crossover and mutation operators (GA) with the probability vector (CGA) improves the f-measure and significantly reduces the runtime: the average f-measure is slightly improved, while the average runtime is reduced from 31.975 seconds to 3.540 seconds, i.e., CGA takes only about 11% of GA’s runtime, which largely improves the algorithm’s efficiency. Except for testing cases No. 301 and No. 304, CGA is more stable than GA in terms of both f-measure and runtime, since its standard deviations are smaller. Testing cases No. 301 and No. 304 are representatives of real-world cases with unique heterogeneity, which makes the f-measure produced by CGA decrease slightly. Linkage learning, the technique applied in ECGA, further increases the f-measure score and reduces the average runtime to 1.749 seconds compared with CGA. ECGA also obtains a smaller standard deviation than CGA, which certifies its strong stability. It is worth noticing that the f-measure score of ECGA in testing cases No. 301 and No. 304 is almost the same as that of GA (only 0.003 lower in testing case No. 301, a minor difference that can be ignored), which demonstrates the ability of linkage learning to overcome CGA’s shortcoming.

6. Conclusions

Ontology matching can effectively solve the problem of data heterogeneity by discovering the correspondences between two ontologies’ entities. The compact encoding mechanism shows high efficiency in ontology matching, especially in ontology entity matching, and the linkage learning employed in the proposed ECGA-OEM produces qualified alignments. The experimental results have shown that our approach outperforms the participants of OAEI in terms of f-measure, recall, and precision. The comparison of GA, CGA, and ECGA in terms of f-measure and runtime shows that ECGA-OEM greatly reduces the runtime while maintaining the alignment’s quality.

In the future, we would like to apply linkage learning in other compact SIAs for better results. In addition, matching large-scale ontologies, such as the anatomy and large biomedical tracks in OAEI, is an open challenge in the domain of ontology matching, and we are interested in using the improved ECGA to match these large-scale ontologies.

Data Availability

The data used to support the findings of this study are available upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Natural Science Foundation of Fujian Province (no. 2020J01875) and the National Natural Science Foundation of China (nos. 61773415, 61801527, and 61103143).