1 Introduction
2 Background on model matching
- State-based approaches calculate changes by matching model elements in the input models with each other. Model matching—which is the main focus of our survey—explicitly occurs in this case. State-based approaches usually store changes as instances of the metamodel of the input models, or as instances of a difference metamodel [41, 127]. While there is often a significant computational complexity overhead (see the graph isomorphism problem [81]), these approaches are usually tool independent.
- Change-based approaches record changes as they occur, instead of calculating differences at a later time. Therefore, there is no need for model matching. The changes can often be run as model transformations on the initial model(s). In this case, we talk about operation-based approaches. These algorithms are usually easier to develop and comprehend [83], but are more tool dependent.
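To make the state-based style concrete, the following is a minimal sketch of computing a difference by matching the elements of two model snapshots via persistent identifiers; the `Element` type and all names are hypothetical and not taken from any particular tool:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Element:
    eid: str    # persistent identifier assigned by the modeling tool
    name: str
    etype: str

def state_based_diff(old, new):
    """Match elements of two snapshots by identifier, then derive the changes."""
    old_by_id = {e.eid: e for e in old}
    new_by_id = {e.eid: e for e in new}
    added = [e for e in new if e.eid not in old_by_id]
    removed = [e for e in old if e.eid not in new_by_id]
    changed = [(old_by_id[i], new_by_id[i])
               for i in sorted(old_by_id.keys() & new_by_id.keys())
               if old_by_id[i] != new_by_id[i]]
    return added, removed, changed

old = [Element("1", "Customer", "Class"), Element("2", "Order", "Class")]
new = [Element("1", "Client", "Class"), Element("3", "Invoice", "Class")]
added, removed, changed = state_based_diff(old, new)
# The rename Customer -> Client is recovered as a change, not as delete+add.
```

With identifier-based matching the rename is recovered trivially; without persistent identifiers, which is the hard case motivating much of the surveyed work, the matching step itself must establish the correspondence.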
- Accuracy: the algorithm focuses on making the matching process accurate, that is, on finding as many correct matches and as few incorrect matches as possible. Different metrics can be used to measure accuracy; we investigate these metrics later in Sect. 4.2.
- Generality: the algorithm focuses on being general, that is, on being usable with as many modeling languages and tools, and with as few technology-related modifications to the algorithm, as possible.
3 Systematic literature review
3.1 Planning the survey
3.1.1 Motivations
- Model matching is an important task in state-based model differencing and is useful in other areas as well, e.g., in model transformation testing.
- There exist few surveys that focus on the categorization of model matching algorithms, especially considering text-based algorithms.
- Finding out why there are fewer text-based model matching algorithms, and how they compare with graph-based algorithms.
- Analyzing the aforementioned topics can help researchers in identifying trends and open questions in the field of model matching.
3.1.2 Scope and goals
Criterion | Description |
---|---|
Population | Researchers and practitioners using model-driven methodologies |
Intervention | Analysis of existing matching approaches, with an additional focus on text-based model matching |
Comparison | Differences between different matching techniques, and the comparison of text-based and graph-based approaches |
Outcomes | Better comprehension of the state-of-the-art, the identification of current research trends and possible avenues for future research |
Context | Academic research related to, and industrial applications of, model matching |
- The population represents who or what benefits from the SLR. This can be a specific software engineering role, a category of software engineers, an application area, or an industrial group.
- The intervention is a software methodology/tool/technology/procedure that addresses a specific issue.
- The comparison is the software methodology/tool/technology/procedure with which the intervention is being compared.
- Outcomes relate to factors of importance to practitioners such as improved reliability or reduced production cost.
- The context is the setting in which the comparison takes place (e.g., academia or industry), including the participants taking part in the study (e.g., practitioners or academics) and the tasks being performed (e.g., small scale or large scale).
Inclusion | Exclusion |
---|---|
Study is a peer-reviewed journal article, conference or workshop paper, or a tech report | Study is not a peer-reviewed journal article, conference or workshop paper, or a tech report (e.g., books, master's theses, doctoral dissertations) |
Study is related to model-driven methodologies | Study is not related to model-driven methodologies |
Study contains research that is related to model matching. The presented algorithm must be able to support the matching of models that conform to the same metamodel. Alternatively, the study focuses on the evaluation of, or on open questions related to model matching | Study does not contain research related to model matching, or it presents an algorithm that does not support the matching of models that conform to the same metamodel. Moreover, it does not focus on the evaluation of, or on open questions related to model matching |
Study contains research that answers at least one of our research questions | Study is not related to any of our research questions |
– | Study is not a primary study, for example, it is a survey. However, references of related secondary studies are checked during the snowballing process |
– | Study is not written in English |
– | Study is not available online (for the authors of this survey) |
- Present the state-of-the-art of model matching algorithms.
- Discover the fundamental differences between matching techniques in order to identify research trends and future research directions.
- Conduct a survey on text-based model matching approaches, and compare them to graph-based ones.
- Identify open questions and challenges in the field of model matching.
3.1.3 Research questions
- RQ1 What is the state-of-the-art of model matching?
- RQ1.1 How common are graph-based and text-based model matching approaches? How do they compare with each other?
- RQ1.2 What model matching techniques are most often used? What are the main differences between them?
- RQ1.3 How often are state-based and operation-based change tracking used? What are the main differences between the two, with regard to the focus of the algorithms using them?
- RQ2 How can model matching approaches be evaluated? What are the main considerations for evaluation?
- RQ3 What are the main open questions related to the field of model matching?
- RQ4 How has interest in model matching developed over time?
3.1.4 Inclusion and exclusion criteria
3.1.5 Search strategy
- MODELS (International Conference on Model Driven Engineering Languages and Systems)
- ME @MODELS (Models and Evolution Workshop) and its predecessors MoDSE and MoDSE-MCCM
- MODELSWARD (International Conference on Model-Driven Engineering and Software Development)
- CVSM (Comparison and Versioning of Software Models)
- ACM DL: (+text +model +diff) OR ((keywords.author.keyword:(+model +match) OR keywords.author.keyword:(+model +merge) OR keywords.author.keyword:(+model +difference) OR keywords.author.keyword:(+model +compare) OR keywords.author.keyword:(+model +comparison) OR keywords.author.keyword:(+model +versioning) OR keywords.author.keyword:(+model +composition) OR keywords.author.keyword:(+model +union)) AND domain-specific model-driven model-based UML)
- IEEE Xplore: ((model NEAR/2 differencing OR model NEAR/2 comparing OR “model comparison” OR model NEAR/2 matching OR model NEAR/2 merging OR “model composition” OR model NEAR/2 versioning OR model NEAR/2 union) AND (“UML” OR “model-driven” OR “model-based engineering” OR “domain-specific”))
3.2 Conducting the survey
3.2.1 Study selection process
| Automatic search | Manual search | Snowballing | Overall |
---|---|---|---|---|
Total | 1139 | 1179 | 2375 | 4693 |
Included | 34 | 20 | 65 | 119 |
Surveys | 0 | 1 | 5 | 6 |
Duplicates | 95 | 15 | 1309 | 1419 |
Excluded (soft) | 945 | 1102 | 883 | 2930 |
Excluded (hard) | 65 | 41 | 113 | 219 |
3.2.2 Extracted data

Field | Possible values |
---|---|
Type (RQ1.1) | Graph/text |
Matching (RQ1.2) | Static/dynamic/similarity/custom |
Change tracking (RQ1.3) | Operation-based/state-based |
Focus (RQ1) | Accuracy/generality/both |
Evaluation (RQ2) | Yes/no |
Open questions (RQ3) | Yes/no |
Year (RQ4) | – |
Author(s) | – |
Version control systems | Yes/no |
- Type (RQ1.1) The main type of the matching algorithm, namely, whether it works using the graph-based structure or the textual representation of the models. Our definition of a textual representation can be found in Sect. 2.
- Change tracking (RQ1.3) State-based or operation-based change tracking, as discussed in Sect. 2.
- Focus (RQ1) The main goal and focus of the algorithm. We consider the main ones to be accuracy and generality (see Sect. 2 for details). If an algorithm focuses on both of them, the possible trade-off for doing so is also marked during data extraction. An algorithm can also focus on other aspects, like the performance or scalability of the algorithm.
- Evaluation (RQ2) Whether the study contains information on the evaluation of model matching approaches.
- Open Questions (RQ3) Whether the study contains information regarding open questions in the field of model matching. This excludes open questions that are only related to the presented algorithm in the paper, like specific improvements to the mentioned algorithm.
- Year (RQ4) The year the study was published.
- Version Control Systems Whether the study focuses on model matching in the context of version control systems. These data are only used to support our view that version control systems are one of the primary motivating factors for researching model matching.
Algorithm | Studies |
---|---|
AMOR | S6, S18, S111 |
Chaouni et al. | S16, S63 |
Dijkman et al. | S36, S48, S64, S66, S103, S106 |
Ehrig et al. | S65, S97 |
EMFCompare | S81, S90, S98 |
Epsilon merging language | S68, S78, S116 |
Fleurey et al. | S7, S80, S92 |
Gerth et al. | S33, S34, S35, S56 |
Koegel et al. | S87, S88, S110, S112 |
Nejati et al. | S58, S59 |
Oda and Saeki | S52, S74 |
Ohst et al. | S42, S71 |
Oliveira et al. | S2, S76, S86 |
SiDiff | S9, S17, S39, S43, S113 |
Somogyi and Asztalos | S50, S70 |
UMLDiff | S44, S115 |
Zhang et al. | S26, S27 |
4 Results and discussion
4.1 State-of-the-art (RQ1)
4.1.1 Text-based model matching (RQ1.1)
The studies we found often refer to text-based model matching as either raw text differencing [19, 162, 170] (e.g., GNU Diffutils [98] or KDiff [71]), or as matching over the XML-based serialization of the models [30, 146] (e.g., XMI [167]). The former has very low accuracy in the case of models due to their structural differences compared with source code, while the latter is often dismissed due to its low abstraction level, which also makes it less accurate. In Sect. 2, we defined the textual representation of a model as a textual notation that is mapped to the model by well-defined mechanisms, and we excluded XML-based serialization from this definition. During the categorization of the studies, we used this definition to label algorithms as text-based. As we have discussed before, the textual notation has certain advantages, such as supporting off-line editing and better comprehensibility when focusing on details, especially when compared with XMI [67, 68, 144].

RQ1.1: How common are graph-based and text-based model matching approaches? How do they compare with each other?
- Lack of text-based algorithms There is a lack of text-based model matching algorithms, and an even greater lack of general text-based algorithms.
- Motivations behind using text-based algorithms The main motivations we found behind text-based model matching are (i) using the simplicity of raw text differencing and enhancing it to be suitable for models [16, 59, 157], (ii) aiming for better accuracy at a higher configuration cost [127, 136], and (iii) supporting the textual notation of the models by solving a related, specific problem [19, 157].
- Generality comes with configuration cost Text-based model matching algorithms tend to focus either on accuracy, or on both accuracy and generality. The latter often comes with a compromise in the form of an increased configuration cost for the algorithm and for the mapping of the textual representation. Only when these costs are acceptable can text-based algorithms reliably be used in a more general way in practice. Thus, they are either accurate (usually with a single language in mind), or have a higher configuration cost so that they can be used generally. However, more research is needed on the trade-off between the configuration cost and the generality/accuracy of text-based algorithms, as no study we found dealt with this problem in detail. As a final note, the idea of supporting both accuracy and generality at a higher configuration cost is also present in graph-based algorithms.
- Quantifying the configuration cost? What metrics can we use to express the effort required to use a text-based matching algorithm with a particular modeling language? Can it be expressed using standard metrics? The studies we found did not discuss this problem.
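To illustrate why raw text differencing alone is a poor fit for models, consider a minimal, hypothetical textual notation diffed line by line with Python's standard difflib. The rename below surfaces only as an unrelated deletion and insertion, so the element-level match is lost:

```python
import difflib

# Two textual snapshots of the same model (hypothetical notation):
# version 2 reorders the elements and renames Customer to Client.
v1 = ["class Customer", "class Order", "class Invoice"]
v2 = ["class Order", "class Client", "class Invoice"]

diff = [line for line in difflib.unified_diff(v1, v2, lineterm="")
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))]
# diff == ["-class Customer", "+class Client"]
```

A dedicated matcher would instead pair Customer with Client (e.g., by name similarity) and report a rename; in larger examples, a plain diff can likewise report reordered but otherwise unchanged elements as edits.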
4.1.2 Matching techniques (RQ1.2)
Figure 6 summarizes the main matching techniques used by the algorithms we found, according to the categorization detailed in Sect. 2. Note that the numbers on the figure sum to more than the number of unique algorithms, because algorithms using more than one matching technique are counted once for every technique they use. We can see that similarity-based matching is the most popular, followed by static and dynamic matching. Operation-based algorithms usually use no matching technique (hence the category none on the figure). Please note that the number of operation-based algorithms (15, as detailed in Sect. 4.1.3) is higher than the number of algorithms using no matching technique (14), as one particular operation-based algorithm we found [157] uses raw text matching as a form of origin tracking between model and text.

RQ1.2: What model matching techniques are most often used? What are the main differences between them?
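As a minimal sketch of the similarity-based category, the following greedily pairs elements whose name similarity exceeds a threshold; the threshold, the string-ratio measure, and all element names are hypothetical and not taken from any surveyed algorithm:

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Name similarity in [0, 1], based on the ratio of matching characters."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_by_similarity(left, right, threshold=0.6):
    """Greedy best-first matching: highest-scoring pairs above the threshold win."""
    candidates = sorted(((similarity(l, r), l, r) for l in left for r in right),
                        reverse=True)
    pairs, used_l, used_r = [], set(), set()
    for score, l, r in candidates:
        if score >= threshold and l not in used_l and r not in used_r:
            pairs.append((l, r, round(score, 2)))
            used_l.add(l)
            used_r.add(r)
    return pairs

# "OrderItem"/"OrderItems" is matched; "Customer"/"Client" falls below the threshold.
pairs = match_by_similarity(["OrderItem", "Customer"], ["OrderItems", "Client"])
```

Real similarity-based matchers typically combine name similarity with structural context (neighbors, types, attributes) and weighting; the greedy pairing above is only the simplest instance of the idea.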
4.1.3 Change tracking (RQ1.3)
Out of the 72 algorithms the survey identified, the majority (75%) were state-based, some were operation-based (~ 21%), while a small number of algorithms (~ 4%) used both change tracking approaches. Figure 8 depicts the main focus of state-based and operation-based algorithms. Contrary to our preliminary expectations, the operation-based algorithms we found are split evenly with regard to the focus of the algorithm. State-based algorithms tend to focus more often on accuracy (50%). Thus, in general, we can draw two main conclusions from these results: (i) state-based approaches are more commonly used, and (ii) there is no significant correlation between the change tracking method used and the main focus of the algorithm.

RQ1.3: How often are state-based and operation-based change tracking used? What are the main differences between the two, with regard to the focus of the algorithms using them?
4.2 Evaluation techniques (RQ2)
The evaluation and benchmarking of model matching, differencing, and merging algorithms is considered to be a difficult task in the literature [8, 152]. Most approaches that we found focus on the accuracy of the matching process in some way [1, 4‐7, 27, 44, 47, 48, 108, 109, 152, 158, 161, 169]. Metrics like precision, recall, or F-measure that are known from pattern recognition and machine learning [26] are often used to measure accuracy. Precision is usually the measure of exactness or fidelity; it is equivalent to the ratio of correctly detected matching pairs to all detected pairs. Recall is often used as the measure of completeness; it is equivalent to the ratio of correctly detected matching pairs to the number of all correct matching pairs. Instead of these values, the mean average precision and recall are often used [47, 48]. These metrics are regularly used in the case of structural models, but they can also be defined for behavioral models. In addition to being used to measure matching accuracy, these metrics can also be used for measuring the accuracy of conflict detection or resolution. The concept is similar: finding the ratio of correctly identified matching pairs or conflicts [169]. As a final note on the topic of conflict detection, Koegel et al. [84] evaluated operation-based conflict detection approaches by running the changes in a batch-mode on predefined models, and evaluating their results manually. Conflicting and non-conflicting changes were separated in their approach.

RQ2: How can model matching approaches be evaluated? What are the main considerations for evaluation?
- Accuracy metrics are established Matching accuracy is often used to evaluate matching algorithms. It can be measured with metrics borrowed from pattern recognition: precision, recall, and F-score. The accuracy of conflict detection and resolution can also be measured using these metrics. Performance and scalability are also common factors, although large-scale models are somewhat neglected in this regard.
- Lack of automated benchmarking Benchmarking is often mentioned as one of the main challenges in the field of model matching. We found that there is a need for the automatic generation of technology-independent benchmarking model datasets.
- Lack of empirical studies Empirical user studies are rare, although they can be used to verify certain aspects of an algorithm (e.g., human effort required to use) that would otherwise be difficult to quantify.
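The accuracy metrics discussed above can be computed directly over sets of matching pairs; a minimal sketch with hypothetical detected and reference matches:

```python
def precision_recall_f1(detected, reference):
    """Precision, recall, and F-measure over sets of (left, right) match pairs."""
    detected, reference = set(detected), set(reference)
    tp = len(detected & reference)                 # correctly detected pairs
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(reference) if reference else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# 3 detected pairs, 2 of them correct; 4 correct pairs exist in the reference.
detected = {("a1", "b1"), ("a2", "b2"), ("a3", "b9")}
reference = {("a1", "b1"), ("a2", "b2"), ("a4", "b4"), ("a5", "b5")}
p, r, f = precision_recall_f1(detected, reference)
# p == 2/3 (exactness), r == 0.5 (completeness)
```

Here precision mirrors the "ratio of correctly detected matching pairs to all detected pairs" and recall the "ratio of correctly detected matching pairs to all correct matching pairs" from the definitions above.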
4.3 Open questions (RQ3)
In the earlier years of research in model differencing and merging, there was a clear need for general algorithms that did not rely on static identifiers [3, 8]. Our findings confirmed that this is no longer the case, as there are numerous algorithms that focus only on generality and do not rely on static identifiers (16 that we found, see Sect. 4.1.2 for details).

RQ3: What are the main open questions related to the field of model matching?
- Relatively low industrial adoption Existing algorithms see low tool adoption. Most algorithms have a working prototype, but industrial applications are rare.
- Inadequate change visualization Although only loosely related to the field of model matching, difference and conflict visualization is not yet adequate; the concrete syntax of the models should be used for this purpose more often.
- Intention-aware model versioning Intention-aware model versioning is a possible future research direction where the result of a model merge is based on the intentions of the developers committing the changes.
- Semantic model matching Semantic model matching is a relatively new research direction with multiple problems: algorithmic complexity, difference representation, and integration with syntactic matching.
- Lack of benchmarking datasets Benchmarking datasets are uncommon and are usually technology dependent. Automatically generated and technology-independent datasets are needed in order to more objectively evaluate algorithms.
- Large-scale model matching Large-scale model matching needs more research. The trade-off between performance and accuracy also merits further research.
- General algorithms for behavioral models? General model matching algorithms do not seem to perform well regarding accuracy, as behavioral models seem to rely too heavily on semantics. Can more general algorithms be developed for behavioral models?
- Further analysis of text-based algorithms Investigating the differences between text-based and graph-based algorithms, and identifying and quantifying the aspects in which they differ, merits further research. This also includes analyzing the correlation between the configuration cost and the accuracy/generality of text-based algorithms, discussed in Sect. 4.1.1.
4.4 Interest in model matching (RQ4)
Figure 9 illustrates every included primary study per year, from the earliest published paper (2001) to the latest one (2018). We can see that the apex of research related to model matching was between 2007 (16 studies) and 2009 (20 studies). Since then, a steady decline in the frequency of published papers can be observed. Nevertheless, we believe that the open questions and challenges we identified in this survey are still relevant and merit further research.

RQ4: How has interest in model matching developed over time?