2016 | Book

Model and Data Engineering

6th International Conference, MEDI 2016, Almería, Spain, September 21-23, 2016, Proceedings

Edited by: Ladjel Bellatreche, Óscar Pastor, Jesús M. Almendros Jiménez, Yamine Aït-Ameur

Publisher: Springer International Publishing

Book Series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of the 6th International Conference on Model and Data Engineering, MEDI 2016, held in Almería, Spain, in September 2016.

The 17 full papers and 10 short papers presented together with 2 invited talks were carefully reviewed and selected from 62 submissions. The papers cover a wide spectrum, ranging from fundamental contributions to applications, tool developments, and improvements in model and data engineering activities.

Table of Contents

Frontmatter
Towards OntoUML for Software Engineering: Transformation of Anti-rigid Sortal Types into Relational Databases
Abstract
OntoUML is an ontologically well-founded conceptual modelling language that distinguishes various types of classifiers and relations, providing precise meaning to the modelled entities. Efforts are under way to incorporate OntoUML into the Model-Driven Development approach as a conceptual modelling language for the PIM of application data. In our previous research, we outlined our approach to the transformation of an OntoUML PIM into an ISM of a relational database. A parallel paper discusses the details of the transformation of Rigid Sortal Types, while this paper focuses on the transformation of Anti-rigid Sortal Types.
Zdeněk Rybola, Robert Pergl
Automatic Generation of Ecore Models for Testing ATL Transformations
Abstract
Model transformation testing is crucial to detect incorrect transformations. Buggy transformations can lead to incorrect target models, either violating target meta-model requirements or more complex target model properties. In this paper we present a tool for testing ATL transformations. This tool is an extension of a previously developed tool for testing XML-based languages. To this end, an Ecore to XML Schema transformation is defined, which makes it possible to automatically generate random Ecore models. These randomly generated Ecore models are used to test ATL transformations. Properties to be tested are specified by OCL constraints describing input and output conditions on source and target models, respectively.
Jesús M. Almendros-Jiménez, Antonio Becerra-Terón
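The testing pattern described — generate random input models, filter by an input condition, transform, check an output condition — can be illustrated outside ATL/Ecore with a small property-based test loop. A minimal Python sketch; the two-class "metamodel", the toy transformation, and the OCL-like predicates are all hypothetical, not the tool's actual API:

```python
import random
import string

# Hypothetical, tiny "metamodel": a Library contains Books with a title and pages.
def random_book():
    return {"title": "".join(random.choices(string.ascii_lowercase, k=8)),
            "pages": random.randint(0, 1500)}

def random_library(max_books=10):
    return {"books": [random_book() for _ in range(random.randint(0, max_books))]}

# OCL-like input condition on the source model:
# context Library inv: self.books->forAll(b | b.pages > 0)
def input_condition(lib):
    return all(b["pages"] > 0 for b in lib["books"])

# A toy "transformation" under test: keep only long books.
def transform(lib):
    return {"longBooks": [b for b in lib["books"] if b["pages"] >= 300]}

# Output condition on the target model:
# context Target inv: self.longBooks->forAll(b | b.pages >= 300)
def output_condition(target):
    return all(b["pages"] >= 300 for b in target["longBooks"])

# Test loop: generate, filter by the precondition, check the postcondition.
for _ in range(1000):
    model = random_library()
    if input_condition(model):
        assert output_condition(transform(model)), f"counterexample: {model}"
```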
Towards a Methodological Tool Support for Modeling Security-Oriented Processes
Abstract
Development processes for software construction are common knowledge and widely used in most development organizations. Unfortunately, these processes often offer little or no support for meeting security requirements. In our work, we propose a methodology to build domain-specific process models with security concepts on the foundations of industry-relevant security approaches, backed by a security-oriented process model specification language. Instead of building domain-specific security-oriented process models from the ground up, the methodology allows process designers to fall back on existing, well-established security approaches and to add domain-relevant concepts and repository-centric approaches, as well as supplementary information security risk management standards (e.g., Common Criteria), to fulfill the demand for secure software engineering. Supplementary and/or domain-specific concepts can be added through our process modeling language in an easy and direct way. The methodology and the process modeling language we propose have been successfully evaluated by the TERESA project for specifying development processes for trusted applications and integrating security concepts into existing process models used in the railway domain.
Jacob Geisel, Brahim Hamid, David Gonzales, Jean-Michel Bruel
ResilientStore: A Heuristic-Based Data Format Selector for Intermediate Results
Abstract
Large-scale data analysis is an important activity in many organizations that typically requires the deployment of data-intensive workflows. As data is processed, these workflows generate large intermediate results, which are typically pipelined from one operator to the next. However, if materialized, these results become reusable, so subsequent workflows need not recompute them. Many solutions already materialize intermediate results, but all of them assume a fixed data format. A fixed format, however, may not be the optimal one for every situation. For example, it is well known that different data fragmentation strategies (e.g., horizontal and vertical) behave better or worse according to the access patterns of the subsequent operations. In this paper, we present ResilientStore, which assists in selecting the most appropriate data format for materializing intermediate results. Given a workflow and a set of materialization points, it uses rule-based heuristics to choose the best storage data format based on subsequent access patterns. We have implemented ResilientStore for HDFS and three different data formats: SequenceFile, Parquet and Avro. Experimental results show that our solution gives 18% better performance than any solution based on a single fixed format.
Rana Faisal Munir, Oscar Romero, Alberto Abelló, Besim Bilalli, Maik Thiele, Wolfgang Lehner
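The rule-based idea can be pictured as a small decision table over access-pattern features of the operators that will read the materialized result. The feature names and rules below are illustrative assumptions for this sketch, not ResilientStore's actual heuristics:

```python
# Illustrative format selector: maps access-pattern features of the readers
# of a materialized intermediate result to a storage format. The rules are
# assumptions for this sketch, not the paper's actual heuristics.
def select_format(reads_few_columns: bool,
                  scans_whole_rows: bool,
                  needs_schema_evolution: bool) -> str:
    if reads_few_columns and not scans_whole_rows:
        return "Parquet"        # columnar layout favors column projections
    if needs_schema_evolution:
        return "Avro"           # row-oriented, with rich schema-evolution support
    return "SequenceFile"       # simple row-oriented default on HDFS

print(select_format(reads_few_columns=True,
                    scans_whole_rows=False,
                    needs_schema_evolution=False))   # -> Parquet
```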
Bulk-Loading xBR\(^+\)-trees
Abstract
Spatial indexes are important in spatial databases for efficient execution of queries involving spatial constraints. The xBR\(^+\)-tree is a balanced disk-resident quadtree-based index structure for point data, which is very efficient for processing such queries. Bulk-loading refers to the process of creating an index from scratch as a whole, when the dataset to be indexed is available beforehand, instead of creating (loading) the index gradually, when the dataset items are available one-by-one. In this paper, we present an algorithm for bulk-loading xBR\(^+\)-trees for big datasets residing on disk, using a limited amount of RAM. Moreover, using real and artificial datasets of various cardinalities, we present an experimental comparison of this algorithm vs. the algorithm loading items one-by-one, regarding performance (I/O and execution time) and the characteristics of the xBR\(^+\)-trees created. We also present experimental results regarding the efficiency of bulk-loaded xBR\(^+\)-trees vs. xBR\(^+\)-trees where items are loaded one-by-one for query processing.
George Roumelis, Michael Vassilakopoulos, Antonio Corral, Yannis Manolopoulos
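A generic flavor of such bulk loading — sort the whole dataset once, then pack full leaves sequentially instead of inserting points one by one — can be sketched as follows. The Morton/Z-order sort used here is an illustrative stand-in, not the xBR\(^+\)-tree algorithm itself:

```python
import random

def interleave(x: int, y: int, bits: int = 16) -> int:
    """Morton (Z-order) code: interleave the bits of x and y."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z

def bulk_load_leaves(points, leaf_capacity=64):
    """Sort points once by Z-order, then cut the sorted run into full leaves,
    so leaves are built sequentially rather than by one-by-one insertion."""
    pts = sorted(points, key=lambda p: interleave(p[0], p[1]))
    return [pts[i:i + leaf_capacity] for i in range(0, len(pts), leaf_capacity)]

# Example: 1000 random points packed into ~16 spatially coherent leaves.
points = [(random.randrange(1 << 16), random.randrange(1 << 16)) for _ in range(1000)]
leaves = bulk_load_leaves(points)
print(len(leaves), len(leaves[0]))
```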
A Meta-advisor Repository for Database Physical Design
Abstract
Physical design is one of the crucial phases of the database design life cycle, due to its important role in selecting optimization structures such as materialized views, indexes, and partitioning to speed up query performance. This phase has been amplified by the continual need to store and manage the deluge of data in storage systems efficiently. This situation has motivated the vendors of commercial and non-commercial Database Management Systems to propose tools (called advisors, e.g., SQL Tuning Advisor for Oracle and Parinda for PostgreSQL) to assist database administrators in selecting the relevant optimization structures for a given database/data warehouse schema and workload. The maturity of research in physical design motivates us to go further and capitalize on the knowledge and expertise used by the research community: the processes, the algorithms, the cost models used to quantify the benefit of the selected optimization structures, etc. In this paper, we first propose a physical design language called PhyDL that describes all inputs and outputs of the physical design phase. Secondly, to increase the reuse of existing advisors, we elaborate a repository called Meta-Advisor that persists all components of the physical design. Finally, a case study is presented to stress the meta-advisor repository and highlight its importance.
Abdelkader Ouared, Yassine Ouhammou, Amine Roukh
Linked Service Selection Using the Skyline Algorithm
Abstract
Recently, resource-oriented computing has changed the way Web applications are designed. Because of the increasing number of APIs, centralized repositories are no longer a viable option for discovery. As a consequence, a decentralized approach is needed to enable value-added applications. In this paper, we propose a client-side QoS-based selection algorithm that can be executed during the discovery stage. Our solution provides different alternatives based on the skyline approach to select resources while maintaining acceptable time performance.
Mahdi Bennara, Michael Mrissa, Youssef Amghar
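The core of skyline selection is Pareto dominance over QoS vectors: a candidate is kept only if no other candidate is at least as good on every attribute and strictly better on one. A minimal sketch, assuming all QoS attributes are to be minimized (e.g., latency, price) and using the naive nested-loops variant rather than any optimized algorithm:

```python
def dominates(a, b):
    """a dominates b if a is no worse in every QoS attribute and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(services):
    """Keep every service not dominated by any other (naive O(n^2) loop)."""
    return [s for s in services
            if not any(dominates(t["qos"], s["qos"]) for t in services)]

# Hypothetical candidates discovered on the client side: (latency ms, price).
services = [{"uri": "/a", "qos": (120, 0.05)},
            {"uri": "/b", "qos": (80, 0.09)},
            {"uri": "/c", "qos": (200, 0.20)}]   # dominated by both /a and /b
print([s["uri"] for s in skyline(services)])      # -> ['/a', '/b']
```

The skyline returns the incomparable trade-offs (/a is cheaper, /b is faster), which is exactly why it suits selection without a single fixed ranking of criteria.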
Toward Multi Criteria Optimization of Business Processes Design
Abstract
In enterprises, optimization means making business decisions by varying parameters to maximize profit and reduce loss. We focus on business process design optimization, known as the problem of creating feasible business processes while optimizing criteria such as resource cost and execution time. In this paper, we propose an approach that focuses on the tasks composing a business process, their resources and attributes, rather than on a full representation of a business process, for its evaluation according to certain criteria. The main contribution of this work is a framework capable of (i) generating business processes using an enhanced version of the evolutionary algorithm NSGA-II, (ii) verifying the feasibility of each created business process using an effective algorithm, and (iii) selecting Pareto-optimal solutions in a multi-criteria optimization environment with up to three criteria, using an effectual fitness function. Experimental results show that our proposal generates efficient business processes in terms of qualitative parameters compared with existing solutions.
Nadir Mahammed, Sidi Mohamed Benslimane
Semantic-Enabled and Hypermedia-Driven Linked Service Discovery
Abstract
Automating the discovery and composition of RESTful services with the help of semantic Web technologies is a key challenge in exploiting today's Web potential. In this paper, we show how semantic annotations on resource descriptions can drive discovery algorithms on the Web. We propose a semantically-enabled variant of the BFS discovery algorithm that aims at minimizing the number of links explored while maximizing result diversity. Our algorithm calculates semantic distances between resource descriptions and user request concepts to rank explored resources accordingly. We demonstrate the applicability of our solution with a typical scenario and provide an evaluation with a prototype.
Mahdi Bennara, Michael Mrissa, Youssef Amghar
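The variant described — explore links in order of semantic closeness to the request — amounts to best-first search with a priority queue keyed by a semantic distance. A minimal sketch with a hypothetical `semantic_distance` function and an in-memory link graph (both assumptions, since the paper's distance computation and hypermedia traversal are not reproduced here):

```python
import heapq

def discover(start, links, semantic_distance, request, budget=50):
    """Best-first exploration: always follow the resource whose description
    is semantically closest to the user request. `links` maps a resource to
    its hyperlinked neighbours; `semantic_distance` is assumed given."""
    frontier = [(semantic_distance(start, request), start)]
    visited, results = set(), []
    while frontier and len(visited) < budget:
        dist, res = heapq.heappop(frontier)
        if res in visited:
            continue
        visited.add(res)
        results.append((res, dist))
        for nxt in links.get(res, []):
            if nxt not in visited:
                heapq.heappush(frontier, (semantic_distance(nxt, request), nxt))
    return sorted(results, key=lambda r: r[1])   # ranked by closeness

links = {"/root": ["/a", "/b"], "/a": ["/c"]}
dist = lambda res, req: abs(len(res) - len(req))  # stand-in semantic distance
print(discover("/root", links, dist, request="/ab"))
```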
Multi-level Networked Knowledge Base: DDL-Reasoning
Abstract
This paper describes a new formalism based on multi-level networked knowledge (MLNK), a combination of different ontologies describing heterogeneous and complementary domains, aligned with semantic correspondences. Ontology alignments make explicit the correspondences between terms from different ontologies and must be taken into account in reasoning. Two explicit forms of correspondences are given: mappings, which represent predefined relations such as subsumption, equivalence, or disjointness that have a fixed semantics in all interpretations; and links, which can relate complementary ontologies by introducing terms defined by experts and whose semantics varies according to interpretations. The proposed MLNK formalism can be transformed into a distributed system capable of supporting DDL semantics. It permits contextual reasoning, where ontologies and pairwise ontology alignments are developed in different and incompatible contexts. The semantics of the proposed formalism is extensively described, along with an illustrative example.
Sihem Klai, Antoine Zimmermann, Med Tarek Khadir
Maintenance of Profile Matchings in Knowledge Bases
Abstract
A profile describes a set of properties, e.g. a set of skills a person may have or a set of skills required for a particular job. Profile matching aims to determine how well a given profile fits a requested profile. Profiles can be defined by filters in a lattice of concepts derived from a knowledge base that is grounded in description logic, and matching can be realised by assigning values in [0,1] to pairs of such filters: the higher the matching value, the better the fit. This paper investigates whether, given a set of filters together with matching values determined by a human expert, a matching measure can be determined such that the computed matching values preserve the rankings given by the expert. Plausibility constraints for the expert-given values are formulated; if these constraints are satisfied, the problem of determining a ranking-preserving matching measure can be solved.
Jorge Martinez-Gil, Lorena Paoletti, Gábor Rácz, Attila Sali, Klaus-Dieter Schewe
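Concretely, a matching measure over filters can be approximated by a weighted overlap of concept sets, yielding values in [0,1]. This sketch is only a generic example of such a measure; the paper's actual measure is derived from expert-given values and lattice structure, which this code does not attempt to reproduce:

```python
# Profiles approximated as sets of concepts from the knowledge base; a
# weighted-overlap measure yields values in [0, 1]: 1.0 means every
# requested concept is covered by the given profile.
def matching_value(given: set, requested: set, weight=None) -> float:
    if not requested:
        return 1.0
    weight = weight or (lambda c: 1.0)          # uniform weights by default
    covered = sum(weight(c) for c in requested & given)
    total = sum(weight(c) for c in requested)
    return covered / total

job = {"java", "sql", "uml"}                    # requested profile
candidate = {"java", "uml", "python"}           # given profile
print(matching_value(candidate, job))           # -> 0.666...
```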
Distributed Reasoning for Mapped Ontologies Using Rewriting Logic
Abstract
The Web Ontology Language (OWL) implicitly maps interconnected ontologies through its mechanism of ontology importation and uses a global view (a global reasoning procedure) for their interpretation. In this paper, we generate explicit context mappings from these same heterogeneous and interconnected OWL ontologies to adopt a local-view interpretation. Each separate OWL ontology is transformed into a Maude module based on Rewriting Logic (RL), which internally executes a local reasoning procedure, developed in Maude itself, as an extension of the standard Description Logic tableau. The combination of these local-reasoning Maude modules creates a distributed reasoning system for the heterogeneous and contextualized OWL ontologies.
Mustapha Bourahla
Towards a Formal Validation of ETL Patterns Behaviour
Abstract
ETL systems have been the target of many research efforts to support their development and implementation. In the last few years, we presented a pattern-oriented approach to developing these systems. Basically, patterns comprise a set of abstract components that can be configured to enable their instantiation for specific scenarios. Even when high-level components are used, ETL systems are very specific processes that represent complex data requirements and transformation routines. Several operational requirements need to be configured, and system correctness is hard to validate, which can result in implementation problems. In this paper, a set of formal specifications in Alloy is presented to express the structural constraints and behaviour of a slowly changing dimension pattern. Specific physical models can then be generated from the formal specifications and constraints defined in an Alloy model, helping to ensure the correctness of the configuration provided.
Bruno Oliveira, Orlando Belo, Nuno Macedo
Building OLAP Cubes from Columnar NoSQL Data Warehouses
Abstract
The work presented in this paper aims to build OLAP cubes from big data warehouses implemented using the columnar NoSQL model. The use of NoSQL models is motivated by the inability of the relational model, usually used to implement data warehouses, to scale easily. Indeed, the columnar NoSQL model is suitable for storing and managing massive data, especially for decisional queries. However, column-oriented NoSQL DBMSs do not offer online analytical processing (OLAP) operators. Our main contribution is to define a new cube operator called MC-CUBE (MapReduce Columnar CUBE), which builds columnar NoSQL cubes by taking into account the non-relational and distributed aspects of the stored data warehouses.
Khaled Dehdouh
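The cube computation itself — aggregating a fact table over every subset of the dimension attributes — can be phrased in a MapReduce style. A minimal single-process sketch of the classic CUBE semantics, not of the MC-CUBE operator or its columnar/distributed execution:

```python
from itertools import combinations
from collections import defaultdict

def cube(rows, dims, measure):
    """Emit (group-by key -> SUM(measure)) for every subset of `dims`,
    mimicking SQL's CUBE. '*' stands for an aggregated-out dimension."""
    acc = defaultdict(float)                       # the "reduce" side
    for row in rows:                               # the "map" side
        for k in range(len(dims) + 1):
            for subset in combinations(dims, k):
                key = tuple(str(row[d]) if d in subset else "*" for d in dims)
                acc[key] += row[measure]
    return dict(acc)

rows = [{"city": "Almeria", "year": 2016, "sales": 10.0},
        {"city": "Almeria", "year": 2015, "sales": 5.0},
        {"city": "Poitiers", "year": 2016, "sales": 7.0}]
for key, total in sorted(cube(rows, ("city", "year"), "sales").items()):
    print(key, total)    # e.g. ('*', '*') 22.0 ... ('Almeria', '*') 15.0 ...
```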
On Representing Interval Measures by Means of Functions
Abstract
Multiple applications, e.g., energy consumption meters and temperature or pressure sensors, generate series of discrete data. Such data have two characteristics: they are naturally ordered by time, and they are frequently represented as intervals. Most research contributions, commercial software, and prototypes either (1) analyze set-oriented data, neglecting order and duration, or (2) represent intervals as discrete collections of points stored in tables. In this paper, based on our interval OLAP data model, we propose a method for representing interval data by means of functions and show that it is feasible to aggregate such data along hierarchical dimensions, in an OLAP-like style. To this end, we implemented a micro-prototype using Oracle PL/SQL. Its experimental evaluation showed that the concept is more space-efficient and offers better performance than traditional approaches for some classes of analytical queries.
Gastón Bakkalian, Christian Koncilia, Robert Wrembel
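The function-based representation can be pictured with step functions: an interval measure becomes a function that is constant over its validity interval, and aggregation is point-wise. A minimal sketch assuming piecewise-constant measures with sorted breakpoints (the paper's model and PL/SQL prototype are not reproduced here):

```python
import bisect

class StepFunction:
    """Piecewise-constant function: values[i] applies on [xs[i], xs[i+1])."""
    def __init__(self, xs, values):
        self.xs, self.values = xs, values      # xs sorted; len(values) == len(xs)
    def __call__(self, t):
        i = bisect.bisect_right(self.xs, t) - 1
        return self.values[i] if i >= 0 else 0.0

def pointwise_sum(fs, xs_all):
    """Aggregate several interval measures into one function (point-wise SUM)."""
    xs = sorted(set(xs_all))
    return StepFunction(xs, [sum(f(x) for f in fs) for x in xs])

# Two meter readings, each constant over its validity interval.
f1 = StepFunction([0, 10], [5.0, 0.0])     # 5 kW on [0, 10), then 0
f2 = StepFunction([5, 15], [3.0, 0.0])     # 3 kW on [5, 15), then 0
total = pointwise_sum([f1, f2], [0, 5, 10, 15])
print(total(2), total(7), total(12))       # -> 5.0 8.0 3.0
```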
Automated Data Pre-processing via Meta-learning
Abstract
A data mining algorithm may perform differently on datasets with different characteristics; e.g., it might perform better on a dataset with continuous attributes rather than categorical attributes, or the other way around. As a matter of fact, a dataset usually needs to be pre-processed. Taking into account all the possible pre-processing operators, there exists a staggeringly large number of alternatives, and inexperienced users become overwhelmed. We show that this problem can be addressed by an automated approach, leveraging ideas from meta-learning. Specifically, we consider a wide range of data pre-processing techniques and a set of data mining algorithms. For each data mining algorithm and selected dataset, we are able to predict the transformations that improve the result of the algorithm on the respective dataset. Our approach helps non-expert users to more effectively identify the transformations appropriate to their applications, and hence to achieve improved results.
Besim Bilalli, Alberto Abelló, Tomàs Aluja-Banet, Robert Wrembel
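The meta-learning step reduces to ordinary supervised learning over meta-features of datasets: the label is whether a transformation improved the mining result. A minimal sketch with entirely synthetic meta-data; the meta-features, the fabricated ground truth, and the choice of random forest are all illustrative assumptions, not the paper's setup:

```python
# Meta-learning sketch: learn, from dataset meta-features, whether a given
# pre-processing transformation (here: discretization) improves a mining
# algorithm. The meta-dataset below is synthetic, purely for illustration.
import random
from sklearn.ensemble import RandomForestClassifier

random.seed(0)
meta_X, meta_y = [], []
for _ in range(200):
    n_rows = random.randint(100, 100_000)
    frac_numeric = random.random()            # share of continuous attributes
    n_classes = random.randint(2, 10)
    # Fabricated ground truth: discretization tends to help when most
    # attributes are continuous (a stand-in for measured improvements).
    helped = frac_numeric > 0.6
    meta_X.append([n_rows, frac_numeric, n_classes])
    meta_y.append(helped)

model = RandomForestClassifier(n_estimators=100).fit(meta_X, meta_y)
# Predict for a new dataset: 5k rows, 80% numeric attributes, 3 classes.
print(model.predict([[5000, 0.8, 3]]))        # -> [ True]
```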
Individual Relocation: A Fuzzy Classification Based Approach
Abstract
Like crisp ontologies, the success of fuzzy ontologies depends on the availability of effective software allowing their exploitation. Thus, several reasoners for very expressive fuzzy description logics have been implemented. However, in some cases, applications do not require all the reasoner's tasks and would benefit from the efficiency of just certain services. To this end, we focus on the individual classification task to realize fuzzy ontologies. After their classification, individuals may evolve and change their description. To deal with this evolution, we propose, in this paper, a sufficiently clear and complete process for relocating individuals in fuzzy ontologies. This evolution may be the result of an enrichment, an impoverishment, and/or a modification of the individual's description. The proposed fuzzy relocation process is based on a fuzzy classification algorithm that supports \({\mathcal{Z}}\,{\mathcal{SHOIN}}(\mathcal{D})\) and allows (i) fuzzy domains, (ii) modified concepts, and (iii) weighted concepts.
Djellal Asma, Boufaida Zizette
Incremental Approach for Detecting Arbitrary and Embedded Cluster Structures
Abstract
In this paper, we present a new incremental clustering approach (InDEC) capable of detecting arbitrary cluster structures, where clusters may contain embedded structures. Available methods do not address this important issue in the context of continuously growing databases. A density-variation concept is used to detect embedded clusters that may occur after successive updates of the database. Unlike popular methods, which use a distance measure, we use a new affinity score to decide the proximity of a new object to the clusters. We use both synthetic and real datasets to evaluate the performance of the proposed method. Experimental results reveal that the proposed method is effective in detecting arbitrary and embedded clusters in dynamic scenarios.
Keshab Nath, Swarup Roy, Sukumar Nandi
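The incremental step — deciding, for each arriving object, whether it joins an existing cluster or seeds a new one based on an affinity score — can be sketched as follows. The affinity used here (inverse mean distance to the cluster's nearest members) is an assumption for illustration, not the paper's actual score:

```python
import math

def affinity(point, cluster, k=3):
    """Illustrative affinity: inverse of the mean distance to the k nearest
    cluster members (higher = closer). Not the paper's actual score."""
    dists = sorted(math.dist(point, m) for m in cluster)
    return 1.0 / (sum(dists[:k]) / min(k, len(dists)) + 1e-9)

def insert(point, clusters, threshold=0.5):
    """Assign the new object to the highest-affinity cluster, or start a new one."""
    if clusters:
        best = max(clusters, key=lambda c: affinity(point, c))
        if affinity(point, best) >= threshold:
            best.append(point)
            return
    clusters.append([point])

clusters = []
for p in [(0, 0), (0.5, 0.2), (10, 10), (0.3, 0.4), (10.2, 9.8)]:
    insert(p, clusters)
print([len(c) for c in clusters])    # -> [3, 2]: two clusters grown incrementally
```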
Annotation of Engineering Models by References to Domain Ontologies
Abstract
Complex engineering systems execute within different contexts and domains. The heterogeneity induced by these contexts is usually handled implicitly in the development cycle of such systems. We claim that this heterogeneity can be reduced by explicitly handling the knowledge mined from these domains and contexts. Verification and validation activities are improved thanks to the expression and verification of new constraints and properties directly extracted from the context and domains associated with the models. In this paper, we advocate the use of domain ontologies to express both domain and context knowledge. We propose to enrich design models that describe complex information systems with domain knowledge, expressed by ontologies, provided by their context of use. This enrichment is achieved by annotating the design models with references to ontologies. Three annotation mechanisms are proposed. The resulting annotated models are checked to validate the newly mined domain properties. We have experimented with this approach in a model-driven engineering (MDE) development setting.
Kahina Hacid, Yamine Ait-Ameur
Unifying Warehoused Data with Linked Open Data: A Conceptual Modeling Solution
Abstract
Linked Open Data (LOD) has become one of the most important sources of information, allowing business analyses based on warehoused data to be enhanced with external data. However, Data Warehouses (DWs) do not directly cooperate with LOD datasets due to the differences between their data models. In this paper, we describe a conceptual multidimensional model, named Unified Cube, which is generic enough to include both warehoused data and LOD. Unified Cubes provide a comprehensive representation of useful data and, more importantly, support well-informed decisions by including multiple data sources in one analysis. To demonstrate the feasibility of our proposal, we present an implementation framework for building Unified Cubes from DWs and LOD datasets.
Franck Ravat, Jiefu Song
Correct-by-Construction Evolution of Realisable Conversation Protocols
Abstract
Distributed software systems are often built by composing independent and autonomous peers with cross-organisational interaction and no centralised control. These peers can be administrated and executed by geographically distributed and autonomous companies. In a top-down design of distributed software systems, the peers' interaction is often described by a global specification called a Conversation Protocol (CP), and one has to check its realisability, i.e., whether there exists a set of peers implementing this CP. In dynamic environments, a CP needs to be updated with respect to new environment changes and end-user interaction requirements. This paper tackles CP evolution such that realisability is preserved. We define some evolution patterns and prove that they ensure realisability. We also show how our proposal can be supported by existing methods and tools based on refinement and theorem proving, using the Event-B language and the RODIN development tools.
Sarah Benyagoub, Meriem Ouederni, Neeraj Kumar Singh, Yamine Ait-Ameur
White-Box Modernization of Legacy Applications
Abstract
Software modernization consists of transforming legacy applications into modern technologies, mainly to minimize maintenance costs. This transformation often produces a new application that is a poor copy of the legacy one, due, for example, to the degradation of quality attributes. This paper presents a white-box transformation approach that changes the application architecture and the technological stack without losing business value and quality attributes. This approach obtains a technology-agnostic model from the original sources; such a model facilitates the architecture configuration before performing the actual transformation of the application into the new technology. The architecture for the new application can be configured considering aspects such as data access, quality attributes, and process. We evaluate our approach through an industrial case study, the gist of which is the transformation of Oracle Forms applications, where the presentation layer is highly coupled to the data access layer, to Java technologies.
Kelly Garcés, Rubby Casallas, Camilo Álvarez, Edgar Sandoval, Alejandro Salamanca, Fabián Melo, Juan Manuel Soto
Exploring Quality-Aware Architectural Transformations at Run-Time: The ENIA Case
Abstract
Adapting software systems at run-time is a key issue, especially when these systems consist of components used as intermediaries for human-computer interaction. In this sense, model transformation techniques have widespread acceptance as a mechanism for adapting and evolving the software architecture of such systems. However, existing model transformations often focus on functional requirements, and quality attributes are only considered manually after the transformations are done. This paper aims to improve the quality of adaptations and evolutions in component-based software systems by taking quality attributes into account within the model transformation process. To this end, we present a quality-aware transformation process that uses software architecture metrics to select among many alternative model transformations. Such metrics evaluate the quality attributes of an architecture. We validate the presented quality-aware transformation process in ENIA, a geographic information system whose user interfaces are based on coarse-grained components and need to be adapted at run-time.
Javier Criado, Silverio Martínez-Fernández, David Ameller, Luis Iribarne, Nicolás Padilla
A Credibility and Classification-Based Approach for Opinion Analysis in Social Networks
Abstract
There is an ongoing interest in examining users' experiences made available through social media. Unfortunately, these experiences, like reviews of products and/or services, are sometimes conflicting and thus do not help develop a concise opinion on these products and/or services. This paper presents a multi-stage approach that extracts and consolidates reviews after addressing specific issues such as user multi-identity and limited user credibility. A system, along with a set of experiments, demonstrates the feasibility of the approach.
Lobna Azaza, Fatima Zohra Ennaji, Zakaria Maamar, Abdelaziz El Fazziki, Marinette Savonnet, Mohamed Sadgal, Eric Leclercq, Idir Amine Amarouche, Djamal Benslimane
Engineering Applications Over Social and Open Data with Domain-Specific Languages
Abstract
There is a current trend among governments and organizations to make all sorts of information (like budgets, demographic or economic data) public. The information released in this way is called Open Data. Many institutions promote the creation of innovative applications using the data they have released, e.g., in combination with social networks, but only highly skilled engineers can accomplish this task.
Our goal is to facilitate the construction of applications that use open data and social networks as a communication platform. For this purpose, we propose a family of domain-specific languages aimed at automating the different tasks involved, like describing the structure and semantics of the heterogeneous data sets, the patterns to be sought in social network messages, the information to be extracted from static and dynamic data, and the messages (over social networks) that the system needs to produce. We have built an extensible working prototype, which allows adding new open data formats and support for different social networks.
Ángel Mora Segura, Juan de Lara
Towards Culture-Sensitive Extensions of CRISs: Gender-Based Researcher Evaluation
Abstract
Current research information systems (CRISs) offer great opportunities for extraction of useful and actionable knowledge based on various data analysis techniques. However, many of these opportunities have not been explored in depth, especially in culture-sensitive areas such as gender-based evaluation of researchers. In this paper, we present GERBER, a methodology and accompanying tool for performing gender-based analysis of CRIS data. The tool enables the extraction of co-authorship networks, computation of various author metrics, and statistical comparison of male and female researchers. Functionality of GERBER is demonstrated on data extracted from the CRIS of the University of Novi Sad (UNS). We also present a plan to integrate GERBER into CRIS UNS in order to facilitate continuous gender-based researcher evaluation. Experiences obtained during such integration will enable us to propose more general methodological guidelines and APIs for culture-sensitive extensions of CRIS systems and standards.
Miloš Savić, Mirjana Ivanović, Miloš Radovanović, Bojana Dimić Surla
Word Similarity Based on Domain Graph
Abstract
In this work, we propose a new formalization for word similarity. Assuming that each word corresponds to a unit of semantics, called a synset, with categorical features, called a domain, we construct the domain graph of a synset, consisting of all the hypernyms that belong to the domain of the synset. We take advantage of domain graphs to reflect the semantic aspect of words. In experiments, we show that the domain graph approach works well for word similarity. We then extend it to sentence similarity (or Semantic Textual Similarity) independently of Bag-of-Words.
Fumito Konaka, Takao Miura
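A rough approximation of the hypernym-graph idea, using WordNet through NLTK; the paper's notion of domain and its graph construction are more specific, so the Jaccard overlap of full hypernym closures below is only an illustrative assumption:

```python
# Requires: pip install nltk; then nltk.download("wordnet") once.
from nltk.corpus import wordnet as wn

def hypernym_graph(synset):
    """All hypernyms reachable upward from the synset (its hypernym 'graph')."""
    return set(synset.closure(lambda s: s.hypernyms()))

def word_similarity(w1, w2):
    """Best Jaccard overlap of hypernym graphs over all synset pairs."""
    best = 0.0
    for s1 in wn.synsets(w1):
        for s2 in wn.synsets(w2):
            g1 = hypernym_graph(s1) | {s1}
            g2 = hypernym_graph(s2) | {s2}
            best = max(best, len(g1 & g2) / len(g1 | g2))
    return best

print(word_similarity("car", "truck"))   # high overlap: shared vehicle hypernyms
print(word_similarity("car", "banana"))  # low overlap
```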
Backmatter
Metadata
Title
Model and Data Engineering
Edited by
Ladjel Bellatreche
Óscar Pastor
Jesús M. Almendros Jiménez
Yamine Aït-Ameur
Copyright Year
2016
Electronic ISBN
978-3-319-45547-1
Print ISBN
978-3-319-45546-4
DOI
https://doi.org/10.1007/978-3-319-45547-1
