
About this Book

This book constitutes the refereed proceedings of the 34th International Conference on Conceptual Modeling, ER 2015, held in Stockholm, Sweden, in October 2015. The 26 full and 19 short papers presented were carefully reviewed and selected from 131 submissions. The papers are organized in topical sections on business process and goal models, ontology-based models and ontology patterns, constraints, normalization, interoperability and integration, collaborative modeling, variability and uncertainty modeling, modeling and visualization of user generated content, schema discovery and evolution, process and text mining, domain-based modeling, data models and semantics, and applications of conceptual modeling.





Why Philosophize; Why not Just Model?

Conceptual modelling relies on the availability of good-quality modelling languages (MLs). However, many of these MLs have not been created in any well-organized and consistent manner, leading to identified flaws and ambiguities. These result, in part, from the lack of an ontological commitment, the neglect of language use and speech act theory, and possibly incoherent philosophical underpinnings. These various disciplines are examined in the context of their potential integration into the creation of the next generation of modelling languages.

Brian Henderson-Sellers

Methodologies for Semi-automated Conceptual Data Modeling from Requirements

Conceptual modeling is the foundation of system analysis and design methodologies. It is challenging because it requires a clear understanding of an application domain and the ability to translate the requirement specification into a conceptual data model. Semi-automated conceptual data modeling is a process of using an intelligent tool to aid the modeler for the purpose of building a quality conceptual data model. In this paper, we first present six categories of methodologies that can be used for developing conceptual data models. We then describe the characteristics of each category, compare these characteristics, and present related work of each category. We finally suggest a framework for semi-automatically generating conceptual data models from requirements and suggest challenging research topics.

Il-Yeol Song, Yongjun Zhu, Hyithaek Ceong, Ornsiri Thonggoom

Business Process and Goal Models


Aligning Business Goals and Risks in OSS Adoption

Increasing adoption of Open Source Software (OSS) requires a change in organizational culture and a reshaping of IT decision-makers' mindsets. Adopting OSS components introduces risks that can affect the adopter organization's business goals, and these risks therefore need to be considered. Assessing them requires an understanding of the socio-technical structures that interrelate the stakeholders in the OSS ecosystem, and of how these structures may propagate the potential risks to them. In this paper, we study the connection between OSS adoption risks and OSS adopter organizations' business goals. We propose a model-based approach and analysis framework that combines two existing frameworks: the i* framework to model and reason about business goals, and the RiskML notation to represent and analyse OSS adoption risks. We illustrate our approach with data drawn from an industrial partner organization in a joint EU project.

Dolors Costal, Lidia López, Mirko Morandini, Alberto Siena, Maria Carmela Annosi, Daniel Gross, Lucía Méndez, Xavier Franch, Angelo Susi

Pragmatic Requirements for Adaptive Systems: A Goal-Driven Modeling and Analysis Approach

Goal models (GM) have been used in adaptive systems engineering for their ability to capture the different ways of fulfilling the requirements. Contextual GMs (CGMs) extend these models with the notion of context and context-dependent applicability of goals. In this paper, we observe that the interpretation of a goal's achievement is itself context-dependent. Thus, we introduce the notion of Pragmatic Goals, which have a dynamic satisfaction criterion. However, the specification of context-dependent goal applicability, as well as of goal interpretations, makes it hard for stakeholders to decide whether the model is achievable for all possible context combinations. Thus we also developed and evaluated an algorithm to decide on a Pragmatic CGM's achievability. We performed several experiments to evaluate our algorithm regarding correctness and performance and concluded that it can be used for deciding at runtime which tasks to execute under a given context to achieve a quality constraint, as well as for pinpointing context sets in which the model is intrinsically unachievable.

Felipe Pontes Guimarães, Genaína Nunes Rodrigues, Daniel Macedo Batista, Raian Ali

Stress Testing Strategic Goals with SWOT Analysis

Business strategies are intended to guide a company across the minefields of competitive markets through the fulfilment of strategic objectives. The design of a business strategy generally considers a SWOT operating context consisting of inherent Strengths (S) and Weaknesses (W) of a company, as well as external Opportunities (O) and potential Threats (T) that the company may be facing. Given an ever-changing and uncertain environment, it is important to continuously maintain an updated view of the operating context in order to determine whether the current strategy is adequate. However, traditional SWOT analysis only provides support for the initial design of business strategy, as opposed to on-going analysis as new, unexpected factors appear and disappear. This paper proposes a systematic analysis for business strategy founded on models of strategic goals and stress-test scenarios. Our proposal improves decision making by (i) supporting continuous scenario analysis based on current and future context and (ii) identifying and comparing strategic alternatives and courses of action that would lead to better results.

Alejandro Maté, Juan Trujillo, John Mylopoulos

A Method to Align Goals and Business Processes

Business Process Modeling (BPM) has been in the spotlight of research and practice for a number of years, aiming at providing organizations with conceptual modeling-based representations of the flow of activities that generate their main products and services. It is essential that such a flow of activities is engineered in a way that satisfies the organization's goals. However, work on BPM still makes little use of goal modeling, and the relation between goals and processes is often neglected. In this paper, we propose a method that supports the analyst in identifying which activities in a business process satisfy the organization's goals. Moreover, our method allows reasoning about the impact of each of these activities on the satisfaction of the strategic (i.e. top) goals of the organization. The results of this analysis may lead to reengineering, and provide the analyst with the means to design higher-quality BPMs. Besides describing the method, this paper presents a preliminary evaluation of the method by means of an empirical study conducted in a controlled environment.

Renata Guizzardi, Ariane Nunes Reis

Detecting the Effects of Changes on the Compliance of Cross-Organizational Business Processes

An emerging challenge for collaborating business partners is to properly define and evolve their cross-organizational processes with respect to imposed global compliance rules. Since compliance verification is known to be very costly, reducing the number of compliance rules to be rechecked in the context of process changes is crucial. In contrast to intra-organizational processes, however, change effects cannot be easily assessed in such distributed scenarios, where partners only provide restricted public views and assertions on their private processes. Even if local process changes are invisible to partners, they might affect the compliance of the cross-organizational process with the mentioned rules. This paper provides an approach for ensuring compliance when evolving a cross-organizational process. For this purpose, we construct qualified dependency graphs expressing relationships between process activities, process assertions, and compliance rules. Based on such graphs, we are able to determine the subset of compliance rules that might be affected by a particular change. Altogether, our approach increases the efficiency of compliance checking in cross-organizational settings.

David Knuplesch, Walid Fdhila, Manfred Reichert, Stefanie Rinderle-Ma
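The core idea of a dependency graph that maps changes to the rules needing a recheck can be sketched in a few lines. This is a toy illustration of the general principle, not the authors' qualified-graph technique; all rule and activity names are invented:

```python
# Toy sketch: an inverse index from process activities to the compliance
# rules that mention them; after a change, only the rules reachable from
# the changed activities need rechecking.
from collections import defaultdict

def build_dependency_graph(rules):
    """rules: dict rule name -> set of activities the rule refers to.
    Returns the inverse index: activity -> set of rule names."""
    graph = defaultdict(set)
    for rule, activities in rules.items():
        for activity in activities:
            graph[activity].add(rule)
    return graph

def affected_rules(graph, changed_activities):
    """Union of all rules depending on at least one changed activity."""
    result = set()
    for activity in changed_activities:
        result |= graph.get(activity, set())
    return result

rules = {
    "R1: approve before ship": {"approve_order", "ship_goods"},
    "R2: log every payment":   {"receive_payment"},
    "R3: audit large orders":  {"approve_order", "audit"},
}
graph = build_dependency_graph(rules)
print(sorted(affected_rules(graph, {"approve_order"})))
# R2 is untouched by a change to approve_order, so it need not be rechecked.
```

The paper's graphs are richer (they also qualify edges via public views and assertions), but the payoff is the same: the recheck set shrinks from all rules to the affected subset.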

Enhancing Aspect-Oriented Business Process Modeling with Declarative Rules

When managing a set of inter-related business processes, typically a number of concerns can be distinguished that are applicable to more than one single process, such as security and traceability. The proper enforcement of these concerns may require a specific configuration effort for each of the business processes involved. Aspect-Oriented Business Process Modelling is an approach that aims at encapsulating these concerns in a model-oriented way. However, state-of-the-art techniques lack efficient mechanisms that allow for the specification of concerns in such a way that they can be executed in parallel to other parts of the process. Moreover, existing techniques exclusively focus on the formulation of mandatory concerns. To address these limitations, this paper proposes a new approach to encapsulate both optional and mandatory concerns, which can be executed concurrently with other process functionalities. One core element of the new approach is that it extends current Aspect-Oriented Business Process Modelling approaches with declarative rules. Thus, this approach allows for a sophisticated management of cross-cutting concerns.

Amin Jalali, Fabrizio Maria Maggi, Hajo A. Reijers

Ontology-Based Modeling


Extending the Foundations of Ontology-Based Conceptual Modeling with a Multi-level Theory

Since the late 1980s, there has been a growing interest in the use of foundational ontologies to provide a sound theoretical basis for the discipline of conceptual modeling. This has led to the development of ontology-based conceptual modeling techniques whose modeling primitives reflect the conceptual categories defined in a foundational ontology. The ontology-based conceptual modeling language OntoUML, for example, incorporates the distinctions underlying the taxonomy of types in the Unified Foundational Ontology (UFO) (e.g., kinds, phases, roles, mixins, etc.). This approach has so far focused on the support of types whose instances are individuals in the subject domain, with no provision for higher-order types (or categories of categories). In this paper we address this limitation by extending the Unified Foundational Ontology with the MLT multi-level theory. The UFO-MLT combination serves as a foundation for conceptual models that can benefit from the ontological distinctions of UFO as well as MLT's basic concepts and patterns for multi-level modeling. We discuss the impact of the extended foundation on multi-level conceptual modeling.

Victorio A. Carvalho, João Paulo A. Almeida, Claudenir M. Fonseca, Giancarlo Guizzardi

Logical Design Patterns for Information System Development Problems

Design theories investigate prescriptive and descriptive elements of the activity of design. Central to the descriptive realm are abstract design rules and design goals that are part of the governance of design. On the prescriptive side, models are used on various levels of abstraction for representing different kinds of knowledge for systems engineering. Models for three layers of abstraction are proposed: a business layer, a logical layer, and an implementation layer. At the logical layer, the concept of a logical design pattern is introduced as a natural means for linking business models and technical models as well as design theories and information systems engineering. Ten logical design patterns, extracted from a series of information system development projects, are presented and applied in an example.

Wolfgang Maaß, Veda C. Storey

A Middle-Level Ontology for Context Modelling

Context modelling is one of the stages conducted during the context life cycle. It has the aim of giving meaning and structure to the collected raw context data. Although there are different context models proposed in the literature, we have identified some gaps that are not fully covered, particularly related to the reusability of the models themselves and the lack of consolidated and standardized ontological resources. To tackle this problem, we adopt a three-layered context ontology perspective and focus in this paper on the middle layer, which is defined following a prescriptive process and structured in a modular way to support reuse.

Oscar Cabrera, Xavier Franch, Jordi Marco

Ontology Patterns


An Ontology Design Pattern to Represent False-Results

Observations are an important aspect of our society. Arguably, a great part of them is captured by means of sensors. Despite the importance of the matter, the ontology of observations and sensors is not well developed, with few efforts dealing with the fundamental questions about their nature. As a result, an important aspect of sensors is overlooked: sensors may fail, producing false-results (i.e. false-positives and false-negatives). The lack of a proper representation of this aspect prevents us from communicating and reasoning about sensor failures, making it harder to assess the correctness of observations and to treat possible errors. In view of this problem, we propose an ontology design pattern (ODP) to represent false-results of sensors. It covers a special case of sensor that exclusively produces positive or negative results regarding the presence of the type of entity the sensor is designed to perceive. The paper introduces the ODP structure as well as its ontological commitments, bringing an example from the biomedical field. Discussion and further research opportunities are presented at the end of the paper.

Fabrício Henrique Rodrigues, José Antônio Tesser Poloni, Cecília Dias Flores, Liane Nanci Rotta

Ontology Engineering by Combining Ontology Patterns

Building proper reference ontologies is a hard task. There are a number of methods and tools that traditionally have been used to support this task. These include foundational theories, reuse of domain and core ontologies, development methods, and software tool support. In this context, an approach that has gained increased attention in recent years is the systematic application of ontology patterns. This paper discusses how Foundational and Domain-related Ontology Patterns can be derived, and how they can be applied in combination for building more consistent ontologies in a reuse-centered process.

Fabiano B. Ruy, Cássio C. Reginato, Victor A. Santos, Ricardo A. Falbo, Giancarlo Guizzardi

Towards a Service Ontology Pattern Language

In this paper we present part of an initial version of an Ontology Pattern Language, called S-OPL, describing the core conceptualization of services as a network of interconnected ontology modeling patterns. S-OPL builds on a commitment-based core ontology for services (UFO-S) and has been developed to support the engineering of ontologies involving services in different domains. S-OPL patterns address problems related to the distinction of general kinds of customers and providers, service offering, service negotiation and service delivery. In this paper, we focus on the first two. The use of S-OPL is demonstrated in a real case study in the domain of Information and Communication Technology services.

Glaice K. Quirino, Julio C. Nardi, Monalessa P. Barcellos, Ricardo A. Falbo, Giancarlo Guizzardi, Nicola Guarino, Mario Bochicchio, Antonella Longo, Marco Salvatore Zappatore, Barbara Livieri



Constraints


Incremental Checking of OCL Constraints with Aggregates Through SQL

Valid states of data are those satisfying a set of constraints. Therefore, efficiently checking whether some constraint has been violated after a data update is an important problem in data management. We tackle this problem by incrementally checking OCL constraint violations by means of SQL queries. Given an OCL constraint, we obtain a set of SQL queries that returns the data that violates the constraint. In this way, we can check the validity of the data by checking the emptiness of these queries. The queries that we obtain are incremental since they are only executed when some relevant data update may violate the constraint, and they only examine the data related to the update.

Xavier Oriol, Ernest Teniente
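The compilation scheme can be illustrated with a small self-contained example. This is a sketch of the general idea only, not the paper's translation algorithm; the schema and the invariant are invented. An OCL invariant with an aggregate, say `context Department inv: self.employee->size() <= self.maxEmp`, becomes an SQL query that returns exactly the violating departments, so the data is valid iff the query result is empty:

```python
# Sketch: an OCL aggregate invariant compiled into a violation-detection
# SQL query over a relational encoding (SQLite in-memory database).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, maxEmp INTEGER);
    CREATE TABLE employee (id INTEGER PRIMARY KEY, dept INTEGER);
    INSERT INTO department VALUES (1, 2), (2, 5);
    INSERT INTO employee VALUES (10, 1), (11, 1), (12, 1), (13, 2);
""")

# Departments whose employee count exceeds their maximum:
VIOLATION_QUERY = """
    SELECT d.id FROM department d
    JOIN employee e ON e.dept = d.id
    GROUP BY d.id, d.maxEmp
    HAVING COUNT(*) > d.maxEmp
"""
violations = [row[0] for row in conn.execute(VIOLATION_QUERY)]
print(violations)  # department 1 holds 3 employees but allows only 2
```

In the incremental setting described in the abstract, such a query would additionally be restricted to the departments touched by the update, and would be run only when the update is of a kind that can violate the constraint (here, inserting an employee or lowering `maxEmp`).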

Probabilistic Cardinality Constraints

Probabilistic databases address well the requirements of an increasing number of modern applications that produce large collections of uncertain data. We propose probabilistic cardinality constraints as a principled tool for controlling the occurrences of data patterns in probabilistic databases. Our constraints help organizations balance their targets for different data quality dimensions, and infer probabilities on the number of query answers. These applications are unlocked by developing algorithms to reason efficiently about probabilistic cardinality constraints, and to help analysts acquire the marginal probability by which cardinality constraints hold in a given application domain. For this purpose, we overcome technical challenges to compute Armstrong PC-sketches as succinct data samples that perfectly visualize any given perceptions about these marginal probabilities.

Tania Roblot, Sebastian Link

SQL Data Profiling of Foreign Keys

Referential integrity is one of the three inherent integrity rules and can be enforced in databases using foreign keys. However, in many real-world applications referential integrity is not enforced, since foreign keys remain disabled to ease data acquisition. Important applications such as anomaly detection, data integration, data modeling, indexing, reverse engineering, schema design, and query optimization all benefit from the discovery of foreign keys. Therefore, the profiling of foreign keys from dirty data is an important yet challenging task. We raise the challenge further by departing from previous research, in which null markers have been ignored. We propose algorithms for profiling unary and multi-column foreign keys in the real world, that is, under the different semantics for null markers of the SQL standard. While state-of-the-art algorithms perform well in the absence of null markers, it is shown that they perform poorly in their presence. Extensive experiments demonstrate that our algorithms perform as well in the real world as state-of-the-art algorithms perform in the idealized special case where null markers are ignored.

Mozhgan Memari, Sebastian Link, Gillian Dobbie
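The basic check behind foreign key profiling is an inclusion dependency test, and the treatment of null markers is where the semantics diverge. The following toy example (not the paper's algorithms; tables and names are invented) tests a unary candidate under SQL's simple match semantics, where a null marker satisfies the dependency vacuously:

```python
# Toy sketch: does orders.customer_id reference customers.id?
# Under simple match semantics, NULL values are skipped; only non-null
# values must appear in the referenced column.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER);
    CREATE TABLE orders (id INTEGER, customer_id INTEGER);
    INSERT INTO customers VALUES (1), (2);
    INSERT INTO orders VALUES (100, 1), (101, NULL), (102, 2);
""")

# Non-null values of orders.customer_id missing from customers.id:
dangling = conn.execute("""
    SELECT DISTINCT customer_id FROM orders
    WHERE customer_id IS NOT NULL
      AND customer_id NOT IN (SELECT id FROM customers)
""").fetchall()
print(dangling == [])  # empty result: the candidate holds
```

Profiling multi-column keys under partial or full match semantics, as the paper does, is harder precisely because a row with some (but not all) columns null is treated differently under each semantics.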



Normalization


From Web Tables to Concepts: A Semantic Normalization Approach

Relational Web tables, embedded in HTML or published on data platforms, have become an important resource for many applications, including question answering or entity augmentation. To utilize the data, we require some understanding of what the tables are about. Previous research on recovering Web table semantics has largely focused on simple tables, which only describe a single semantic concept. However, there is also a significant number of de-normalized multi-concept tables on the Web. Treating these as single-concept tables results in many incorrect relations being extracted. In this paper, we propose a normalization approach to decompose multi-concept tables into smaller single-concept tables. First, we identify columns that represent keys or identifiers of entities. Then, we utilize the table schema as well as intrinsic data correlations to identify concept boundaries and split the tables accordingly. Experimental results on real Web tables show that our approach is feasible and effectively identifies semantic concepts.

Katrin Braunschweig, Maik Thiele, Wolfgang Lehner

Toward RDF Normalization

Billions of RDF triples are currently available on the Web through the Linked Open Data cloud (e.g., DBpedia, LinkedGeoData and New York Times). Governments, universities as well as companies (e.g., BBC, CNN) are also producing huge collections of RDF triples and exchanging them through different serialization formats (e.g., RDF/XML, Turtle, N-Triple, etc.). However, RDF descriptions (i.e., graphs and serializations) are verbose in syntax, often contain redundancies, and could be generated differently even when describing the same resources, which would have a negative impact on their processing. Hence, we propose here an approach to clean and eliminate redundancies from such RDF descriptions as a means of transforming different descriptions of the same information into one representation, which can then be tuned, depending on the target application (information retrieval, compression, etc.). Experimental tests show significant improvements, namely in reducing RDF description loading time and file size.

Regina Ticona-Herrera, Joe Tekli, Richard Chbeir, Sébastien Laborie, Irvin Dongo, Renato Guzman
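One simple step of the kind of normalization described above is eliminating triple-level redundancy and ordering differences, so that distinct serializations of the same information collapse to one canonical form. A minimal sketch (triples are simplified to tuples of prefixed names; this is an illustration of the goal, not the paper's full approach):

```python
# Minimal sketch: two RDF descriptions of the same resource, one with a
# duplicate triple and a different ordering, normalize to the same
# canonical set of subject-predicate-object triples.

def normalize(triples):
    """Deduplicate and canonically order triples."""
    return sorted(set(triples))

doc_a = [
    ("dbpedia:Stockholm", "rdf:type", "dbo:City"),
    ("dbpedia:Stockholm", "dbo:country", "dbpedia:Sweden"),
    ("dbpedia:Stockholm", "rdf:type", "dbo:City"),  # redundant triple
]
doc_b = [
    ("dbpedia:Stockholm", "dbo:country", "dbpedia:Sweden"),
    ("dbpedia:Stockholm", "rdf:type", "dbo:City"),
]
print(normalize(doc_a) == normalize(doc_b))  # one representation for both
```

Real RDF normalization must additionally handle blank nodes, syntactic sugar across serialization formats, and semantic (not just syntactic) redundancy, which is what makes the problem non-trivial.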

Design Dimensions for Business Process Architecture

Enterprises employ an array of business processes (BPs) for their operational, supporting, and managerial activities. When BPs are designed to work together to achieve organizational objectives, we refer to them and the relationships among them as the business process architecture (BPA). While substantial efforts have been devoted to designing and analyzing individual BPs, there is little focus on BPAs. As organizations are undergoing changes at many levels, at different times, and at different rates, the BP architect needs to consider how to manage the relationships among multiple BPs. We propose a modeling framework for designing BPAs with a focus on representing and analyzing architectural choices along several design dimensions aimed at achieving various design objectives, such as flexibility, cost, and efficiency.

Alexei Lapouchnian, Eric Yu, Arnon Sturm

Interoperability and Integration


A Conceptual Framework for Large-scale Ecosystem Interoperability

One of the most significant challenges in information system design is the constant and increasing need to establish interoperability between heterogeneous software systems at increasing scale. The automated translation of data between the data models and languages used by information ecosystems built around official or de facto standards is best addressed using model-driven engineering techniques, but requires handling both data and multiple levels of metadata within a single model. Standard modelling approaches are generally not built for this, compromising modelling outcomes. We establish the SLICER conceptual framework built on multilevel modelling principles and the differentiation of basic semantic relations that dynamically structure the model and can capture existing multilevel notions. Moreover, it provides a natural propagation of constraints over multiple levels of instantiation.

Matt Selway, Markus Stumptner, Wolfgang Mayer, Andreas Jordan, Georg Grossmann, Michael Schrefl

Flexible Data Management across XML and Relational Models: A Semantic Approach

The relational model and the XML model each have their own advantages and disadvantages in data maintenance and sharing. In this paper, we consider a framework that can be used in many applications that need to maintain their potentially large data in a relational database and partially publish, exchange and/or utilize the data in different XML formats according to the applications' preferences. Existing relational-to-XML data transformation relies on pre-defined XML views or transformation rules. Thus the output XML data or views are rigid and invariable, and cannot fulfill the requirement of format flexibility in the situations we consider. In this paper, we propose a semantic approach that supports the framework under our consideration. We use conceptual models to design both relational data and XML views, and devise algorithms to transform data in one model to the other via conceptual model transformation. We also demonstrate how XPath queries issued to XML views can be translated and processed against a relational database through conceptual models.

Huayu Wu, Tok Wang Ling, Wee Siong Ng

EMF Views: A View Mechanism for Integrating Heterogeneous Models

Modeling complex systems involves dealing with several heterogeneous and interrelated models defined using a variety of languages (UML, ER, BPMN, DSLs, etc.). These models must be frequently combined in different cross-domain perspectives to provide stakeholders the view of the system they need to best perform their tasks. Several model composition approaches have already been proposed addressing this problem. Nevertheless, they present some important limitations concerning efficiency, interoperability and synchronization between the base models and the composed ones. As an alternative we introduce EMF Views, an approach coming with a dedicated language and tooling for defining views on potentially heterogeneous models. Similarly to views in databases, model views are not materialized but instead redirect all model access and manipulation requests to the base models from which they are obtained. This is realized in a transparent way for both the modeler and the other modeling tools using the concerned (meta)models.

Hugo Bruneliere, Jokin Garcia Perez, Manuel Wimmer, Jordi Cabot

Collaborative Modeling


Gitana: A SQL-Based Git Repository Inspector

Software development projects are notoriously complex and difficult to deal with. Several support tools such as issue tracking, code review and Source Control Management (SCM) systems have been introduced in the past decades to ease development activities. While such tools efficiently track the evolution of a given aspect of the project (e.g., bug reports), they provide just a partial view of the project and often lack advanced querying mechanisms, limiting themselves to command-line or simple GUI support. This is particularly true for projects that rely on Git, the most popular SCM system today.

In this paper, we propose a conceptual schema for Git and an approach that, given a Git repository, exports its data to a relational database in order to (1) promote data integration with other existing SCM tools and (2) enable writing queries on Git data using standard SQL syntax. To ensure efficiency, our approach comes with an incremental propagation mechanism that refreshes the database content with the latest modifications. We have implemented our approach in Gitana, an open-source tool available on GitHub.

Valerio Cosentino, Javier Luis Cánovas Izquierdo, Jordi Cabot
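The benefit of exporting repository data to a relational database is that project questions become plain SQL. The following miniature illustrates the idea only; it is not Gitana's actual schema or tooling, and it loads a hard-coded sample instead of reading a real Git repository:

```python
# Miniature sketch: commit metadata in a relational table, queried with
# standard SQL (here: "who are the most active committers?").
import sqlite3

commits = [  # (sha, author, day, files_changed) -- invented sample data
    ("a1f", "alice", "2015-03-01", 3),
    ("b2c", "bob",   "2015-03-02", 1),
    ("c3d", "alice", "2015-03-05", 2),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE commits "
             "(sha TEXT, author TEXT, day TEXT, files_changed INTEGER)")
conn.executemany("INSERT INTO commits VALUES (?,?,?,?)", commits)

top = conn.execute("""
    SELECT author, COUNT(*) AS n FROM commits
    GROUP BY author ORDER BY n DESC
""").fetchall()
print(top)
```

Once the data sits in tables like this, joins against issue trackers or code review databases come for free, which is the integration point the abstract highlights.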

Near Real-Time Collaborative Conceptual Modeling on the Web

Collaboration during the creation of conceptual models is an integral pillar of design processes in many disciplines. Synchronous collaboration, in particular, has received little attention in the conceptual modeling literature so far. There are many modeling and meta-modeling tools available; however, most of these do not support synchronous collaboration, are offered under restrictive licenses, or build on proprietary libraries and technologies. To close this gap, this paper introduces the lightweight meta-modeling framework SyncMeta, which supports near real-time collaborative modeling, meta-modeling and generation of model editors in the Web browser. It employs well-proven Operational Transformation algorithms in a peer-to-peer architecture to resolve conflicts occurring during concurrent user edits. SyncMeta was successfully used to create meta-models of various conceptual modeling languages. An end-user evaluation showed that the editing tools of SyncMeta are considered usable and useful by collaborative modelers.

Michael Derntl, Petru Nicolaescu, Stephan Erdtmann, Ralf Klamma, Matthias Jarke

Dynamic Capabilities for Sustainable Enterprise IT – A Modeling Framework

A key consideration of researchers and practitioners alike in the field of information systems engineering is the co-development of information systems and business structures and processes such that they are in alignment, that this alignment reflects the challenges presented by the business ecologies, and that the developed systems are sustainable through appropriate responses to pressures for their evolution. These challenges inevitably need to be addressed through development schemes that recognize the intertwining of information systems, business strategy and their ecosystems. The paper presents the conceptual modeling foundations of such a scheme, providing a detailed exposition of the issues and solutions for sustainable systems in which the notion of capability plays an integrative role, using examples from an industrial-size application. The contribution of the paper is its proposition of conceptual modeling techniques that are applicable to both business strategies and information systems development.

Mohammad Hossein Danesh, Pericles Loucopoulos, Eric Yu

Variability and Uncertainty Modeling


A Working Model for Uncertain Data with Lineage

Lineage is important in uncertain data management since it can be used to find out which part of the data contributes to a result and to compute the probability of the result. Nonetheless, existing works consider an uncertain tuple as a set of tuples that can be stored in a relational table. Lineage can derive each tuple in the table, with which one can only find out the tuples, rather than the specific attributes, that contribute to the result. If uncertain tuples have multiple uncertain attributes, then for a result tuple with low probability users cannot know which attribute is its main cause. In this paper, we propose an approach to model uncertain data. Compared with the alternative based on the relational model, our model achieves a low maintenance cost and avoids a large amount of redundant storage and join operations. Based on our model, operations are defined for querying data, generating lineage and computing the probability of results. We then discuss how to correctly compute probability with lineage, and an algorithm is proposed to transform lineage for correct probability computation.

Liang Wang, Liwei Wang, Zhiyong Peng
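The role lineage plays in probability computation can be made concrete with a brute-force example. This is an illustrative sketch of the standard possible-worlds view, not the paper's model: each base tuple is assumed independent with a known marginal probability, a result's lineage is a disjunction of conjunctions of tuple ids, and the result probability is summed over worlds:

```python
# Sketch: probability of a derived result from its lineage, by
# enumerating possible worlds over independent base tuples.
from itertools import product

def probability(lineage, probs):
    """lineage: list of conjunctions (each a set of tuple ids);
    probs: tuple id -> marginal probability (tuples independent)."""
    ids = sorted(probs)
    total = 0.0
    for world in product([False, True], repeat=len(ids)):
        present = {t for t, on in zip(ids, world) if on}
        p = 1.0
        for t, on in zip(ids, world):
            p *= probs[t] if on else 1 - probs[t]
        if any(conj <= present for conj in lineage):  # some derivation fires
            total += p
    return total

probs = {"t1": 0.5, "t2": 0.4}
# Result derivable from t1 alone, or from t1 AND t2 together:
print(probability([{"t1"}, {"t1", "t2"}], probs))  # 0.5 -- t1 alone suffices
```

Enumeration is exponential in the number of base tuples; the point of lineage transformations such as the one the paper proposes is to make the computation both correct and tractable.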

Capturing Variability in Adaptation Spaces: A Three-Peaks Approach

Variability is essential for adaptive software systems, because it captures the space of alternative adaptations a system is capable of when it needs to adapt. In this work, we propose to capture variability for an adaptation space in terms of a three-dimensional model. The first dimension captures requirements through goals and reflects all possible ways of achieving these goals. The second dimension captures supported variations of a system's architectural structure, modeled in terms of connectors and components. The third dimension describes supported system behaviors, by modeling possible sequences for goal fulfillment and task execution. Of course, the three dimensions of a variability model are intertwined, as choices made with respect to one dimension have an impact on the other two. Therefore, we propose an incremental design methodology for variability models that keeps the three dimensions aligned and consistent. We illustrate our proposal with a case study involving the meeting scheduling system exemplar.

Konstantinos Angelopoulos, Vítor E. Silva Souza, John Mylopoulos

Taming Software Variability: Ontological Foundations of Variability Mechanisms

Variability mechanisms are techniques applied to adapt software product line (SPL) artifacts to the context of particular products, promoting systematic reuse of those artifacts. Despite the large variety of mechanisms reported in the literature, catalogs of variability mechanisms are built ad hoc and lack systematization. In this paper we propose an ontologically-grounded theoretical framework for mathematically characterizing well-known variability mechanisms based on an analysis of software behavior, distinguishing between different kinds of variability according to which aspects of products' behaviors differ.
Iris Reinhartz-Berger, Anna Zamansky, Yair Wand

Modeling and Visualization of User Generated Content


Personalized Knowledge Visualization in Twitter

In recent years, Twitter has played an essential role in broadcasting real-time news and useful information. Many commercial media companies use Twitter as an effective online tool to broadcast breaking news. Conventionally, tweets are organized in chronological order. In this paper, we investigate the tweet exhibition problem from a new perspective: we treat Twitter as an important source of knowledge and exploit its semantic attributes. In particular, we focus on extracting and distilling knowledge from tweets and propose a new organization and visualization tool that exhibits tweets in a semantic manner. We conducted our experiments on Amazon Mechanical Turk, and the user feedback from the crowd demonstrated the effectiveness of our new Twitter visualization tool.

Chen Liu, Dongxiang Zhang, Yueguo Chen

A Multi-dimensional Approach to Crowd-Consensus Modeling and Evaluation

In this paper, we propose a multi-dimensional approach to support modeling and consensus management in collective crowdsourcing applications. We define the notion of crowd-consensus and, for each dimension of analysis, we set pre-defined dimensional levels capturing the different variabilities that characterize crowd-consensus along that dimension in different applications. Designing a crowdsourcing task for a given target problem then requires characterizing the task with respect to each dimension, following a pattern-based approach.

Silvana Castano, Alfio Ferrara, Stefano Montanelli

Principles for Modeling User-Generated Content

The increasing reliance of organizations on externally produced information, such as online user-generated content (UGC), challenges the common assumption of representation by abstraction in conceptual modeling research and practice. This paper evaluates this assumption in the context of online citizen science, which relies on UGC to collect data from ordinary people in support of scientific research. Using a theoretical approach grounded in philosophy and psychology, we propose alternative principles for modeling UGC.

Roman Lukyanenko, Jeffrey Parsons

Ranking Friendly Result Composition for XML Keyword Search

This paper addresses an open problem in XML keyword search: given relevant matches to keywords, how to compose query results properly so that they can be effectively ranked and easily understood by users. The approaches adopted in the literature are oblivious to user search intention, making ranking schemes ineffective on their results. Intuitively, each query has a search target, and each result should contain exactly one instance of the search target along with evidence of its relevance to the query. In this paper, we design algorithms that compose atomic and intact query results driven by users' search targets. To infer search targets, we analyze return specifications in the query, the modifying relationships among keyword matches, and the entities involved in the search. Experimental evaluations validate the effectiveness and efficiency of our approach.

Ziyang Liu, Yichuang Cai, Yi Shan, Yi Chen

Schema Discovery and Evolution


How is Life for a Table in an Evolving Relational Schema? Birth, Death and Everything in Between

In this paper, we study the version history of eight databases that are part of larger open-source projects, and report our observations on how evolution-related properties, such as the possibility of deletion or the amount of updates a table undergoes, relate to observable table properties such as the number of attributes or the time of birth of a table. Our findings indicate that (i) most tables live quiet lives; (ii) the few top-changers adhere to a profile of long duration, early birth, and medium schema size at birth; (iii) tables with large schemata or long duration are quite unlikely to be removed; and (iv) early periods of the database's life show a higher level of evolutionary activity than later ones.

Panos Vassiliadis, Apostolos V. Zarras, Ioannis Skoulis

Inferring Versioned Schemas from NoSQL Databases and Its Applications

While the concept of a database schema plays a central role in relational database systems, most NoSQL systems are schemaless: such databases can be created without formally defining a schema; instead, the schema is implicit in the stored data. This lack of a schema definition offers greater flexibility; more specifically, schemaless databases ease both the recording of non-uniform data and data evolution. However, this comes at the cost of losing some of the benefits that schemas provide. In this article, an MDE-based reverse engineering approach for inferring the schema of aggregate-oriented NoSQL databases is presented. We show how the obtained schemas can be used to build database utilities that tackle some of the problems encountered with implicit schemas: a schema diagram viewer and a data validator generator are presented.

Diego Sevilla Ruiz, Severino Feliciano Morales, Jesús García Molina
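The core idea of inferring implicit, versioned schemas from stored documents can be sketched in a few lines. This is an illustrative assumption about the general technique, not the paper's MDE-based approach: documents are grouped by their (field name, field type) signature, and each distinct signature is taken as one implicit schema version.

```python
# Hypothetical sample documents from an aggregate-oriented (document) store.
docs = [
    {"name": "Ada", "age": 36},
    {"name": "Alan", "age": 41},
    {"name": "Grace", "age": 45, "rank": "admiral"},  # an evolved entity version
]

def signature(doc):
    """Structural signature: sorted (field name, field type) pairs."""
    return tuple(sorted((k, type(v).__name__) for k, v in doc.items()))

def infer_versions(docs):
    """Group documents by signature; each group is one implicit schema version."""
    versions = {}
    for d in docs:
        versions.setdefault(signature(d), []).append(d)
    return versions

versions = infer_versions(docs)
print(len(versions))  # → 2 distinct implicit schema versions
```

A real inference process would additionally recurse into nested aggregates and references, which is where the paper's schema diagrams and generated validators come in.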

Schema Discovery in RDF Data Sources

The Web has become a huge information space consisting of interlinked datasets, enabling the design of new applications. The meaningful usage of these datasets is a challenge, as it requires some knowledge about their content such as their types and properties. In this paper, we present an automatic approach for schema discovery in RDF(S)/OWL datasets.

We consider a schema as a set of type and link definitions. Our contribution is twofold: (i) generating the types describing a dataset, along with a description of each, called a type profile; (ii) generating the semantic links between types, as well as the hierarchical links, through the analysis of type profiles. Our approach relies on a density-based clustering algorithm and does not require any schema-related information in the dataset. We have implemented the proposed algorithms and present evaluation results showing the effectiveness of our approach.

Kenza Kellou-Menouer, Zoubida Kedad
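Type discovery by clustering resources with similar property sets can be sketched as follows. This is a crude, hypothetical stand-in for the paper's density-based clustering: resources (and the property sets describing them) are invented, and a simple single-linkage grouping by Jaccard similarity replaces the actual algorithm.

```python
# Hypothetical RDF resources, each described by the set of its property names.
resources = {
    "r1": {"name", "birthDate", "nationality"},
    "r2": {"name", "birthDate"},
    "r3": {"title", "publisher", "isbn"},
    "r4": {"title", "publisher"},
}

def jaccard(a, b):
    """Jaccard similarity between two property sets."""
    return len(a & b) / len(a | b)

def cluster(resources, threshold=0.5):
    """Single-linkage grouping: a resource joins the first cluster in which
    it is similar enough to some member; otherwise it starts a new cluster."""
    clusters = []
    for rid, props in resources.items():
        for c in clusters:
            if any(jaccard(props, resources[m]) >= threshold for m in c):
                c.append(rid)
                break
        else:
            clusters.append([rid])
    return clusters

print(len(cluster(resources)))  # → 2 inferred types (e.g. Person-like, Book-like)
```

Each resulting cluster corresponds to one discovered type; aggregating property frequencies over a cluster yields something like the paper's type profiles, from which inter-type links could then be derived.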

Process and Text Mining


Learning Relationships Between the Business Layer and the Application Layer in ArchiMate Models

Enterprise architecture provides a visualisation tool for stakeholders to manage and improve the current organization strategy to achieve its objectives. However, building an enterprise architecture is a time-consuming and often highly complex task. It involves data collection and analysis at several levels of granularity, from the physical nodes to the business execution. Existing solutions do not provide techniques to learn the relationships between these levels of granularity. In this paper, we propose a method to correlate the business and application layers in the ArchiMate notation.

Ayu Saraswati, Chee-Fon Chang, Aditya Ghose, Hoa Khanh Dam

Mining Process Task Post-Conditions

A large and growing body of work explores the use of semantic annotation of business process designs, but these annotations can be difficult and expensive to acquire. This paper presents a data-driven approach to mining these annotations (and specifically post-conditions) from event logs in process execution histories, which describe both task execution events (typically contained in […]) and state update events (which we record in […]). We present an empirical evaluation, which suggests that the approach provides generally reliable results.

Metta Santiputri, Aditya K. Ghose, Hoa Khanh Dam, Xiong Wen

Conceptual Modeling for Financial Investment with Text Mining

Although text mining, sentiment analysis, and other forms of analysis have been applied to financial investment, a significant amount of the associated research involves ad hoc searching for meaningful patterns. Other research in finance develops theory using limited data sets. These efforts sit at two extremes. To bridge the gap between financial data analytics and finance domain theory, this research analyzes a specific conceptual model, the Business Intelligence Model (BIM), to identify constructs and concepts that could be beneficial for matching data analytics to domain theory. Doing so provides a first step towards understanding how to effectively generate and validate domain theories that benefit significantly from data analytics.

Yang Gu, Veda C. Storey, Carson C. Woo

Applications and Domain-based Modeling


Breaking the Recursivity: Towards a Model to Analyse Expert Finders

Expert Finding (EF) techniques help discover people with relevant knowledge and skills. To validate an EF technique, however, one usually relies on experts, which means using another EF technique, generally not itself properly validated, and exploiting it mainly for output validation, i.e., only at late stages. We propose a model, built on the literature in psychology and on practice, that identifies generic concepts and relations to support the analysis and design of EF techniques, thus inferring potential improvements at early stages in an expert-free manner. Our contribution lies in the identification and review of relevant literature, the construction of the conceptual model, and an illustration of its use through an analysis of existing EF techniques. Although the model can be improved, we can already identify strengths and limitations in recent EF techniques, supporting the usefulness of model-based analysis and design for EF techniques.

Matthieu Vergne, Angelo Susi

Static Weaving in Aspect Oriented Business Process Management

Separation of concerns is an important topic in business process modelling that aims to reduce complexity, increase re-usability, and enhance the maintainability of business process models. Some concerns cut across several business processes (known as cross-cutting concerns), and they prevent current modularization techniques from encapsulating them efficiently. Aspect-Oriented Business Process Modelling aims to encapsulate these concerns outside the business process models. Although many researchers have proposed different aspect-oriented business process modelling approaches, there is no analysis technique to check these models for soundness. Thus, this paper proposes formal definitions and semantics for aspect-oriented business process models, and it enables soundness analysis of these models at design time by defining a static weaving algorithm. The algorithm is implemented as an artefact that supports weaving aspect-oriented business process models. The artefact is used to analyse different scenarios, and the results of the analysis reveal situations that can introduce problems such as deadlock. In addition, an example of such a scenario shows how the artefact can detect these problems at design time. Such analysis enables process modellers to discover problems at design time, so that they are not left to be discovered at runtime, where correcting them is much more costly.

Amin Jalali

Tangible Modelling to Elicit Domain Knowledge: An Experiment and Focus Group

Conceptual models represent social and technical aspects of the world relevant to a variety of technical and non-technical stakeholders. To build these models, knowledge might have to be collected from domain experts, who are rarely modelling experts and don't usually have the time or desire to learn a modelling language. We investigate an approach to overcoming this challenge by using physical tokens to represent the conceptual model. We call the resulting models tangible models. We illustrate this idea by creating a tangible representation of a sociotechnical modelling language and provide initial evidence of the relative usability and utility of tangible versus abstract modelling. We discuss psychological and social theories that could explain these observations, and we discuss the generalizability and scalability of the approach.

Dan Ionita, Roel Wieringa, Jan-Willem Bullee, Alexandr Vasenev

The REA Accounting Model: Enhancing Understandability and Applicability

The REA accounting model developed by McCarthy conceptualizes the economic logic of double-entry bookkeeping without referring to debits, credits, and accounts. The conceptual core elements of the model are economic resources, economic events, and economic agents, as well as the relationships that link the underlying stock flows according to the duality principle. In this paper, the debit and credit notations are included as a meta-concept to promote the model's understanding within the traditional accounting logic. By specifying additional economic resource types in the form of liabilities and equity, the model is completed with respect to the essential balance sheet positions, so that the REA accounting model is ready for accounting applications.

Walter S. A. Schwaiger

Data Models and Semantics


A Schema-Less Data Model for the Web

To extract and represent domain-independent web-scale data, we introduce a schema-less and self-describing data model called the Object-oriented Web Model (OWM), which is rich in semantics and flexible in structure. It represents web pages as objects with hierarchical structures, and links in a web page as relationships to other objects, so that objects form a network. Making use of web segmentation techniques, data from data-intensive web pages can be extracted, represented, and integrated as OWM objects.

Liu Chen, Mengchi Liu, Ting Yu
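The page-as-object idea can be made concrete with a small sketch. The class and attribute names below are illustrative assumptions, not the authors' OWM implementation: each page becomes a self-describing object with hierarchical, schema-less content, and links become relationships that connect objects into a network.

```python
class PageObject:
    """A web page as a self-describing object (illustrative sketch)."""

    def __init__(self, url, attrs=None):
        self.url = url
        self.attrs = attrs or {}   # hierarchical, schema-less content
        self.links = []            # relationships to other PageObjects

    def link_to(self, other):
        self.links.append(other)

home = PageObject("https://example.org", {"title": "Home"})
about = PageObject(
    "https://example.org/about",
    {"title": "About", "team": {"size": 3}},  # nested structure, no fixed schema
)
home.link_to(about)
print(len(home.links))  # → 1
```

Different pages can carry entirely different attribute hierarchies, which is the flexibility the abstract claims; segmentation techniques would populate `attrs` from page regions.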

An Analysis and Characterisation of Publicly Available Conceptual Models

Multiple conceptual data modelling languages exist, with newer versions typically having more features to model the universe of discourse more precisely. The question arises, however, to what extent those features are actually used in extant models, and whether characteristic profiles can be discerned. We quantitatively evaluated this with a set of 105 UML Class Diagrams, ER and EER models, and ORM and ORM2 diagrams. When more features are available, they are used, but only a few times. Only 64% of the entities are of the kind that appears in all three language families. Different profiles are identified that characterize what a typical UML, (E)ER, or ORM diagram looks like.

C. Maria Keet, Pablo Rubén Fillottrani

An Extended ER Algebra to Support Semantically Richer Queries in ERDBMS

In this paper we present the foundations for a semantically rich main-memory DBMS based on the entity-relationship data model. The DBMS is fully operational and performs all queries illustrated in the paper. So far, the ER model has mainly been used as a conceptual model and mapped into the relational model. Semantics such as the relationships among entities or the cardinality-ratio constraints are not explicit in the relational model. This paper treats the ER model as the logical model for the user, and we use the relational model as the physical model in our ER-model-based DBMS, the ERDBMS. We use CISC (complex instruction set computing) operators but implement them efficiently in main-memory data storage. This paper concentrates on the extended ER algebra; our high-level query language ERSQL and the main-memory implementation are elaborated in [14].

Moritz Wilfer, Shamkant B. Navathe

Enhancing Entity-Relationship Schemata for Conceptual Database Structure Models

The paper aims at the development of well-founded notions of database structure models that are specified in entity-relationship modelling languages. These notions reflect the functions a model fulfils in utilisation scenarios.

Bernhard Thalheim, Marina Tropmann-Frick
