
2016 | Book

Web Engineering

16th International Conference, ICWE 2016, Lugano, Switzerland, June 6-9, 2016. Proceedings


About this book

This book constitutes the refereed proceedings of the 16th International Conference on Web Engineering, ICWE 2016, held in Lugano, Switzerland, in June 2016.

The 19 full research papers, 13 short papers, 3 vision papers, 11 demonstrations, 5 posters, 6 PhD symposium papers, and 4 tutorials presented were carefully reviewed and selected from 120 submissions.

The 16th edition of ICWE accepted contributions related to different research areas revolving around Web engineering, including: Web application modelling and engineering, human computation and crowdsourcing, Web application composition and mashups, Social Web applications, the Semantic Web, and, for the first time, the Web of Things.

Table of Contents

Frontmatter

Full Research Papers

Frontmatter
Medley: An Event-Driven Lightweight Platform for Service Composition

Distributed applications are evolving at a frantic pace, critically relying on each other to offer a host of new functionalities. The emergence of the service-oriented paradigm has made it possible to build complex applications as a set of self-contained and loosely coupled services that work together in concert. However, the traditional vision of Service-Oriented Architectures (SOA) based on web service specifications does not meet the trend of many major service providers. Instead, they promote microservices, a refinement of SOA focusing on lightweight communication mechanisms such as HTTP. Therefore, existing approaches for orchestrating the composition of various services become unusable in practice. In this paper, we introduce Medley, an event-driven lightweight platform for service composition. Medley is based on a domain-specific language for describing orchestration and a compiler that produces efficient code. We have used Medley to develop various compositions involving a large number of existing services. Our evaluation shows that it scales both on a mainstream server and an embedded device while consuming a reasonable amount of resources.
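
The flavor of such event-driven composition can be illustrated with a tiny publish/subscribe sketch. This is a plain-Python stand-in, not Medley's actual DSL; the event names and service endpoints are invented for illustration.

```python
from collections import defaultdict

# Registry of event handlers: event name -> list of callbacks.
handlers = defaultdict(list)

def on(event):
    """Decorator that subscribes a function to an event."""
    def register(fn):
        handlers[event].append(fn)
        return fn
    return register

def emit(event, payload):
    """Publish an event to every subscribed handler."""
    for fn in handlers[event]:
        fn(payload)

@on("order.created")
def bill(order):
    print("POST /billing/invoices", order)  # stand-in for an HTTP call
    emit("invoice.requested", order)

@on("invoice.requested")
def ship(order):
    print("POST /shipping/jobs", order)     # stand-in for an HTTP call

emit("order.created", {"id": 42})
```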

Elyas Ben Hadj Yahia, Laurent Réveillère, Yérom-David Bromberg, Raphaël Chevalier, Alain Cadot
REST APIs: A Large-Scale Analysis of Compliance with Principles and Best Practices

Quickly and dominantly, REST APIs have spread over the Web and percolated into modern software development practice, especially in the Mobile Internet where they conveniently enable offloading data and computations onto cloud services. We analyze more than 78 GB of HTTP traffic collected by Italy’s biggest Mobile Internet provider over one full day and study how big the trend is in practice, how it changed the traffic that is generated by applications, and how REST APIs are implemented in practice. The analysis provides insight into the compliance of state-of-the-art APIs with theoretical Web engineering principles and guidelines, knowledge that affects how applications should be developed to be scalable and robust. The perspective is that of the Mobile Internet.
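
A minimal sketch of the kind of compliance check such an analysis involves; the heuristics below are illustrative simplifications, not the paper's actual criteria or dataset.

```python
import re

def check_rest_url(url: str) -> dict:
    """Flag a few common REST design smells in an API URL."""
    path = "/" + url.split("://", 1)[-1].split("/", 1)[-1]
    resource = path.split("?")[0]
    return {
        "versioned": bool(re.search(r"/v\d+(/|$)", path)),
        "no_verbs": not re.search(r"/(get|create|update|delete)[A-Z_/]", path),
        "lowercase": resource == resource.lower(),
        "no_trailing_slash": not resource.endswith("/"),
    }

print(check_rest_url("https://api.example.com/v1/users/42/orders"))
```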

Carlos Rodríguez, Marcos Baez, Florian Daniel, Fabio Casati, Juan Carlos Trabucco, Luigi Canali, Gianraffaele Percannella
MIRA: A Model-Driven Framework for Semantic Interfaces for Web Applications

A currently recognized barrier for the wider adoption and dissemination of Semantic Web technologies is the absence of suitable interfaces and tools to allow effective access by end-users. In a wider context, it has also been recognized that modern-day interfaces must deal with a large number of heterogeneity factors, such as varying user profiles and runtime hardware and software platforms. This paper presents MIRA, a framework for defining and implementing Semantic Interfaces for Web applications, including those on the Semantic Web. A Semantic Interface is defined as one capable of understanding and adapting to the data it presents and captures, and its schema, if present. Moreover, the interface must also be able to adapt to its context of use – the device being used, any available information about its user, network conditions, and so on. Using a model-driven approach, MIRA allows developers to define such interfaces, and generates code that can run on clients, servers or both. We have carried out a qualitative evaluation showing that MIRA does indeed provide a better process for developers, without imposing any significant performance overhead.

Ezequiel Bertti, Daniel Schwabe
Volatile Functionality in Action: Methods, Techniques and Assessment

One of the main features of most Web applications today is their great dynamism; they are characterized by continuous evolution. After implementing and performing the first deployment of a Web application, new requirements are bound to arise that involve incorporating new functionalities, generally unknown during the design stage. These functionalities, which arise as a response to unexpected requirements of the business layer, have the peculiarity that they eventually need to be removed once their commercial value expires. The continuous incorporation and removal of these functionalities, which we will call "volatile functionalities", usually has a negative impact on important aspects of the Web application. The Volatile Functionality meta-framework is a conceptual framework that supports the lifespan of volatile functionalities in Web applications. We have developed diverse techniques enabling full support of volatile functionalities for enterprise applications. Moreover, we have performed an evaluation assessing developers' experience and the solutions' performance.

Darian Frajberg, Matías Urbieta, Gustavo Rossi, Wieland Schwinger
Abstracting and Structuring Web Contents for Supporting Personal Web Experiences

This paper presents a novel approach for supporting abstraction and structuring mechanisms of Web contents. The goal of this approach is to enable users to create/extract Web contents in the form of objects that they can manipulate to create Personal Web experiences. We present an architecture that not only allows the user to interact with individual objects but also supports the integration of many objects found in diverse Web sites. We claim that once Web contents have been organized as objects, it is possible to create many types of Personal Web interactions. The approach involves end-users and developers and is fully supported by dedicated tools. We show how end-users can use our tools to identify contents and transform them into objects stored in our platform, and how developers can make use of these objects to create Personal Web applications.

Sergio Firmenich, Gabriela Bosetti, Gustavo Rossi, Marco Winckler, Tomas Barbieri
CTAT: Tilt-and-Tap Across Devices

Motion gestures have been proposed as an interaction paradigm for pairing, and sharing data between, mobile devices. They have also been used for interaction with large screens such as semi-public displays, where a mobile phone can be used as a form of remote control in an eyes-free manner. Yet, so far, little attention has been paid to their potential use in cross-device web applications. We therefore decided to develop a framework that would support investigations into the use of a combination of touch and tilt interactions in cross-device scenarios. We first report on a study that motivated the development of the framework and informed its design. We then present the resulting Cross-Tilt-and-Tap (CTAT) framework for the rapid development of applications that make use of various motion gestures for communication between two or more devices. We conclude by describing an application developed using CTAT.

Linda Di Geronimo, Maria Husmann, Abhimanyu Patel, Can Tuerk, Moira C. Norrie
Revisiting Web Data Extraction Using In-Browser Structural Analysis and Visual Cues in Modern Web Designs

Recent trends in website design have an impact on methods used for web data extraction. Many existing methods rely on structural analysis of web pages, but with the introduction of CSS, table-based layouts are no longer used, while responsive design means that layout and presentation depend on the browsing context, which also makes the use of visual cues more complex. We present DeepDesign, a system that semi-automatically extracts data records from web pages based on a combination of structural and visual features. It runs in a general-purpose browser, taking advantage of direct access to the complete CSS3 spectrum and the capability to trigger and execute JavaScript in the page. The user sees record matching in real time and can dynamically adapt the process if required. We present the details of the matching algorithms and provide an evaluation of them based on the top ten Alexa websites.

Alfonso Murolo, Moira C. Norrie
Clustering-Aided Page Object Generation for Web Testing

To decouple test code from web page details, web testers adopt the Page Object design pattern. Page objects are facade classes abstracting the internals of web pages (e.g., form fields) into high-level business functions that can be invoked by test cases (e.g., user authentication). However, writing such page objects requires substantial effort, which pays off only later, during software evolution. In this paper we propose a clustering-based approach for the identification of meaningful abstractions that are automatically turned into Java page objects. Our clustering approach to page object identification has been integrated into our tool for automated page object generation, Apogen. Experimental results indicate that the clustering approach provides clusters of web pages close to those manually produced by a human (with, on average, only three differences per web application). 75 % of the code generated by Apogen can be used as-is by web testers, substantially reducing the manual effort for page object creation. Moreover, a large portion (84 %) of the page object methods created automatically to support assertion definition corresponds to useful behavioural abstractions.
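
For readers unfamiliar with the pattern, a hand-written page object looks roughly like the sketch below (Apogen generates Java; this equivalent uses Selenium's Python bindings, with illustrative element IDs).

```python
from selenium.webdriver.common.by import By

class LoginPage:
    """Facade over the login page: tests call business-level methods
    instead of touching locators directly."""
    def __init__(self, driver):
        self.driver = driver

    def login(self, username: str, password: str) -> "HomePage":
        self.driver.find_element(By.ID, "username").send_keys(username)
        self.driver.find_element(By.ID, "password").send_keys(password)
        self.driver.find_element(By.ID, "submit").click()
        return HomePage(self.driver)  # navigation returns the next page object

class HomePage:
    def __init__(self, driver):
        self.driver = driver

    def welcome_message(self) -> str:
        return self.driver.find_element(By.CSS_SELECTOR, ".welcome").text
```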

Andrea Stocco, Maurizio Leotta, Filippo Ricca, Paolo Tonella
Coverage Patterns-Based Approach to Allocate Advertisement Slots for Display Advertising

Display advertising is one of the predominant modes of online advertising. A publisher makes efforts to allocate the available ad slots/page views to meet the demands of the maximum number of advertisers and thereby maximize revenue. Investigating efficient approaches for ad slot allocation to advertisers is an open research issue. In the literature, efforts have been made to propose approaches that extend optimization techniques, and an approach has been proposed to extract the knowledge of coverage patterns from transactional databases. In this paper, we propose an improved ad slot allocation approach that exploits the knowledge of coverage patterns extracted from click stream transactions. The proposed allocation framework, in addition to the extraction of coverage patterns, contains mapping, ranking and allocation steps. Experimental results on both synthetic and real-world click stream datasets show that the proposed approach can meet the demands of an increased number of advertisers and reduces the boredom experienced by users by limiting the repeated display of advertisements.

Vaddadi Naga Sai Kavya, P. Krishna Reddy
Enabling Fine-Grained RDF Data Completeness Assessment

Nowadays, more and more RDF data is becoming available on the Semantic Web. While the Semantic Web is generally incomplete by nature, on certain topics it already contains complete information and thus queries may return all answers that exist in reality. In this paper we develop a technique to check query completeness based on RDF data annotated with completeness information, taking into account data-specific inferences that lead to an inference problem which is $\Pi^P_2$-complete. We then identify a practically relevant fragment of completeness information, suitable for crowdsourced, entity-centric RDF data sources such as Wikidata, for which we develop an indexing technique that allows completeness reasoning to scale to Wikidata-scale data sources. We verify the applicability of our framework using Wikidata and develop COOL-WD, a completeness tool for Wikidata, used to annotate Wikidata with completeness statements and reason about the completeness of query answers over Wikidata. The tool is available at http://cool-wd.inf.unibz.it/.
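
The core intuition behind completeness statements can be shown with a toy check: a query pattern is answerable completely if some statement declares its pattern complete. This sketch ignores the paper's data-specific inferences (which make the general problem $\Pi^P_2$-complete), and the patterns are made up.

```python
# Completeness statements: "all triples matching this pattern are present".
COMPLETE = [("?film", "castMember", "?actor")]

def covered(pattern):
    """A query pattern is covered if some statement is at least as general."""
    return any(all(c.startswith("?") or c == q for q, c in zip(pattern, stmt))
               for stmt in COMPLETE)

def query_is_complete(query_patterns):
    return all(covered(p) for p in query_patterns)

print(query_is_complete([("Tron", "castMember", "?actor")]))  # True
print(query_is_complete([("Tron", "director", "?d")]))        # False
```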

Fariz Darari, Simon Razniewski, Radityo Eko Prasojo, Werner Nutt
Benchmarking Web API Quality

Web APIs are increasingly becoming an integral part of web or mobile applications. As a consequence, the performance characteristics and availability of the APIs used directly impact the experience of end users. Still, the quality of web APIs is largely ignored and simply assumed to be sufficiently good and stable. Especially considering the geo-mobility of today's client devices, this can lead to negative surprises at runtime. In this work, we present an approach and toolkit for benchmarking the quality of web APIs considering geo-mobility of clients. Using our benchmarking tool, we then present the surprising results of a geo-distributed 3-month benchmark run for 15 web APIs and discuss how application developers can deal with volatile quality both from an architectural and engineering point of view.
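
A single probe of the kind such a benchmark aggregates might look like the sketch below; the real toolkit runs geo-distributed clients over months, whereas the endpoint, sample count and schedule here are placeholders.

```python
import statistics
import time
import urllib.request

def probe(url: str, samples: int = 10) -> dict:
    """Measure availability and median latency over a few requests."""
    latencies, failures = [], 0
    for _ in range(samples):
        start = time.perf_counter()
        try:
            urllib.request.urlopen(url, timeout=5).read()
            latencies.append(time.perf_counter() - start)
        except Exception:
            failures += 1
        time.sleep(1)  # naive fixed schedule
    return {
        "availability": 1 - failures / samples,
        "median_latency_s": statistics.median(latencies) if latencies else None,
    }

print(probe("https://api.github.com"))
```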

David Bermbach, Erik Wittern
Correlation of Ontology-Based Semantic Similarity and Human Judgement for a Domain Specific Fashion Ontology

Evaluation of semantic similarity is difficult because semantic similarity values are highly subjective. There are several approaches that compare automatically computed similarities with values assigned by humans for general purpose terms and ontologies that contain general purpose terms. However, ontologies should be as domain specific as possible to capture the maximal amount of semantic knowledge about a domain. To evaluate the semantic knowledge captured by a custom fashion ontology we conducted a survey and crowdsourced similarity values for fashion terms. In this article we compare the manually assigned similarities to those computed automatically with several ontology-based similarity measures. We show that our proposed feature-based measure achieves the highest correlation with human judgement and give some insight into why this kind of similarity measure most resembles human similarity assessments. To evaluate the influence of the ontology on similarities we compare the results achieved with our fashion ontology to similarity values computed using a fragment of DBpedia.

Edgar Kalkowski, Bernhard Sick
Co-evolution of RDF Datasets

Linked Data initiatives have fostered the publication of a large number of RDF datasets in the Linked Open Data (LOD) cloud, as well as the development of query processing infrastructures to access these data in a federated fashion. However, different experimental studies have shown that the availability of LOD datasets cannot always be ensured, making RDF data replication necessary for envisioning reliable federated query frameworks. Albeit enhancing data availability, RDF data replication requires synchronization and conflict resolution when replicas and source datasets are allowed to change data over time, i.e., co-evolution management needs to be provided to ensure consistency. In this paper, we tackle the problem of RDF data co-evolution and devise an approach for conflict resolution during co-evolution of RDF datasets. Our proposed approach is property-oriented and allows for exploiting semantics about RDF properties during co-evolution management. The quality of our approach is empirically evaluated in different scenarios on the DBpedia-live dataset. Experimental results suggest that the proposed techniques have a positive impact on the quality of data in source datasets and replicas.

Sidra Faisal, Kemele M. Endris, Saeedeh Shekarpour, Sören Auer, Maria-Esther Vidal
LinkSUM: Using Link Analysis to Summarize Entity Data

The amount of structured data published on the Web is constantly growing. A significant part of this data is published in accordance with the Linked Data principles. The explicit graph structure enables machines and humans to retrieve descriptions of entities and discover information about relations to other entities. In many cases, descriptions of single entities include thousands of statements, and for human users it becomes difficult to comprehend the data unless a selection of the most relevant facts is provided. In this paper we introduce LinkSUM, a lightweight link-based approach for the relevance-oriented summarization of knowledge graph entities. LinkSUM optimizes the combination of the PageRank algorithm with an adaptation of the Backlink method, together with new approaches for predicate selection. Both quantitative and qualitative evaluations have been conducted to study the performance of the method in comparison to an existing entity summarization approach. The results show a significant improvement over the state of the art and lead us to conclude that prioritizing the selection of related resources leads to better summaries.
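
The central idea of combining a global importance score with a backlink criterion can be sketched as follows; the toy graph and the 50/50 weighting are assumptions, not LinkSUM's actual configuration or predicate selection.

```python
import networkx as nx  # third-party: pip install networkx

g = nx.DiGraph([
    ("Pulp_Fiction", "Q_Tarantino"), ("Q_Tarantino", "Pulp_Fiction"),
    ("Pulp_Fiction", "J_Travolta"), ("Film_noir", "Pulp_Fiction"),
])

def rank_related(graph, entity, alpha=0.5):
    """Score each linked resource by PageRank plus a backlink bonus."""
    pr = nx.pagerank(graph)
    scores = {}
    for neighbor in graph.successors(entity):
        backlink = 1.0 if graph.has_edge(neighbor, entity) else 0.0
        scores[neighbor] = alpha * pr[neighbor] + (1 - alpha) * backlink
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(rank_related(g, "Pulp_Fiction"))  # Q_Tarantino ranks above J_Travolta
```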

Andreas Thalhammer, Nelia Lasierra, Achim Rettinger
Beyond Established Knowledge Graphs: Recommending Web Datasets for Data Linking

With the explosive growth of the Web of Data in terms of size and complexity, identifying suitable datasets to be linked has become a challenging problem for data publishers. To understand the nature of the content of specific datasets, we adopt the notion of dataset profiles, where datasets are characterized through a set of topic annotations. In this paper, we adopt a collaborative filtering-like recommendation approach, which exploits both existing dataset profiles and traditional dataset connectivity measures, in order to link arbitrary, non-profiled datasets into a global dataset-topic-graph. Our experiments, applied to all available Linked Datasets in the Linked Open Data (LOD) cloud, show an average recall of up to 81 %, which translates to an average reduction of up to 86 % in the size of the original candidate dataset search space. An additional contribution of this work is the provision of benchmarks for dataset interlinking recommendation systems.

Mohamed Ben Ellefi, Zohra Bellahsene, Stefan Dietze, Konstantin Todorov
YABench: A Comprehensive Framework for RDF Stream Processor Correctness and Performance Assessment

RDF stream processing (RSP) has become a vibrant area of research in the semantic web community. Recent advances have resulted in the development of several RSP engines that leverage semantics to facilitate reasoning over flows of incoming data. These engines vary greatly in terms of implemented query syntax, their evaluation and operational semantics, and in various performance dimensions. Existing benchmarks tackle particular aspects such as functional coverage, result correctness, or performance. None of them, however, assess RSP engine behavior comprehensively with respect to all these dimensions. In this paper, we introduce YABench, a novel benchmarking framework for RSP engines. YABench extends the concept of correctness checking and provides a flexible and comprehensive tool set to analyze and evaluate RSP engine behavior. It is highly configurable and provides quantifiable and reproducible results on correctness and performance characteristics. To validate our approach, we replicate results of the existing CSRBench benchmark with YABench. We then assess two well-established RSP engines, CQELS and C-SPARQL, through more comprehensive experiments. In particular, we measure precision, recall, performance, and scalability characteristics while varying throughput and query complexity. Finally, we discuss implications on the development of future stream processing engines and benchmarks.

Maxim Kolchin, Peter Wetz, Elmar Kiesling, A Min Tjoa
When a FILTER Makes the Difference in Continuously Answering SPARQL Queries on Streaming and Quasi-Static Linked Data

We are witnessing a growing interest in Web applications that (i) need to continuously combine highly dynamic data streams with background data and (ii) have reactivity as a key performance indicator. The Semantic Web community has shown that RDF Stream Processing (RSP) is an adequate framework to develop this type of application. However, when the background data is distributed over the Web, even RSP engines risk losing reactivity due to the time necessary to access the background data. State-of-the-art RSP engines remain reactive by using a local replica of the background data, but such a replica progressively becomes stale if not updated to reflect the changes in the remote background data. For this reason, the RSP community recently investigated maintenance policies (collectively named Acqua) that guarantee reactivity while maximizing the freshness of the replica. Acqua's policies apply to queries that join a basic graph pattern in a window clause with another basic graph pattern in a service clause. In this paper, we extend the class of queries considered in Acqua by adding a FILTER clause that selects mappings in the background data. We propose a new maintenance policy (namely, the Filter Update Policy) and show how to combine it with the Acqua policies. A set of experimental evaluations empirically proves the ability of the proposed policies to guarantee reactivity while keeping the replica fresher than with the Acqua policies.
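
The intuition behind a filter-aware policy can be sketched in a few lines: with a limited refresh budget, prefer replica entries whose cached values lie close to the FILTER threshold, since remote changes are most likely to move those in or out of the result. All data structures below are hypothetical simplifications, not the paper's actual policy.

```python
def choose_refresh(replica: dict, threshold: float, budget: int) -> list:
    """replica maps a mapping key to its locally cached numeric value;
    return the keys most worth refreshing under the given budget."""
    by_risk = sorted(replica, key=lambda k: abs(replica[k] - threshold))
    return by_risk[:budget]

replica = {"s1": 9.8, "s2": 3.1, "s3": 10.2}
print(choose_refresh(replica, threshold=10.0, budget=2))  # ['s1', 's3']
```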

Shima Zahmatkesh, Emanuele Della Valle, Daniele Dell’Aglio
Aspect-Based Sentiment Analysis on the Web Using Rhetorical Structure Theory

Fine-grained sentiment analysis on the Web has received much attention in recent years. In this paper we suggest an approach to Aspect-Based Sentiment Analysis that incorporates structural information of reviews by employing Rhetorical Structure Theory. First, a novel way of determining the context of an aspect is presented, after which a full path analysis is performed on the found context tree to determine the aspect sentiment. Comparing the proposed method to a baseline model, which does not use the discourse structure of the text and solely relies on a sentiment lexicon to assign sentiments, we find that the proposed method consistently outperforms the baseline on three different datasets.

Rowan Hoogervorst, Erik Essink, Wouter Jansen, Max van den Helder, Kim Schouten, Flavius Frasincar, Maite Taboada
Diversity in Urban Social Media Analytics

Social media has emerged as one of the data backbones of urban analytics systems. Thanks to geo-located microposts (text-, image-, and video-based) created and shared through portals such as Twitter and Instagram, scientists and practitioners can capitalise on the availability of real-time and semantically rich data sources to perform studies related to cities and the people inhabiting them. Urban analytics systems usually consider the microposts originating from within a city's boundary uniformly, without consideration for the demographic (e.g. gender, age), geographic, technological or contextual (e.g. role in the city) differences among a platform's users. It is well known, though, that the usage and adoption of social media differ profoundly across user segments, cities, and countries. We thus advocate for a better understanding of the intrinsic diversity of social media users and contents. This paper presents an observational study of the geo-located activities of users across two social media platforms, performed over a period of three weeks in four European cities. We show how demographic, geographical, technological and contextual properties of social media (and their users) can provide very different reflections and interpretations of the reality of an urban environment.

Jie Yang, Claudia Hauff, Geert-Jan Houben, Christiaan Titos Bolivar

Short Research Papers

Frontmatter
Data-Aware Service Choreographies Through Transparent Data Exchange

Our focus in this paper is on enabling the decoupling of data flow, data exchange and management from the control flow in service compositions and choreographies through novel middleware abstractions and their realization. This allows us to perform the data flow of choreographies in a peer-to-peer fashion, decoupled from their control flow. Our work is motivated by the increasing importance and business value of data in the fields of business process management, scientific workflows and the Internet of Things, all of which profit from the recent advances in data science and Big Data. Our approach comprises an application life cycle that inherently introduces data exchange and management as a first-class citizen and defines the functions and artifacts necessary for enabling transparent data exchange. Moreover, we present an architecture of the supporting system that contains the Transparent Data Exchange middleware, which enables data exchange and management on behalf of service choreographies and provides methods for the optimization of the data exchange during their execution.

Michael Hahn, Dimka Karastoyanova, Frank Leymann
Formal Specification of RESTful Choreography Properties

The BPM community has developed a rich set of languages for modeling interactions. In previous work, we argued that business process choreographies are well suited for modeling REST-based interactions; to this end, RESTful choreographies were introduced as an extension of business process choreographies. However, RESTful choreographies do not provide information about the validity of interactions. In this paper, we introduce formal completeness properties that help developers verify REST-based interactions. The approach is motivated by the example of an examination procedure in the context of a massive open online course.

Adriatik Nikaj, Mathias Weske
Analysis of an Access Control System for RESTful Services

RestACL is an access control system for RESTful services comprising a policy specification language and an architecture that shows how access control can be integrated with RESTful services. The language is based on the ideas of the attribute-based access control model, allowing rich variations of security policies with a great diversity of access rules. Its structure utilizes the concepts of REST, enabling quick identification of the security policies that have to be evaluated in order to reach an access decision. This work analyzes the requirements on such a language and gives a brief introduction to the RestACL concepts. Evidence is provided that the language enables the implementation of an appropriate and efficient access control system that fulfills the requirements.

Marc Hüffmeyer, Ulf Schreier
Operating System Compositor and Hardware Usage to Enhance Graphical Performance in Web Runtimes

Web runtimes are an essential part of modern operating systems, and their role will grow further in the future. Many web runtime implementations need to support multiple platforms, so their design choices are driven by portability instead of optimized use of the underlying hardware. Thus, the implementations do not fully utilize the GPU and other graphics hardware, and the consequence is reduced performance and increased power consumption. In this paper, we describe a way to improve the graphical performance of the Chromium web runtime dramatically, and we discuss the implementation aspects.

Antti Peuhkurinen, Andrey Fedorov, Kari Systä
QwwwQ: Querying Wikipedia Without Writing Queries

Wikipedia contains a wealth of data, some of which comes in a structured form. There have been initiatives to extract such structured knowledge and incorporate it in RDF triples, which allows running queries against the body of knowledge. Unfortunately, writing such queries is an infeasible task for non-technical people, and even those who are familiar with the SPARQL language face the difficulty of not knowing the logical data schema. The problem has been attacked in many ways, mostly by attempting to provide user interfaces that make it possible to graphically navigate the sea of RDF triples. We present an alternative user interface, which allows users to start from a Wikipedia page and to express queries simply by saying "find me something like this, but with these properties having a value in the [A-B] range".
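
Behind such an interface, a "like this, but in range [A-B]" request boils down to a SPARQL query with a FILTER. The query shape below is an illustrative guess at what a tool in this spirit could generate, issued here against the public DBpedia endpoint.

```python
import json
import urllib.parse
import urllib.request

# "Find me cities like this one, but with population in [100000, 500000]".
query = """
SELECT ?city ?population WHERE {
  ?city a dbo:City ;
        dbo:country dbr:Italy ;
        dbo:populationTotal ?population .
  FILTER (?population >= 100000 && ?population <= 500000)
}
LIMIT 10
"""
url = "https://dbpedia.org/sparql?" + urllib.parse.urlencode(
    {"query": query, "format": "application/sparql-results+json"})
rows = json.load(urllib.request.urlopen(url))["results"]["bindings"]
for r in rows:
    print(r["city"]["value"], r["population"]["value"])
```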

Massimiliano Battan, Marco Ronchetti
A Quality Model for Linked Data Exploration

Linked (Open) Data (LD) offer a great opportunity to interconnect and share large amounts of data on a global scale, creating added value compared to data published via pure HTML. However, this enormous potential is not completely accessible. In fact, LD datasets are often affected by errors, inconsistencies, missing values and other quality issues that may lower their usage. Users are often not aware of the quality and characteristics of the LD datasets that they use for various and diverse tasks; thus they are not conscious of the effects that poor-quality datasets may have on the results of their analyses. In this paper we present our initial results aimed at unleashing the usefulness of LD by providing a set of quality dimensions able to drive the selection and evaluation of LD sources. As a proof of concept, we applied our model to assess the quality of two LD datasets.

Cinzia Cappiello, Tommaso Di Noia, Bogdan Alexandru Marcu, Maristella Matera
Please Stay vs Let’s Play: Social Pressure Incentives in Paid Collaborative Crowdsourcing

Crowdsourcing via paid microtasks has traditionally been approached as an individual activity, with units of work created and completed independently. Other forms of crowdsourcing have, however, embraced a mixed model that further allows for interaction and collaboration. In this paper, we expand the model of collaborative crowdsourcing to explore the role of social pressure and social flow generated by partners as sources of incentives for improved output. We designed experiments wherein a worker could request their partner to collaboratively complete more tasks than required, either not to be abandoned and lose money (social pressure), or for fun (social flow). Our experiments reveal that these socially motivated incentives can act as furtherance mechanisms, improving output by over 30 % and accuracy by about 5 %.

Oluwaseyi Feyisetan, Elena Simperl
On the Invitation of Expert Contributors from Online Communities for Knowledge Crowdsourcing Tasks

The successful execution of knowledge crowdsourcing (KC) tasks requires contributors to possess knowledge or mastery in a specific domain. The need for expert contributors limits the capacity of online crowdsourcing marketplaces to cope with KC tasks. While online social platforms emerge as a viable alternative source of expert contributors, how to successfully invite them remains an open research question. We contribute an experiment in expert contributor invitation where we study the performance of two invitation strategies: one addressed to individual expert contributors, and one addressed to communities of knowledge. We target reddit, a popular social bookmarking platform, to seek expert contributors in the botany and ornithology domains and to invite them to contribute to an artwork annotation KC task. Results provide novel insights into the effectiveness of direct invitation strategies, but show how soliciting collaboration through communities yields, in the context of our experiment, more contributions.

Jasper Oosterman, Geert-Jan Houben
Analysis of a Cultural Heritage Game with a Purpose with an Educational Incentive

In this paper, we present Indomilando, a Cultural Heritage Game with a Purpose (GWAP) with the aim of ranking the photos of the architectural assets in the city of Milan, according to their recognizability. Besides evaluating the ability of Indomilando to achieve its ranking purpose, we also analyze the effect of an educational incentive on the players’ engagement. Indeed, discovering new cultural assets appeared to be a valuable reason to continue playing.

Irene Celino, Andrea Fiano, Riccardo Fino
Semantic Measures: How Similar? How Related?

There are two main types of semantic measures (SM): similarity and relatedness. There are also two main types of datasets, those intended for similarity evaluations and those intended for relatedness. Although they are clearly distinct, they are similar enough to generate some misconceptions. Is there confusion between similarity and relatedness among the semantic measure community, both the designers of SMs and the creators of benchmarks? This is the question that the research presented in this paper tries to answer. We performed a survey of both the SMs and the datasets and executed a cross-evaluation of those measures and datasets. The results show that measures vary in their consistency with datasets of the same type. This research enabled us to conclude not only that there is indeed some confusion, but also to pinpoint the SMs and benchmarks least consistent with their intended type.

Teresa Costa, José Paulo Leal
Design of CQA Systems for Flexible and Scalable Deployment and Evaluation

The success of Community Question Answering (CQA) systems on the open web (e.g. Yahoo! Answers) has motivated their utilization in new contexts (e.g. education or enterprise) and environments (e.g. inside organizations). In spite of initial research on how these specifics influence the design of CQA systems, many additional problems have not been addressed so far, especially poor flexibility and scalability, which hamper: (1) CQA essential features being employed in various settings (e.g. in different educational organizations); and (2) collaboration support methods being effectively evaluated (e.g. in offline as well as in live experiments). In this paper, we provide design recommendations on how to achieve flexible and scalable deployment and evaluation by means of a case study on the educational and organizational CQA system Askalot. Its universal and configurable features allowed us to deploy it at two universities as well as in the MOOC system edX. In addition, by means of its experimental infrastructure, we can integrate various collaboration support methods which are loosely coupled and can be easily evaluated online as well as offline, with datasets from Askalot itself or even from all CQA systems built on top of the Stack Exchange platform.

Ivan Srba, Maria Bielikova
A Matter of Words: NLP for Quality Evaluation of Wikipedia Medical Articles

Automatic quality evaluation of Web information is a task with many fields of application and of great relevance, especially in critical domains like the medical one. We start from the intuition that the quality of the content of medical Web documents is affected by features related to the specific domain: first, the usage of a specific vocabulary (Domain Informativeness); then, the adoption of specific codes (like those used in the infoboxes of Wikipedia articles) and the type of document (e.g., historical and technical ones). In this paper, we propose to leverage specific domain features to improve the evaluation of Wikipedia medical articles, relying on Natural Language Processing (NLP) and dictionary-based techniques. The results of our experiments confirm that, by considering domain-oriented features, it is possible to improve on existing solutions, mainly for those articles that other approaches classified less correctly.

Vittoria Cozza, Marinella Petrocchi, Angelo Spognardi
Middleware Mediated Semantic Sensor Networks

This paper investigates how the Internet of Things (IoT) can take advantage of Semantic Technologies when combined with a prototyping middleware. Although Sensor Networks within the IoT offer enormous potential, the surprising variability in terms of communication and interoperability among Things is a challenging problem. This paper proposes to move from the classic meaning of Sensor Networks to the concept of Semantic Sensor Networks. The proposed methodology uses Semantic Web technologies, based on machine-interpretable representation formalisms, to combine Sensor Networks and the Paraimpu middleware. We propose to prototype and deploy a Semantic Sensor Network, focusing on the semantic aspects, with the tangible advantage of delegating the low-level network operations to Paraimpu. As a result we obtain a shared RDF Knowledge Base useful for improving integration and communication between different networks.

Cristian Lai, Antonio Pintus

Vision Papers

Frontmatter
I am a Machine, Let Me Understand Web Media!

The majority of web assets cannot be understood by machines, because of the lack of available explicit and machine readable semantics. By enabling machines to understand the meaning of web media, fully automated discovery, processing, and linking become feasible. Semantic Web technologies offer the possibility to enhance web resources with explicit semantics via linking to ontologies encoded in RDF. We demand to make the content of every web asset explicit for machines with the least possible effort for any content provider. Web servers should deliver RDF descriptions for any web document on request. To achieve this, we propose a framework that enables web content providers to connect to content-wise descriptions of their web assets via simple HTTP content negotiation in connection with on-the-fly automated multimedia analysis services. We demonstrate the feasibility of our approach with a prototype implementation.
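
The negotiation step itself is plain HTTP: the same URL yields HTML for a browser and RDF for a machine, depending on the Accept header. The sketch below shows the mechanism against DBpedia, which already behaves this way and stands in here for the envisioned analysis-backed web server.

```python
import urllib.request

req = urllib.request.Request(
    "http://dbpedia.org/resource/Lugano",
    headers={"Accept": "text/turtle"})  # ask for RDF instead of HTML
with urllib.request.urlopen(req) as resp:
    print(resp.headers["Content-Type"])
    print(resp.read(300).decode("utf-8", "replace"))  # first bytes of Turtle
```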

Magnus Knuth, Jörg Waitelonis, Harald Sack
Situational-Context: A Unified View of Everything Involved at a Particular Situation

As interest in the Web of Things increases, especially among the general population, the barriers to entry for the use of these technologies should decrease. Current applications can be developed to adapt their behaviour to predefined conditions and user preferences, facilitating their use. In the future, Web of Things software should be able to automatically adjust its behaviour to non-predefined preferences or contexts of its users. In this vision paper we define the Situational-Context as the combination of the virtual profiles of the entities (things or people) that concur at a particular place and time. The computation of the Situational-Context allows us to predict the expected system behaviour and the interaction between devices required to meet the entities' goals, achieving a better adjustment of the system to variable contexts.

Javier Berrocal, Jose Garcia-Alonso, Carlos Canal, Juan M. Murillo
The Direwolf Inside You: End User Development for Heterogeneous Web of Things Appliances

Mobile computing devices like smartphones have become a commodity. They are very convenient when connecting to ubiquitous Web of Things (WoT) appliances. However, WoT manufacturers are challenged to provide Web application interfaces for a multitude of mobile platforms in a short time. Moreover, end users are required to install dedicated Web apps to gain access to these emerging technologies. To overcome these overburdening efforts, end user development in the form of component-based Web mashups has already been applied successfully in various domains. In this paper, we envision a framework for letting users create situational applications for opportunistic device usage. We explore the recent Web Components group of W3C recommendations as a foundation for peer-to-peer cross-platform, cross-application and cross-user Web applications. Our preliminary experiences may help the Web engineering community build better Web infrastructures for a heterogeneous device landscape.

István Koren, Ralf Klamma

PhD Symposium Papers

Frontmatter
A Semantic Model for Friend Segregation in Online Social Networks

Online Social Networks exhibit many of the characteristics of human societies in terms of forming relationships and sharing personal information. However, the major online social networks lack an effective mechanism to represent the diverse social relationships of their users. This leads to the undesirable consequence of disclosing users' personal information to unintended audiences. We propose a semantic model for friend segregation in online social networks, in which the relationship strength and social context of the users play a vital role. The model infers relationship strength and social context from the interaction patterns and profile similarity attributes of the users. We also conducted a research study with online social network users. The study gives insight into users' information-sharing behaviour and interaction patterns in online social networks. The findings reveal that personal information disclosure depends on the relationship strength among the users.

Javed Ahmed
Bootstrapping an Online News Knowledge Base

News retrieval systems facilitate the process of quickly learning about events or stories reported by various online news providers. The traditional approach involves clustering articles that report on the same event using bag-of-words or concept-based similarity measures, and offering personalized recommendations using various user modeling approaches. Knowledge bases have been extensively used in recent years for powering entity-based searches in search engines. The success of this approach, demonstrated by a now de facto way of searching and browsing offered by commercial search engines and mobile applications, has created the need to incorporate semantic capabilities into news retrieval systems. In this paper we present a proposal for creating a knowledge base of entities, events and facts reported by Albanian online news providers. We aim to provide a news stream processing pipeline based on generally available open source toolkits and state-of-the-art research on event- and fact-oriented knowledge bases.

Klesti Hoxha, Artur Baxhaku, Ilia Ninka
Integrating Big Spatio-Temporal Data Using Collaborative Semantic Data Management

Good decision support in geographical information systems depends on the accuracy, consistency and completeness of the provided data. This work introduces the hypothesis that the increasing amount of geographic data will significantly improve the decision support of geographical information systems, provided that a smart data integration approach considers the provenance, schema and format of the gathered data accordingly. Sources of spatial data are distributed and the quality of the data varies, especially when considering uncertain data like volunteered geographic information and participatory sensing data. In our approach, we address the challenge of integrating Big Data in geographical information systems by describing sources and data transformation services for spatio-temporal data using a collaborative system for managing metadata based on Semantic MediaWiki. These machine-interpretable descriptions are used to compose workflows of data sources and data transformation services adapted to the requirements of geographical information systems.

Matthias Frank
Extending Kansei Engineering for Requirements Consideration in Web Interaction Design

In our paper we consider how the well-known Kansei Engineering (KE) method can be applied in the computer-aided development of websites. Although principally used for exploring the emotional dimension of users' experience with products, KE can be extended to incorporate other types of software requirements. In conjunction with artificial neural networks (Kansei Type II), it then becomes possible to automate, up to a certain degree, the evaluation of website quality in terms of functionality, usability, and appeal. We provide an overview of existing work related to KE application in web design, and note a certain gap between it and systematic Web Engineering. We then summarize approaches for the auto-validation of different types of requirements, with a particular focus on computer-aided usability evaluation. Finally, we describe the ongoing experimental study we undertook with 82 participants, in which a Kansei-based survey of 21 university websites was performed, and outline preliminary results and prospects.

Maxim Bakaev, Martin Gaedke, Vladimir Khvorostov, Sebastian Heil
Improving Automated Fact-Checking Through the Semantic Web

The Internet supplies information that can be used to automatically populate knowledge bases and keep them updated, but the facts contained in these automatically managed knowledge bases must be validated before being trustfully used by applications. So far, this process, known as fact-checking, has been performed by human curators with experience in the investigated domain; however, the rapid increase in the speed at which the Internet provides information makes this approach inadequate. Techniques for automatic fact-checking exist nowadays, but they fall short in modeling the domain of the information to be checked, thus losing the experience that human curators provide. This work designs a Semantic Web platform for automatic fact-checking, which uses an OWL ontology to create a specific knowledge base modeled on the domain of the facts to be checked, and extends the available knowledge by linking this knowledge base to external repositories of information and by reasoning about this extended knowledge. The fact-checking task is performed using a machine learning algorithm trained on the information of this extended knowledge base.

Alex Carmine Olivieri
Using Spatiotemporal Information to Integrate Heterogeneous Biodiversity Semantic Data

Biodiversity is essential to life on Earth and motivates many efforts to collect data about species. These data are collected in different places and published in different formats. Researchers use them to extract new knowledge about living things, but it is difficult to retrieve, combine and integrate data sources from different places. This work will investigate how to integrate biodiversity information from heterogeneous sources using Semantic Web technologies. Its main objective is to propose an architecture to link biodiversity data using mainly their spatiotemporal dimension, to effectively search these linked data sets, and to test them using real use cases defined with the help of biodiversity experts. It is also an important objective to propose a suitable provenance model that captures not only data origin but also temporal information. This architecture will be tested on a set of representative data from important Brazilian institutions that are involved in studies of biodiversity.

Flor Amanqui, Ruben Verborgh, Erik Mannens, Rik Van de Walle, Dilvan Moreira

Demonstration Papers

Frontmatter
Automatic Page Object Generation with APOGEN

Page objects are used in web test automation to decouple the test cases' logic from their concrete implementation. Despite the undeniable advantages they bring, such as decreasing the maintenance effort of a test suite, the burden of their manual development limits their wide adoption. In this demo paper, we give an overview of Apogen, a tool that leverages reverse engineering, clustering and static analysis to automatically generate Java page objects for web applications.

Andrea Stocco, Maurizio Leotta, Filippo Ricca, Paolo Tonella
SnowWatch: A Multi-modal Citizen Science Application

The demo presents SnowWatch, a citizen science system that supports the acquisition and processing of mountain images for the purpose of extracting snow information, predicting the amount of water available in the dry season, and supporting a multi-objective lake regulation problem. We discuss how the proposed system has been rapidly prototyped using a general-purpose architecture that collects sensor and user-generated Web content from heterogeneous sources and processes it for knowledge extraction, relying on the contribution of voluntary crowds engaged and retained with gamification techniques.

Roman Fedorov, Piero Fraternali, Chiara Pasini
CroKnow: Structured Crowd Knowledge Creation

This demo presents the Crowd Knowledge Curator (CroKnow), a novel web-based platform that streamlines the processes required to enrich existing knowledge bases (e.g. wikis) by tapping into the latent knowledge of expert contributors on online platforms. The platform integrates a number of tools aimed at supporting the identification of missing data in existing structured resources, the specification of strategies to identify and invite candidate experts from open communities, and the visualisation of the knowledge creation process status. CroKnow will be demonstrated through a case study focusing on the enrichment of the Rijksmuseum Amsterdam's digital collection.

Jasper Oosterman, Alessandro Bozzon, Geert-Jan Houben
ELES: Combining Entity Linking and Entity Summarization

The automatic annotation of textual content with entities from a knowledge base is a well-established field. Applications such as DBpedia Spotlight and GATE make it possible to identify and disambiguate entities in text at high levels of accuracy. The output of such systems can be used in many different ways. One way is to show knowledge panels that provide a fact-based summary of an entity along with further information and browsing options. Such fact-based summaries are produced by entity summarization systems. This paper presents ELES, a lightweight combination of DBpedia Spotlight and the SUMMA entity summarization interface. DBpedia Spotlight analyzes text and links fragments to entities of the DBpedia knowledge base. The LinkSUM summarizer (interfaced via the SUMMA API definition) produces fact-based summaries of DBpedia entities. The two applications are combined on the client side through the "Internationalization Tag Set 2.0" W3C recommendation and lightweight jQuery-based interfaces.
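
The entity-linking half of such a pipeline can be exercised directly against DBpedia Spotlight's public REST endpoint (endpoint URL and parameters as publicly documented at the time of writing; adjust if they have moved).

```python
import json
import urllib.parse
import urllib.request

params = urllib.parse.urlencode({
    "text": "Lugano hosted ICWE 2016 in Switzerland.",
    "confidence": 0.5,
})
req = urllib.request.Request(
    "https://api.dbpedia-spotlight.org/en/annotate?" + params,
    headers={"Accept": "application/json"})
annotations = json.load(urllib.request.urlopen(req))
for res in annotations.get("Resources", []):
    print(res["@surfaceForm"], "->", res["@URI"])  # linked DBpedia entities
```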

Andreas Thalhammer, Achim Rettinger
Liquid, Autonomous and Decentralized Stream Processing for the Web of Things

In recent years we have witnessed a rise in the number of smart devices and sensors connected through the Web. This has led researchers to explore the World Wide Web as a platform to orchestrate such devices. In this demo we show how we are able to harmonize heterogeneous hardware for home automation systems with the Web Liquid Streams (WLS) framework. The WLS framework lets developers implement topologies of data streams across a heterogeneous pool of devices thanks to Node.JS and the Web browser. By using JavaScript, the lingua franca of the Web, we are able to write the stream operators once and run them anywhere a Web browser or Node.JS can run. The demo shows a home automation application that can seamlessly run on different kinds of devices.

Masiar Babazadeh
Migrating and Pairing Recursive Stateful Components Between Multiple Devices with Liquid.js for Polymer

With the continuous development of new Web-enabled devices, we are heading toward an era in which users connect to the Web with multiple devices at the same time. Users expect Web applications to be able to flow between all the devices they own; however, the majority of current Web applications were not designed with this usage scenario in mind. As the number of devices owned by a user increases, we have to find new ways to give Web developers the tools to easily implement the expected liquid behaviour in their software. We present a new contribution provided by the Liquid.js for Polymer framework, allowing the migration of recursive component-based Web applications from one device to another. In this demo paper we show how to create recursive components, how to migrate them among devices, and how their state can be paired among the various components.

Andrea Gallidabino
A Universal Socio-Technical Computing Machine

This is an attempt to develop a universal socio-technical computing machine that captures and coordinates human input to let collective problem solving activities emerge on the Web without the need for an a priori composition of a dedicated task or human collective.

Markus Luczak-Roesch, Ramine Tinati, Saud Aljaloud, Wendy Hall, Nigel Shadbolt
Web Objects Ambient: An Integrated Platform Supporting New Kinds of Personal Web Experiences

The Personal Web arose to empower end users with the ability to drive and integrate the Web by themselves, according to their own interests. This is usually achieved through Web Augmentation, Mashups or Personal Information Managers (PIM), but despite the diversity of approaches, there are still scenarios that can only be solved by combining their features, which implies the end user knowing diverse tools and being able to coordinate them. This paper presents WOA, a platform conceived under the foundations of the Personal Web for supporting the harvesting and materialization of information objects from existing Web content, and their enhancement through the addition of specialized behaviour. This makes it possible for multiple Web information objects to coexist in the same information space and offer the end user different modes of interaction, and therefore multiple kinds of personal Web experiences.

Gabriela Bosetti, Sergio Firmenich, Gustavo Rossi, Marco Winckler, Tomas Barbieri
WeatherUSI: User-Based Weather Crowdsourcing on Public Displays

Contemporary public display systems hold significant potential to contribute to in situ crowdsourcing. Recently, public display systems have surpassed their traditional role as static content projection hotspots by supporting interactivity and hosting applications that increase the overall perceived user utility. As such, we developed WeatherUSI, a web-based interactive public display application that enables passers-by to input subjective information about current and future weather conditions. In this demo paper, we present the functionality of the app, describe the underlying system infrastructure and show how we combine input streams originating from the WeatherUSI app on a public display together with its mobile app counterparts to facilitate user-based weather crowdsourcing.

Evangelos Niforatos, Ivan Elhart, Marc Langheinrich
Discovering and Analyzing Alternative Treatments Hypothesis via Social Health Data

User-generated social health data can provide valuable information to extend the state of medical knowledge. We present a tool geared towards social health data exploration and reasoning. Starting from a repository of semantically linked social health data, we enable researchers to discover alternative treatments as well as similar conditions by exploring the semantic repository via potentially compatible concepts. Researchers are prompted with the features of the concepts under investigation to analyze similarities and contradictions, when present. Concepts are enriched with confidence values that help researchers assess the reliability of the information they are analyzing.

Paolo Cappellari, Soon Ae Chun, Dennis Shpits
Towards Handling Constraint Network Conditions Between WoT Entities Using Conflict-Free Anti-Entropy Communication

Deploying and composing Web of Things entities in scenarios where connections are subject to network constraints, like disconnected operation, intermittent connections or limited bandwidth, requires handling changing network conditions properly. Therefore, this work proposes to utilize both eventually consistent data structures and corresponding eventually consistent communication for such scenarios. This enables the composition and collaboration of WoT entities in network-constrained scenarios.

Markus Ast, Martin Gaedke

Poster Papers

Frontmatter
RESTful Conversation with RESTalk
The Use Case of Doodle

With the availability of multiple Web services offering identical or similar utilities, their ease of use has become a valuable success factor, highly influenced by the quality of the API's documentation. Tools are available for documenting the various technical details pertaining to the static structure of RESTful services. Additionally, we have identified interest in, and the usefulness of, also depicting an API's behaviour, i.e., the viable RESTful conversations, defined as the multiple client-server interactions necessary to utilize certain service functionality. RESTalk, the REST domain-specific language we have designed for modeling RESTful conversations, facilitates the conceptual modeling and visualisation of an API's behaviour. In this poster paper, we extend RESTalk with new language constructs and apply it to a real RESTful API, the Doodle API, covering RESTful conversations between multiple clients and one server.

Ana Ivanchikj
Supporting Personalization in Legacy Web Sites Through Client-Side Adaptation

Immersed in the social and mobile Web, users expect personalized browsing experiences based on their needs, goals, and preferences. However, adding personalization to an existing Web site is not a simple task for Web owners who are not personalization experts. Most existing personalization approaches imply extending the backend application or paying for a personalization-as-a-service solution, which are more focused on improving conversion rates than on improving the user's browsing experience. In this work we present a methodology to add client-based personalization to an existing Web site, oriented to non-developer designers or Web site owners. This approach allows them to define a set of personalization rules to be applied on the client side with minimal alterations to the backend application.

Jesús López Miján, Irene Garrigós, Sergio Firmenich
A Lightweight Semi-automated Acceptance Test-Driven Development Approach for Web Applications

Applying Acceptance Test Driven Development (ATDD) in the context of web applications is a difficult task due to the intricateness of existing tools/frameworks and, more generally, of the proposed approaches. In this work, we present a simple approach for developing web applications in ATDD mode, based on the usage of screen mockups and Selenium IDE.
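
An acceptance test derived from a screen mockup ultimately becomes a scripted browser interaction. The paper works with Selenium IDE recordings; the sketch below is an equivalent scripted version with a placeholder URL and locators that must be adapted to the application under test (it also assumes a local WebDriver setup).

```python
import unittest
from selenium import webdriver
from selenium.webdriver.common.by import By

class SearchAcceptanceTest(unittest.TestCase):
    def setUp(self):
        self.driver = webdriver.Firefox()  # requires geckodriver

    def test_search_shows_results(self):
        self.driver.get("https://example.org/")        # placeholder URL
        self.driver.find_element(By.NAME, "q").send_keys("icwe")
        self.driver.find_element(By.ID, "go").click()  # placeholder locator
        self.assertIn("results", self.driver.page_source.lower())

    def tearDown(self):
        self.driver.quit()

if __name__ == "__main__":
    unittest.main()
```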

Diego Clerissi, Maurizio Leotta, Gianna Reggio, Filippo Ricca
The WoT as an Awareness Booster in Agile Development Workspaces

Continuous feedback is one of the most important concepts in agile development. We argue for the need to increase the awareness that the team maintains of various facets of the development activity. We then introduce iFLUX, an event-driven middleware designed for the Web of Things. We explain how it provides a platform that facilitates the creation of augmented workplaces that connect various information sources with physical displays, also known as information radiators.

Olivier Liechti, Jacques Pasquier, Laurent Prévost, Pascal Gremaud
A Model-Driven Process to Migrate Web Content Management System Extensions

Developing and maintaining software extensions for Web Content Management Systems (WCMSs) like Joomla, WordPress, or Drupal can be a difficult and time-consuming process. This poster presents a model-driven process which addresses typical challenges during the migration of software extensions for WCMSs. We introduce JooMDD as a prototypical environment for the development and maintenance of Joomla extensions. JooMDD consists of a domain-specific modelling language for WCMS extensions, a reverse engineering tool to create models based on existing WCMS extensions, and a code generator for software extensions, which can be used to enrich Joomla-based applications. The use of JooMDD within our research demonstrates the application of a model-driven migration process for WCMS extensions.

Dennis Priefer, Peter Kneisel, Gabriele Taentzer

Tutorials

Frontmatter
Using Docker Containers to Improve Reproducibility in Software and Web Engineering Research

The ability to replicate and reproduce scientific results has become an increasingly important topic for many academic disciplines. In computer science and, more specifically, software and web engineering, contributions of scientific work rely on developed algorithms, tools and prototypes, quantitative evaluations, and other computational analyses. Published code and data come with many undocumented assumptions, dependencies, and configurations that are internal knowledge and make reproducibility hard to achieve. This tutorial presents how Docker containers can overcome these issues and aid the reproducibility of research artifacts in software and web engineering and discusses their applications in the field.
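
In practice, pinning an experiment's environment comes down to a versioned image build plus a reproducible run. The sketch below drives Docker from Python for consistency with the other examples in this listing; it assumes a local Docker installation, and the base image, script and requirements file are placeholders.

```python
import pathlib
import subprocess

# Write a Dockerfile that pins the interpreter and the dependencies.
pathlib.Path("Dockerfile").write_text("""\
FROM python:3.5-slim
COPY experiment.py requirements.txt /app/
RUN pip install -r /app/requirements.txt
CMD ["python", "/app/experiment.py"]
""")

# Build a tagged image and re-run the experiment inside it.
subprocess.run(["docker", "build", "-t", "icwe-experiment", "."], check=True)
subprocess.run(["docker", "run", "--rm", "icwe-experiment"], check=True)
```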

Jürgen Cito, Vincenzo Ferme, Harald C. Gall
A Declarative Approach to Information Extraction Using Web Service API

The number of diverse web services that we use regularly is increasing significantly. Most of these services are managed by autonomous service providers. However, it has become very difficult to get a unified view of this widespread data, which in all likelihood is substantially important to enterprises. A classical approach followed by enterprises is to write applications using imperative languages that make use of the web service APIs. Such an approach is not scalable and is difficult to maintain considering the ever-evolving web services landscape. This tutorial explores a semi-automated declarative approach to information extraction from web services using a classical virtual data integration approach, namely mediation, that relies on a well-known query rewriting algorithm, the inverse-rules algorithm. It is targeted at an audience from both industry and academia and requires a basic understanding of database principles and web technologies.

John Samuel, Christophe Rey
Distributed Web Applications with IPFS, Tutorial

This document describes the tutorial session delivered at ICWE 2016, focused on building distributed Web applications with IPFS. IPFS, the InterPlanetary File System, is the distributed and permanent Web: a protocol to make the Web faster, more secure and open. The tutorial focuses on the key elements of IPFS and how to use them to build applications.
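
Hands-on, the entry point is the local daemon's HTTP API. The sketch below adds a file and prints its content hash, assuming `ipfs daemon` is running on the default port 5001.

```python
import json
import urllib.request

boundary = "X-IPFS-EXAMPLE"
body = (f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="file"; filename="hello.txt"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
        "Hello, distributed Web!\r\n"
        f"--{boundary}--\r\n").encode()

req = urllib.request.Request(
    "http://127.0.0.1:5001/api/v0/add", data=body, method="POST",
    headers={"Content-Type": f"multipart/form-data; boundary={boundary}"})
print(json.load(urllib.request.urlopen(req)))  # {"Name": ..., "Hash": ...}
```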

David Dias, Juan Benet
Recommender Systems Meet Linked Open Data

Information overload is a problem we experience daily when accessing information channels such as a Web site, a mobile application or even our set-top box. There is a clear need for applications able to guide users through an apparently chaotic information space, filtering, in a personalized way, only those elements that may be of interest to them. Together with the transformation of the Web from a distributed and hyperlinked repository of documents to a distributed repository of structured knowledge, a new generation of recommendation engines has emerged in recent years. As of today, we have a huge amount of RDF data published as Linked Open Data (LOD) and available via SPARQL endpoints, and the number of applications able to exploit the knowledge they encode is growing consistently. Among these new applications and services, recommender systems are gaining ground in the LOD arena.
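
The basic ingredient of a LOD-driven recommender is easy to sketch: represent each item by the resources it links to (in practice fetched from a SPARQL endpoint) and compare items by feature overlap. The feature sets below are hard-coded stand-ins, and Jaccard is just one possible measure.

```python
# Item -> set of linked DBpedia resources (illustrative; normally via SPARQL).
features = {
    "Pulp Fiction":   {"dbr:Quentin_Tarantino", "dbr:Crime_film"},
    "Reservoir Dogs": {"dbr:Quentin_Tarantino", "dbr:Crime_film", "dbr:Heist"},
    "Toy Story":      {"dbr:Pixar", "dbr:Animated_film"},
}

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b)

seed = "Pulp Fiction"
ranked = sorted(((jaccard(features[seed], f), item)
                 for item, f in features.items() if item != seed),
                reverse=True)
print(ranked)  # most similar items first
```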

Tommaso Di Noia
Backmatter
Metadata
Title
Web Engineering
edited by
Alessandro Bozzon
Philippe Cudré-Mauroux
Cesare Pautasso
Copyright Year
2016
Electronic ISBN
978-3-319-38791-8
Print ISBN
978-3-319-38790-1
DOI
https://doi.org/10.1007/978-3-319-38791-8
