
## About this book

This book constitutes the revised proceedings of the 17th International Conference on E-Commerce and Web Technologies, EC-Web 2016, held in Porto, Portugal, in September 2016, in conjunction with DEXA.
The 13 papers presented in this volume were carefully reviewed and selected from 21 submissions. They were organized in topical sections named: recommender systems; data management and data analysis; and business processes, Web services and cloud computing.

## Table of contents

### Scalable Online Top-N Recommender Systems

Given the large volumes and dynamics of data that recommender systems currently have to deal with, we look at online stream-based approaches that are able to cope with high-throughput observations. In this paper we describe work on incremental neighborhood-based and incremental matrix factorization approaches for binary ratings, starting with a general introduction, looking at various approaches and describing existing enhancements. We refer to recent work on forgetting techniques and multidimensional recommendation. We also focus on adequate procedures for the evaluation of online recommender algorithms.
Alípio M. Jorge, João Vinagre, Marcos Domingues, João Gama, Carlos Soares, Pawel Matuszyk, Myra Spiliopoulou
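
The incremental matrix factorization idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes positive-only (binary) feedback, where every observed user-item pair is treated as a rating of 1 and the model is updated with a single stochastic gradient step per observation. The class name `IncrementalMF` and all hyperparameters are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import random


class IncrementalMF:
    """Illustrative incremental SGD matrix factorization for positive-only feedback."""

    def __init__(self, n_factors=8, lr=0.05, reg=0.01, seed=0):
        self.n_factors = n_factors
        self.lr = lr          # learning rate (assumed value)
        self.reg = reg        # L2 regularization (assumed value)
        self.rng = random.Random(seed)
        self.P = {}           # user id -> latent vector
        self.Q = {}           # item id -> latent vector
        self.seen = {}        # user id -> set of items already observed

    def _vector(self, table, key):
        # New users and items get small random vectors the first time they appear.
        if key not in table:
            table[key] = [self.rng.gauss(0.0, 0.1) for _ in range(self.n_factors)]
        return table[key]

    def score(self, user, item):
        # Unknown users or items score 0.0.
        p, q = self.P.get(user), self.Q.get(item)
        if p is None or q is None:
            return 0.0
        return sum(a * b for a, b in zip(p, q))

    def update(self, user, item):
        # One SGD step towards a target rating of 1 for the observed pair.
        p = self._vector(self.P, user)
        q = self._vector(self.Q, item)
        err = 1.0 - sum(a * b for a, b in zip(p, q))
        for f in range(self.n_factors):
            pf, qf = p[f], q[f]
            p[f] += self.lr * (err * qf - self.reg * pf)
            q[f] += self.lr * (err * pf - self.reg * qf)
        self.seen.setdefault(user, set()).add(item)

    def recommend(self, user, n=5):
        # Rank all known items the user has not interacted with yet.
        known = self.seen.get(user, set())
        candidates = [i for i in self.Q if i not in known]
        candidates.sort(key=lambda i: self.score(user, i), reverse=True)
        return candidates[:n]
```

In a streaming setting, each incoming event would typically first be used to test the current recommendations and then be passed to `update`, so the model never needs a full retraining pass.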

### User Control in Recommender Systems: Overview and Interaction Challenges

Recommender systems have been shown to be valuable tools that help users find items of interest in situations of information overload. These systems usually predict the relevance of each item for the individual user based on their past preferences and their observed behavior. If the system’s assumptions about the users’ preferences are, however, incorrect or outdated, mechanisms should be provided that put the user in control of the recommendations, e.g., by letting them specify their preferences explicitly or by allowing them to give feedback on the recommendations. In this paper we review and classify the different approaches from the research literature for putting users in active control of what is recommended. We highlight the challenges related to the design of the corresponding user interaction mechanisms and finally present the results of a survey-based study in which we gathered user feedback on the implemented user control features on Amazon.
Dietmar Jannach, Sidra Naveed, Michael Jugovac

### How to Combine Visual Features with Tags to Improve Movie Recommendation Accuracy?

Previous works have shown the effectiveness of using stylistic visual features, indicative of the movie style, in content-based movie recommendation. However, they have mainly focused on a particular recommendation scenario, i.e., when a new movie is added to the catalogue and no information is available for that movie (New Item scenario). However, the stylistic visual features can also be used when other sources of information are available (Existing Item scenario).
In this work, we address the second scenario and propose a hybrid technique that exploits not only the typical content available for the movies (e.g., tags), but also the stylistic visual content extracted from the movie files, and fuses them by applying a fusion method called Canonical Correlation Analysis (CCA). Our experiments on a large catalogue of 13K movies have shown very promising results, which indicate a considerable improvement of the recommendation quality when using a proper fusion of the stylistic visual features with other types of features.
Yashar Deldjoo, Mehdi Elahi, Paolo Cremonesi, Farshad Bakhshandegan Moghaddam, Andrea Luigi Edoardo Caielli

### Explorative Analysis of Recommendations Through Interactive Visualization

Even though today’s recommender algorithms are highly sophisticated, they can hardly take into account the users’ situational needs. An obvious way to address this is to initially inquire about the users’ momentary preferences, but the users’ inability to accurately state them upfront may lead to the loss of several good alternatives. Hence, this paper suggests generating the recommendations without such additional input data from the users and letting them interactively explore the recommended items on their own. To support this explorative analysis, a novel visualization tool based on treemaps is developed. The analysis of the prototype demonstrates that the interactive treemap visualization facilitates the users’ comprehension of the big picture of available alternatives and the reasoning behind the recommendations. This helps the users become clear about their situational needs, inspect the most relevant recommendations in detail, and finally arrive at informed decisions.
Christian Richthammer, Günther Pernul

### An E-Shop Analysis with a Focus on Product Data Extraction

E-commerce is a constantly growing and competitive market. Online prices are updated daily or even more frequently, and it is very important for e-shoppers to find the lowest price online. Therefore, e-shop owners need to know the prices of their competitors and must be able to adjust their own prices in order to remain competitive. The manual monitoring of all prices of all products and competitors is too time-consuming; hence, the e-shop owners need software support for that task. To develop such software tools, the developers need a profound understanding of the structure and design of e-shop websites. Existing software tools for Web data extraction are based on the findings of different website analyses. The existing tools show that the more specific and detailed the analysis and the analyzed websites, the better the data extraction results. This paper presents the results and the derived findings of a deep analysis of 50 different e-shop websites in order to provide new insights for the development and improvement of software tools for product data extraction from e-shop websites.
Andrea Horch, Andreas Wohlfrom, Anette Weisbecker

### The WDC Gold Standards for Product Feature Extraction and Product Matching

Finding out which e-shops offer a specific product is a central challenge for building integrated product catalogs and comparison shopping portals. Determining whether two offers refer to the same product involves extracting a set of features (product attributes) from the web pages containing the offers and comparing these features using a matching function. The existing gold standards for product matching have two shortcomings: (i) they only contain offers from a small number of e-shops and thus do not properly cover the heterogeneity that is found on the Web; (ii) they only provide a small number of generic product attributes and therefore cannot be used to evaluate whether detailed product attributes have been correctly extracted from textual product descriptions. To overcome these shortcomings, we have created two public gold standards: The WDC Product Feature Extraction Gold Standard consists of over 500 product web pages originating from 32 different websites on which we have annotated all product attributes (338 distinct attributes) which appear in product titles, product descriptions, as well as tables and lists. The WDC Product Matching Gold Standard consists of over 75,000 correspondences between 150 products (mobile phones, TVs, and headphones) in a central catalog and offers for these products on the 32 websites. To verify that the gold standards are challenging enough, we ran several baseline feature extraction and matching methods, resulting in F-score values in the range 0.39 to 0.67. In addition to the gold standards, we also provide a corpus consisting of 13 million product pages from the same websites which might be useful as background knowledge for training feature extraction and matching methods.
Petar Petrovski, Anna Primpeli, Robert Meusel, Christian Bizer
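
As a rough illustration of the matching pipeline this abstract describes (compare extracted attributes with a matching function, then score the result against gold correspondences), the sketch below uses Jaccard similarity over attribute-value pairs and computes the F-score (F1). The similarity function, the threshold of 0.5 and all function names are assumptions made for this example; the abstract does not specify the baseline methods used.

```python
def jaccard(attrs_a, attrs_b):
    """Jaccard similarity over (attribute, value) pairs of two attribute dicts."""
    pairs_a, pairs_b = set(attrs_a.items()), set(attrs_b.items())
    if not pairs_a and not pairs_b:
        return 0.0
    return len(pairs_a & pairs_b) / len(pairs_a | pairs_b)


def match_offers(offers, catalog, threshold=0.5):
    """Return the (offer_id, product_id) pairs whose similarity reaches the threshold."""
    predicted = set()
    for offer_id, offer_attrs in offers.items():
        for product_id, product_attrs in catalog.items():
            if jaccard(offer_attrs, product_attrs) >= threshold:
                predicted.add((offer_id, product_id))
    return predicted


def f1_score(predicted, gold):
    """Harmonic mean of precision and recall against gold correspondences."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Offers with few extracted attributes tend to match several catalog products at once under such a simple function, which lowers precision; this is one reason detailed attribute extraction matters for matching quality.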

### MFI-TransSW+: Efficiently Mining Frequent Itemsets in Clickstreams

Data stream mining is the process of extracting knowledge from a massive real-time sequence of data items arriving at a very high data rate. It has several practical applications, such as user behavior analysis, software testing and market research. However, the large amount of data generated poses challenges for processing and analyzing data in near real time. In this paper, we first present the MFI-TransSW+ algorithm, an optimized version of the MFI-TransSW algorithm that efficiently processes clickstreams, that is, data streams where the data items are the pages of a Web site. Then, we outline the implementation of a news article recommender system, called ClickRec, to demonstrate the efficiency and applicability of the proposed algorithm. Finally, we describe experiments, conducted with real-world data, which show that MFI-TransSW+ outperforms the original algorithm, being up to two orders of magnitude faster when processing clickstreams.
Franklin A. de Amorim, Bernardo Pereira Nunes, Giseli Rabello Lopes, Marco A. Casanova
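
To make the notion of mining frequent itemsets over a clickstream concrete, here is a deliberately naive sliding-window counter. It is not MFI-TransSW or MFI-TransSW+; it simply keeps the last `window_size` sessions, counts every itemset up to `max_len` pages, and subtracts the counts of a session when it slides out of the window. The class name and parameters are hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowItemsets:
    """Naive frequent-itemset counter over the most recent transactions."""

    def __init__(self, window_size, max_len=2):
        self.window_size = window_size
        self.max_len = max_len
        self.window = deque()
        self.counts = Counter()

    def _itemsets(self, transaction):
        # Enumerate all itemsets of size 1..max_len as sorted tuples.
        items = sorted(set(transaction))
        for size in range(1, self.max_len + 1):
            yield from combinations(items, size)

    def add(self, transaction):
        # Expire the oldest transaction once the window is full.
        if len(self.window) == self.window_size:
            expired = self.window.popleft()
            self.counts.subtract(self._itemsets(expired))
        transaction = tuple(transaction)
        self.window.append(transaction)
        self.counts.update(self._itemsets(transaction))

    def frequent(self, min_support):
        # min_support is a fraction of the transactions currently in the window.
        threshold = min_support * len(self.window)
        return {s: c for s, c in self.counts.items() if c > 0 and c >= threshold}
```

Enumerating every itemset per session grows quickly with session length, which is the kind of cost that dedicated stream algorithms are designed to reduce.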

### Reranking Strategies Based on Fine-Grained Business User Events Benchmarked on a Large E-commerce Data Set

As traditional search engines based on the text content often fail to efficiently display the products that the customers really desire, web companies commonly resort to reranking techniques in order to improve the products’ relevance given a user query. For that matter, one may take advantage of the fine-grained past user events that it is now feasible to collect and process, such as clicks, add-to-basket events or purchases. We use a real-world data set of such events collected over a five-month period at a leading e-commerce company in order to benchmark reranking algorithms. A simple strategy consists in reordering products according to the clicks they gather. We also propose a more sophisticated method, based on an autoregressive model to predict the number of purchases from past events. Since we work with retail data, we assert that the most relevant and objective performance metric is the percentage of revenue generated by the top reranked products, rather than subjective criteria based on relevance scores assigned manually. By evaluating the algorithms in this way against our database of purchase events, we find that the top four products displayed by a state-of-the-art search engine capture on average about 25% of the revenue; reordering products according to the clicks they gather increases this percentage to about 48%; the autoregressive method reaches approximately 55%. An analysis of the coefficients of the autoregressive model shows that the past user events lose most of their predictive power after 2–3 days.
Yang Jiao, Bruno Goutorbe, Matthieu Cornec, Jeremie Jakubowicz, Christelle Grauer, Sebastien Romano, Maxime Danini
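
The simple click-based strategy and the revenue-share metric from this abstract can be sketched as follows. The function names, the data layout (dicts keyed by product id) and the toy numbers in the usage note are assumptions for illustration only; the autoregressive model is not covered here.

```python
def rerank_by_clicks(products, clicks):
    """Reorder a result list by past click counts, highest first.

    sorted() is stable, so products with equal clicks keep the
    original search-engine order.
    """
    return sorted(products, key=lambda p: clicks.get(p, 0), reverse=True)


def top_k_revenue_share(ranking, revenue, k=4):
    """Fraction of the total revenue captured by the first k products."""
    total = sum(revenue.get(p, 0.0) for p in ranking)
    if total == 0:
        return 0.0
    return sum(revenue.get(p, 0.0) for p in ranking[:k]) / total
```

For example, with a hypothetical engine order `["a", "b", "c", "d", "e", "f"]`, clicks `{"e": 50, "f": 40, "a": 10}` and revenue `{"a": 10.0, "e": 60.0, "f": 30.0}`, the reranked list starts with `e` and `f`, and the top-2 revenue share rises compared with the original order.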

### Feature Selection Approaches to Fraud Detection in e-Payment Systems

Due to the large amount of data generated in electronic transactions, finding the best set of features is an essential task for identifying fraud. Fraud detection is a specific application of anomaly detection, characterized by a large imbalance between the classes, which can be a detrimental factor for feature selection techniques. In this work we evaluate the behavior and impact of feature selection techniques to detect fraud in a Web Transaction scenario. To measure the effectiveness of the feature selection approach we use some state-of-the-art classification techniques to identify fraud, using real application data. Our results show that the imbalance between the classes reduces the effectiveness of feature selection and that a resampling strategy applied in this task improves the final results. We achieve very good performance, reducing the number of features and presenting financial gains of up to 57.5% compared to the actual scenario of the company.
Rafael Franca Lima, Adriano C. M. Pereira

### Multimodal Indexing and Search of Business Processes Based on Cumulative and Continuous N-Grams

Reuse of business processes may contribute to the efficient deployment of new services. However, due to the large volume of process repositories, finding a particular process may become a difficult task. Most of the existing work on process search focuses on textual information and graph matching. This paper presents a multimodal indexing and search model of business processes based on cumulative and continuous n-grams. The present method considers linguistic and behavioral information represented as codebooks. Codebooks describe structural components based on the n-gram concept. The results obtained considerably outperform the precision, recall and F-Measure of previous approaches.
Hugo Ordoñez, Armando Ordoñez, Carlos Cobos, Luis Merchan

### Scoring Cloud Services Through Digital Ecosystem Community Analysis

Cloud service selection is a complex process that requires assessment of not only individual features of a cloud service but also its ability to interoperate with an ecosystem of cloud services. In this position paper, we address the problem by devising metrics to measure the impact of interoperability among the cloud services to guide the cloud service selection process. We introduce concrete definitions and metrics to contribute to measuring the level of interoperability between cloud services. We also demonstrate a methodology to evaluate the metrics via a use case example. Our contributions prove that the proposed metrics cover critical aspects related to interoperability in the multi-cloud arena and therefore form a robust baseline to compare cloud services in systematic decision-making environments.
Jaume Ferrarons, Smrati Gupta, Victor Muntés-Mulero, Josep-Lluis Larriba-Pey, Peter Matthews

### Handling Branched Web Service Composition with a QoS-Aware Graph-Based Method

The concept of Service-Oriented Architecture, where individual services can be combined to accomplish more complex tasks, provides a flexible and reusable approach to application development. Their composition can be performed manually; however, doing so may prove to be challenging if many service alternatives with differing qualities are available. Evolutionary Computation (EC) techniques have been employed successfully to tackle this problem, especially Genetic Programming (GP), since it is capable of encoding conditional constraints on the composition’s execution paths. While compositions can naturally be represented as Directed Acyclic Graphs (DAGs), GP needs to encode candidates as trees, which may pose conversion difficulties. To address that, this work proposes an extension to an existing EC approach that represents solutions directly as DAGs. The tree-based and extended graph-based composition approaches are compared, showing significant gains in execution time when using graphs, sometimes up to two orders of magnitude. The quality levels of the solutions produced, however, are somewhat higher for the tree-based approach. This, in addition to a convergence test, shows that the genetic operators employed by the graph-based approach can potentially be improved. Nevertheless, the extended graph-based approach is shown to be capable of handling compositions with multiple conditional constraints, which is not possible when using the tree-based approach.
Alexandre Sawczuk da Silva, Hui Ma, Mengjie Zhang, Sven Hartmann

### Recommendation in Interactive Web Services Composition: A State-of-the-Art Survey

With the increasing adoption of Web services, designing novel approaches for recommending relevant Web services has become of paramount importance, especially to support practical applications such as Web services composition. In this paper, a survey of the state of the art in interactive Web services composition recommendation approaches is presented. Both Web services composition and recommender systems concepts are introduced and their particular challenges are also discussed. Moreover, the need for recommendation techniques to support Web services composition is also highlighted. The most relevant approaches that address this need are presented, categorized and compared.
Meriem Kasmi, Yassine Jamoussi, Henda Hajjami Ben Ghézala

### Backmatter
