
About this book

To large organizations, business intelligence (BI) promises the capability of collecting and analyzing internal and external data to generate knowledge and value, thus providing decision support at the strategic, tactical, and operational levels. BI is now impacted by the “Big Data” phenomenon and by the evolution of society and users. In particular, BI applications must cope with additional heterogeneous (often Web-based) sources, e.g., social networks, blogs, data from competitors, suppliers, or distributors, governmental or NGO-based analyses and papers, and research publications. In addition, they must be able to deliver their results on mobile devices, taking into account location-based or time-based environmental data.

The lectures held at the Second European Business Intelligence Summer School (eBISS), which are presented here in an extended and refined format, cover not only established BI and BPM technologies, but extend into innovative aspects that are important in this new environment and for novel applications, e.g., machine learning, logic networks, graph mining, business semantics, large-scale data management and analysis, and multicriteria and collaborative decision making.

Combining papers by leading researchers in the field, this volume equips the reader with the state-of-the-art background necessary for creating the future of BI. It also provides the reader with an excellent basis and many pointers for further research in this growing field.

Table of Contents

Frontmatter

Managing Complex Multidimensional Data

Abstract
Multidimensional database concepts such as cubes, dimensions with hierarchies, and measures are a cornerstone of business intelligence. However, the standard data models and system implementations (OLAP) for multidimensional databases are sometimes not able to capture the complexities of advanced real-world application domains. This lecture will focus on how to manage such complex multidimensional data, including complex dimension hierarchies, complex measures, and integration of multidimensional data with complex external data. We will look at how complex multidimensional data emerge in complex application domains such as medical data, location-based services, music data, web data, and text data, and present solutions for these domains that support multidimensional business intelligence.
Torben Bach Pedersen

An Introduction to Business Process Modeling

Abstract
Business Process Modeling (BPM) is the activity of representing the processes of an organization so that they can be analyzed and improved. Nowadays, with increased globalization, BPM techniques are used, for example, to optimize the way in which organizations react to business events in order to enhance competitiveness. Starting from the underlying notion of workflow modeling, this paper introduces the basic concepts of modeling and implementing business processes using current information technologies and standards, such as Business Process Modeling Notation (BPMN) and Business Process Execution Language (BPEL). We also address the novel, yet growing, topic of Business Process Mining, and point out open research challenges in the area.
Alejandro Vaisman

Machine Learning Strategies for Time Series Forecasting

Abstract
The increasing availability of large amounts of historical data and the need for accurate forecasting of future behavior in several scientific and applied domains demand robust and efficient techniques able to infer from observations the stochastic dependency between past and future. The forecasting domain has been influenced, from the 1960s on, by linear statistical methods such as ARIMA models. More recently, machine learning models have drawn attention and have established themselves as serious contenders to classical statistical models in the forecasting community. This chapter presents an overview of machine learning techniques in time series forecasting by focusing on three aspects: the formalization of one-step forecasting problems as supervised learning tasks, the discussion of local learning techniques as an effective tool for dealing with temporal data, and the role of the forecasting strategy when we move from one-step to multiple-step forecasting.
Gianluca Bontempi, Souhaib Ben Taieb, Yann-Aël Le Borgne
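
The three aspects named in this abstract can be illustrated with a minimal sketch. It is not taken from the chapter: the function names, the use of a k-nearest-neighbor average as the local learner, and the choice of the recursive strategy for multi-step forecasting are all assumptions made for illustration.

```python
def make_supervised(series, n_lags):
    """Recast a series as a supervised task: each input holds the n_lags
    previous values and the target is the value that follows them."""
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(list(series[t - n_lags:t]))
        targets.append(series[t])
    return inputs, targets


def knn_predict(inputs, targets, query, k):
    """Local learning (assumed here to be a plain k-NN average): predict
    from the k training inputs closest to the query window."""
    order = sorted(
        range(len(inputs)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(inputs[i], query)),
    )
    nearest = order[:k]
    return sum(targets[i] for i in nearest) / len(nearest)


def recursive_forecast(series, n_lags, horizon, k):
    """Recursive multi-step strategy: apply the one-step model repeatedly,
    feeding each forecast back into the input window."""
    inputs, targets = make_supervised(series, n_lags)
    window = list(series[-n_lags:])
    forecasts = []
    for _ in range(horizon):
        y_hat = knn_predict(inputs, targets, window, k)
        forecasts.append(y_hat)
        window = window[1:] + [y_hat]
    return forecasts
```

The recursive strategy reuses a single one-step model, so errors can accumulate over the horizon. Other strategies, such as training one model per horizon step, trade that risk for more models to fit.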

Knowledge Discovery from Constrained Relational Data: A Tutorial on Markov Logic Networks

Abstract
This tutorial paper gives an overview of Markov logic networks (MLNs) in theory and in practice. The basic concepts of MLNs are introduced in a semi-formal way and examined for their significance in the broader context of statistical relational learning approaches in general and Bayesian logic networks in particular. A sandbox example is discussed in order to explain in detail the meaning of input theories with weighted clauses for an MLN. Then, the setup needed for real-world applications using a recent open source prototype is introduced. The processing steps of inference and learning are explained in detail, together with the best-scaling algorithms known today. An overview of existing and upcoming application areas concludes the paper.
Marcus Spies
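
To make the idea of a weighted clause concrete, here is a minimal sketch that is not the chapter's sandbox example. It assumes a toy theory with a single constant, two ground atoms (hypothetically named smokes and cancer), and one clause "smokes implies cancer". It uses the standard MLN definition, in which the probability of a possible world is proportional to the exponential of the summed weights of the ground clauses that world satisfies.

```python
import math
from itertools import product


def world_probabilities(weight):
    """Enumerate all truth assignments to (smokes, cancer) and return the
    probability an MLN with one weighted clause assigns to each world."""
    worlds = list(product([False, True], repeat=2))

    def n_satisfied(smokes, cancer):
        # Number of satisfied groundings of "smokes => cancer" (0 or 1 here).
        return 1 if (not smokes) or cancer else 0

    unnormalized = {w: math.exp(weight * n_satisfied(*w)) for w in worlds}
    z = sum(unnormalized.values())  # partition function
    return {w: value / z for w, value in unnormalized.items()}
```

With weight 0 all four worlds are equally likely. As the weight grows, the one world that violates the clause (smokes true, cancer false) becomes less probable but never impossible, which is what distinguishes a weighted clause from a hard logical constraint. Brute-force enumeration like this only works for tiny examples.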

Large Graph Mining: Recent Developments, Challenges and Potential Solutions

Abstract
With the recent growth of graph-based data, large graph processing is becoming more and more important. In order to explore and extract knowledge from such data, graph mining methods such as community detection are a necessity. Although graph mining is a relatively recent development in the data mining domain, it has been studied extensively in different areas (biology, social networks, telecommunications, and the Internet). Legacy graph processing tools rely mainly on the computational capacity of a single machine, which cannot process large graphs with billions of nodes. Therefore, the main challenge for new tools and frameworks lies in the development of new paradigms that are scalable, efficient, and flexible. In this paper, we review the new paradigms of large graph processing and their applications to the graph mining domain, using the distributed, shared-nothing approach that Internet players use for large data. The paper is organized as a walk through different industrial needs in terms of graph mining and the existing solutions that address them. Finally, we present a set of open research questions linked to several new business requirements, such as the graph data warehouse.
Sabri Skhiri, Salim Jouili

Big Data Analytics on Modern Hardware Architectures: A Technology Survey

Abstract
Big Data Analytics aims to analyze massive datasets, which increasingly occur in web-scale business intelligence problems. The common strategy for handling these workloads is either to distribute the processing across massively parallel analysis systems or to use big machines able to handle the workload. We discuss massively parallel analysis systems and their programming models. Furthermore, we discuss the application of modern hardware architectures to database processing. Today, many different hardware architectures apart from traditional CPUs can be used to process data. GPUs or FPGAs, among other new hardware, are usually employed as co-processors to accelerate query execution. What these architectures have in common is their massive inherent parallelism as well as a programming model that differs from that of classical von Neumann CPUs. Such hardware architectures offer the processing capability to distribute the workload among the CPU and other processors, and enable systems to process bigger workloads.
Michael Saecker, Volker Markl

An Introduction to Multicriteria Decision Aid: The PROMETHEE and GAIA Methods

Abstract
Most strategic decision problems involve the evaluation of potential solutions according to multiple conflicting criteria. The aim of this chapter is to introduce some basic concepts of Multicriteria Decision Aid (MCDA) with a special emphasis on the PROMETHEE and GAIA methods. First, we will introduce the specific vocabulary of this research area as well as traditional modelling issues. The main part of the presentation will be dedicated to explaining the PROMETHEE and GAIA methods in detail. Finally, an illustrative example will be analyzed with the D-Sight software. This will highlight the added value of using interactive and visual tools in complex decision processes.
Yves De Smet, Karim Lidouh
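
As a rough illustration of how a PROMETHEE II ranking is computed, the sketch below derives net outranking flows from pairwise comparisons. It is not the chapter's example and does not use D-Sight. It assumes that every criterion is to be maximized, that the weights sum to 1, and that the simplest ("usual") preference function applies; the function and parameter names are hypothetical.

```python
def usual_preference(difference):
    """'Usual' preference function: full preference as soon as one
    alternative scores strictly better on a criterion."""
    return 1.0 if difference > 0 else 0.0


def promethee_net_flows(scores, weights, preference=usual_preference):
    """Return the PROMETHEE II net flow of each alternative.

    scores maps an alternative name to its list of criterion values;
    weights lists the criterion weights in the same order.
    """
    names = list(scores)
    n = len(names)

    def pi(a, b):
        # Weighted preference of alternative a over alternative b.
        return sum(
            w * preference(x - y)
            for w, x, y in zip(weights, scores[a], scores[b])
        )

    flows = {}
    for a in names:
        others = [b for b in names if b != a]
        positive = sum(pi(a, b) for b in others) / (n - 1)  # a outranks others
        negative = sum(pi(b, a) for b in others) / (n - 1)  # others outrank a
        flows[a] = positive - negative
    return flows
```

Sorting the alternatives by decreasing net flow gives the complete PROMETHEE II ranking. In practice, other preference functions with indifference and preference thresholds are typically chosen per criterion.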

Knowledge Harvesting for Business Intelligence

Abstract
With the growing volume of information, information access and knowledge management in enterprises have become challenging. This paper describes the importance of semantic technologies (ontologies) and knowledge extraction techniques for knowledge management, search, and capture in e-business processes. We present the state of the art of approaches to ontology learning from textual data and the web environment, and their integration in enterprise systems to perform personalized and incremental knowledge harvesting.
Nesrine Ben Mustapha, Marie-Aude Aufaure

Business Semantics as an Interface between Enterprise Information Management and the Web of Data: A Case Study in the Flemish Public Administration

Abstract
Conceptual modeling captures descriptions of business entities in terms of their attributes and relations with other business entities. When those descriptions are needed for interoperability tasks between two or more autonomously developed information systems, ranging from the Web of Data with no a priori known purposes for the data to Enterprise Information Management in which organizations agree on (strict) rules to ensure proper business, those descriptions are often captured in a shared formal specification called an ontology. We present the method Business Semantics Management (BSM), a fact-oriented approach to knowledge modeling grounded in natural language. We first show how fact-oriented approaches differ from other approaches in terms of, amongst others, expressiveness, complexity, and decidability, and how this formalism makes it easier for users to render their knowledge. We then explain the different processes in BSM and how the tool suite supports those processes. Finally, we show how the ontologies can be transformed into other formalisms suitable for particular interoperability tasks. All the processes and examples will be taken from industry cases throughout the lecture.
Christophe Debruyne, Pieter De Leenheer

Backmatter