
This book introduces the basic methodologies for successful data analytics. Matrix optimization and approximation are explained in detail and extensively applied to dimensionality reduction by principal component analysis and multidimensional scaling. Diffusion maps and spectral clustering are derived as powerful tools. The methodological overlap between data science and machine learning is emphasized by demonstrating how data science is used for classification as well as supervised and unsupervised learning.

### Chapter 1. Introduction

Abstract
Data Analytics is the science of exploring (big) data and designing methods and algorithms for detecting structures and information in the data. More specifically, we define Data Analytics as the discovery of models that capture the behavior of data and can be used to extract information, draw conclusions, and make decisions. We conceive the concept of a “model” in a rather wide sense. For example, all of the following are regarded as models that can be fitted to data.
Rudolf Mathar, Gholamreza Alirezaei, Emilio Balda, Arash Behboodi

### Chapter 2. Prerequisites from Matrix Analysis

Abstract
Linear algebra and matrix algebra provide the methodology for mapping high-dimensional data onto low-dimensional spaces. The combination of matrix analysis and optimization theory is of particular interest. This chapter focuses on elaborating the tools that are prerequisites for data analytics and data processing. We will not only provide a broad overview but will also introduce the relevant theorems in detail, including the derivation of proofs. We think that deep insight into the general mathematical structure of matrix functions is extremely useful for dealing with unknown future problems.
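As a small illustration of the kind of matrix tool this chapter builds on, the following NumPy sketch computes the singular value decomposition of a hypothetical data matrix (the matrix entries are invented for this example) and verifies the reconstruction:

```python
import numpy as np

# A hypothetical 4x3 data matrix; the SVD A = U diag(s) V^T underlies
# many of the matrix methods used for data processing.
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])

# Thin SVD: U is 4x3, s holds the singular values, Vt is 3x3.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# The factors reconstruct A exactly, and the singular values
# appear in non-increasing order.
A_rec = U @ np.diag(s) @ Vt
print(np.allclose(A, A_rec))
```

Truncating this factorization to the leading singular values yields the best low-rank approximation in the Frobenius norm, which is the bridge from matrix analysis to dimensionality reduction later in the book.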

### Chapter 3. Multivariate Distributions and Moments

Abstract
Probability theory provides mathematical laws for randomness and is hence an essential tool for the quantitative analysis of nondeterministic or noisy data. It allows the description of complex systems when only partial knowledge of the state is available. For example, supervised learning is performed on the basis of training data. To assess the robustness and reliability of the derived decision and classification rules, knowledge of the underlying distributions is essential.
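A minimal sketch of the first two moments in the multivariate setting: draw samples from a bivariate normal distribution (the mean vector and covariance matrix below are chosen purely for illustration) and estimate the mean and covariance from the data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters of a bivariate normal distribution.
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

# Draw a large sample.
X = rng.multivariate_normal(mu, Sigma, size=10_000)

# First moment: sample mean vector.
mean_hat = X.mean(axis=0)
# Second central moment: unbiased sample covariance matrix.
cov_hat = np.cov(X, rowvar=False)

print(np.round(mean_hat, 2))
print(np.round(cov_hat, 2))
```

With 10,000 samples, both estimates should match the true parameters to roughly two decimal places, in line with the law of large numbers.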

### Chapter 4. Dimensionality Reduction

Abstract
In many cases, data analytics has to cope with input of extremely high dimension. Structures may be well hidden not only by the sheer amount of data but also by very high-dimensional noise added to relatively low-dimensional signals. The aim of this chapter is to introduce methods that represent high-dimensional data in a low-dimensional space such that a minimum of core information is lost. Optimality will mostly refer to projections in Hilbert spaces. If dimension one, two, or three suffices to represent the raw data, computer-aided graphical visualization may help to identify clusters or outlying objects.
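The idea of projecting noisy high-dimensional data onto a low-dimensional subspace can be sketched with principal component analysis. The toy data below (a two-dimensional signal embedded in five dimensions with small added noise) is invented for this example:

```python
import numpy as np

rng = np.random.default_rng(1)

# 200 points that lie near a 2-dimensional subspace of R^5,
# perturbed by small isotropic noise.
latent = rng.normal(size=(200, 2))
embed = rng.normal(size=(2, 5))
X = latent @ embed + 0.05 * rng.normal(size=(200, 5))

# PCA: center the data, then project onto the top-2 right
# singular vectors of the centered data matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
Y = Xc @ Vt[:2].T  # low-dimensional representation

# Fraction of total variance captured by the first two components.
explained = (s[:2] ** 2).sum() / (s ** 2).sum()
print(f"variance captured: {explained:.3f}")
```

Because the noise level is small, the two leading components capture nearly all of the variance; the two-dimensional representation `Y` could then be plotted directly to look for clusters or outliers.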

### Chapter 5. Classification and Clustering

Abstract
Classifying objects according to certain features is one of the fundamental problems in machine learning. Binary classification by supervised learning will be the topic of Chap. 6. In this chapter we will start with some elementary classification rules derived from a training set. The goal is to find a classifier that predicts the class membership of future observations as accurately as possible.
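One of the simplest rules of this kind is a nearest-centroid classifier. The sketch below, on a tiny invented training set, learns one centroid per class and assigns a new observation to the class with the closest centroid:

```python
import numpy as np

# Toy labeled training set: two well-separated classes in the plane.
train_X = np.array([[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [2.8, 3.2]])
train_y = np.array([0, 0, 1, 1])

# "Training": compute one centroid per class.
classes = np.unique(train_y)
centroids = np.array([train_X[train_y == c].mean(axis=0) for c in classes])

def classify(x):
    """Assign x to the class whose centroid is nearest in Euclidean distance."""
    d = np.linalg.norm(centroids - x, axis=1)
    return classes[np.argmin(d)]

print(classify(np.array([0.1, 0.3])))  # near the class-0 centroid
print(classify(np.array([2.5, 2.9])))  # near the class-1 centroid
```

More refined rules weight the distance by the within-class spread, but the structure is the same: a classifier is fitted to training data and then applied to future observations.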

### Chapter 6. Support Vector Machines

Abstract
In 1992, Boser, Guyon, and Vapnik [7] introduced a supervised classification algorithm that, after numerous extensions, is now known as Support Vector Machines (SVMs). Support Vector Machines denote a class of algorithms for classification and regression that represent the current state of the art. The algorithm determines a small subset of points, the support vectors, in a Euclidean space such that a hyperplane determined solely by these vectors separates two large classes of points as well as possible. The purpose of this chapter is to introduce the key methodology based on convex optimization and kernel functions.
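A rough sketch of the underlying idea: fit a separating hyperplane to two labeled point clouds by minimizing a regularized hinge loss with plain subgradient descent. This is only an illustration on invented toy data; the chapter itself develops the proper convex (dual) optimization formulation, which also identifies the support vectors.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two linearly separable point clouds in the plane, labels in {-1, +1}.
X = np.vstack([rng.normal([-2.0, -2.0], 0.4, size=(50, 2)),
               rng.normal([2.0, 2.0], 0.4, size=(50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

# Minimize (lam/2)||w||^2 + mean(max(0, 1 - y*(w.x + b)))
# by subgradient descent on (w, b).
lam, lr = 0.01, 0.1
w, b = np.zeros(2), 0.0
for _ in range(500):
    margins = y * (X @ w + b)
    active = margins < 1  # points that violate the margin
    grad_w = lam * w - (y[active, None] * X[active]).sum(axis=0) / len(y)
    grad_b = -y[active].sum() / len(y)
    w -= lr * grad_w
    b -= lr * grad_b

# Classify by the sign of the affine decision function.
acc = np.mean(np.sign(X @ w + b) == y)
print(f"training accuracy: {acc:.2f}")
```

The margin-violating points driving the updates play a role analogous to the support vectors: only they influence the final position of the hyperplane.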

### Chapter 7. Machine Learning

Abstract
The rate of publications on machine learning has increased significantly over the last few years. Recent comprehensive books on this material are [32, 38, 41].