2012 | OriginalPaper | Chapter
Comparison of Batches
Authors : Wolfgang Karl Härdle, Léopold Simar
Published in: Applied Multivariate Statistical Analysis
Publisher: Springer Berlin Heidelberg
Multivariate statistical analysis is concerned with analysing and understanding data in high dimensions. We suppose that we are given a set $\{x_{i}\}^{n}_{i=1}$ of $n$ observations of a variable vector $X$ in $\mathbb{R}^{p}$. That is, we suppose that each observation $x_i$ has $p$ dimensions:
$$x_i = (x_{i1}, x_{i2}, \ldots , x_{ip}),$$
and that it is an observed value of a variable vector $X \in \mathbb{R}^{p}$. Therefore, $X$ is composed of $p$ random variables:
$$X = (X_{1}, X_{2}, \ldots , X_{p})$$
where $X_j$, for $j = 1, \ldots, p$, is a one-dimensional random variable. How do we begin to analyse this kind of data? Before we investigate questions on what inferences we can reach from the data, we should think about how to look at the data. This involves descriptive techniques. Questions that we could answer by descriptive techniques are:
Are there components of $X$ that are more spread out than others?
Are there some elements of $X$ that indicate sub-groups of the data?
Are there outliers in the components of $X$?
How “normal” is the distribution of the data?
Are there “low-dimensional” linear combinations of $X$ that show “non-normal” behaviour?
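As a rough illustration of the first and third questions, the sketch below stores the observations $x_1, \ldots, x_n$ as the rows of an $n \times p$ array and computes a spread measure and an outlier count for each component $X_j$. It is a minimal sketch under stated assumptions, not the chapter's own procedure: the data are synthetic, the function name `describe_components` is hypothetical, and outliers are flagged with the common boxplot convention of 1.5 interquartile ranges beyond the quartiles.

```python
import numpy as np

def describe_components(X):
    """Per-component spread and outlier counts for an (n, p) data matrix.

    Assumption: outliers are flagged with the usual boxplot rule
    (more than 1.5 * IQR below the lower or above the upper quartile).
    """
    X = np.asarray(X, dtype=float)
    q1, q3 = np.percentile(X, [25, 75], axis=0)  # quartiles of each column
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    outliers = (X < lower) | (X > upper)         # boolean (n, p) mask
    return {
        "std": X.std(axis=0, ddof=1),            # sample standard deviation
        "iqr": iqr,
        "n_outliers": outliers.sum(axis=0),
    }

# Synthetic example (hypothetical data): n = 200 observations in R^3,
# with the second component scaled to be more spread out than the others.
rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.normal(size=(n, p)) * np.array([1.0, 5.0, 1.0])
X[0, 2] = 25.0  # plant one obvious outlier in the third component

summary = describe_components(X)
for name, values in summary.items():
    print(name, np.round(values, 2))
```

Comparing the `std` or `iqr` entries across components indicates which components are more spread out, while `n_outliers` gives a first hint of unusual observations in each component. Sub-groups and departures from normality usually call for graphical tools such as histograms or density estimates rather than a single summary number.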