2012 | OriginalPaper | Book chapter
Comparison of Batches
Authors: Wolfgang Karl Härdle, Léopold Simar
Published in: Applied Multivariate Statistical Analysis
Publisher: Springer Berlin Heidelberg
Multivariate statistical analysis is concerned with analysing and understanding data in high dimensions. We suppose that we are given a set $\{x_{i}\}^{n}_{i=1}$ of $n$ observations of a variable vector $X$ in $\mathbb {R}^{p}$. That is, we suppose that each observation $x_{i}$ has $p$ dimensions:
$$x_i = (x_{i1}, x_{i2}, \ldots , x_{ip}),$$
and that it is an observed value of a variable vector $X \in \mathbb {R}^{p}$. Therefore, $X$ is composed of $p$ random variables:
$$X = (X_{1}, X_{2}, \ldots , X_{p})$$
where $X_{j}$, for $j = 1, \ldots , p$, is a one-dimensional random variable. How do we begin to analyse this kind of data? Before we investigate what inferences we can draw from the data, we should think about how to look at the data. This involves descriptive techniques. Questions that we could answer by descriptive techniques are:
Are there components of $X$ that are more spread out than others?
Are there some elements of $X$ that indicate sub-groups of the data?
Are there outliers in the components of $X$?
How “normal” is the distribution of the data?
Are there “low-dimensional” linear combinations of $X$ that show “non-normal” behaviour?
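The questions about spread and outliers can be explored with simple per-component summaries. The following is a minimal sketch in Python, assuming the $n$ observations are stored as a list of rows of length $p$. The toy data, the helper names `columns`, `column_summary` and `flag_outliers`, and the 1.5 × IQR rule for flagging outliers are illustrative assumptions, not taken from the text above.

```python
import statistics


def columns(data):
    """Transpose an n x p list of rows into p lists of n values."""
    return [list(col) for col in zip(*data)]


def column_summary(values):
    """Return simple spread measures for one component X_j."""
    # Quartiles with linear interpolation between order statistics.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "std": statistics.stdev(values),  # sample standard deviation
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,  # interquartile range
    }


def flag_outliers(values, k=1.5):
    """Return 0-based indices of values outside [q1 - k*IQR, q3 + k*IQR].

    The factor k = 1.5 is an assumed convention, not prescribed by the text.
    """
    s = column_summary(values)
    lower = s["q1"] - k * s["iqr"]
    upper = s["q3"] + k * s["iqr"]
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical toy data: n = 6 observations, p = 2 components.
data = [
    [1.0, 10.0],
    [2.0, 11.0],
    [3.0, 12.0],
    [4.0, 13.0],
    [5.0, 14.0],
    [6.0, 100.0],
]

for j, col in enumerate(columns(data), start=1):
    s = column_summary(col)
    print(
        f"X_{j}: std={s['std']:.2f}, IQR={s['iqr']:.2f}, "
        f"outlier rows={flag_outliers(col)}"
    )
```

Comparing the standard deviation with the interquartile range per component gives a first impression of which components are more spread out, and whether that spread is driven by a few extreme observations. In the toy data only the last row of the second component is flagged.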